Embeddings API
light_curve.embed.EmbeddingSession
Bases: abc.ABC
Abstract base for ONNX-backed embedding models.
Subclasses implement preprocess_lc (convert raw arrays to model
tensors) and predict_tensors (run the session and return embeddings).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session | InferenceSession | An | required |
| reduction | str, list of str, or Reduction | Strategy for mapping variable-length light curves to fixed-length sequences. | required |
| reduction_kwargs | dict | Extra keyword arguments forwarded to :func: | None |
from_hf
classmethod
Load a model from the HuggingFace Hub.
Downloads (and caches) the ONNX model file, creates an
onnxruntime.InferenceSession, and returns a ready-to-use instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | str | Path to the model file inside | required |
| ort_session_kwargs | dict[str, object] \| None | Options to pass to the | None |
| **kwargs | | Forwarded verbatim to the class constructor | required |

Returns:
| Type | Description |
|---|---|
| instance of the calling class | Instance with a live ONNX inference session. |

Raises:
| Type | Description |
|---|---|
| ImportError | If |
| ImportError | If no |
predict_tensors
Run the ONNX session on pre-processed tensors and return embeddings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tensors | InputTensors | Pre-processed model inputs as returned by :meth: | required |

Returns:
| Type | Description |
|---|---|
| ndarray | Embedding array with shape depending on the model and time reduction. |
preprocess_lc
Convert raw light curve arrays to model input tensors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | array-like | Raw light curve arrays (e.g. time, magnitude). | required |

Returns:
| Type | Description |
|---|---|
| InputTensors | Tensors ready to be passed to :meth: |
light_curve.embed.SingleBandModel
Bases: light_curve.embed.model.EmbeddingSession, abc.ABC
Embedding model that processes one photometric band at a time.
When bands is None the full light curve is treated as a single band.
When bands is provided, the light curve is split by band label, each band
is embedded independently, and the results are concatenated along
Dim.BAND.
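The per-band split described above can be sketched in a few lines. This is only an illustration under stated assumptions: `split_by_band` is a hypothetical helper, not part of light_curve.embed, plain Python lists stand in for NumPy arrays, and how the library treats labels missing from bands is not specified here.

```python
from collections import defaultdict


def split_by_band(time, mag, band, bands):
    """Group observations by band label, in the order given by `bands`.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(lambda: ([], []))
    for t, m, b in zip(time, mag, band):
        groups[b][0].append(t)
        groups[b][1].append(m)
    # One (time, mag) pair per requested band; a band with no data gives empty lists.
    return [groups[b] for b in bands]


time = [0.0, 1.0, 2.0, 3.0]
mag = [15.1, 16.0, 15.2, 16.1]
band = ["g", "r", "g", "r"]
per_band = split_by_band(time, mag, band, bands=["g", "r"])
```

Each element of `per_band` would then be embedded on its own, and the per-band embeddings concatenated along the band axis.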
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session | InferenceSession | ONNX inference session. | required |
| bands | sequence of str or int | Ordered band labels to embed. | None |
| reduction | str, list of str, or Reduction | Windowing / subsampling strategy. Defaults to | 'non-overlapping-windows' |
| reduction_kwargs | dict | Extra kwargs forwarded to :func: | None |
light_curve.embed.Astromer1
Bases: light_curve.embed.astromer._AstromerModel
Astromer 1 embedding model.
Transformer encoder pretrained on MACHO R-band light curves via masked magnitude prediction. Accepts single-band photometry and returns a 256-dimensional embedding (2 layers, 4 attention heads).
The ONNX model is hosted on HuggingFace at
https://huggingface.co/light-curve/astromer1 (astromer1.onnx).
Three named outputs are available; select with the output parameter:
- "mean" (default): masked mean pooling, shape (batch, 256)
- "max": masked max pooling, shape (batch, 256)
- "sequence": per-timestep features, shape (batch, 200, 256)

Use from_hf to download and load the model directly.
Model license
MIT.
References
Donoso-Oliva et al. (2023), ASTROMER: A transformer-based embedding for the representation of light curves, A&A 670, A54. https://ui.adsabs.harvard.edu/abs/2023A%26A...670A..54D/abstract
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session | | ONNX inference session for the Astromer 1 model file. | required |
| output | str | Which named output to return: | 'mean' |
| bands | sequence of str or int | Band labels. | None |
| reduction | str, list of str, or Reduction | Windowing strategy. Defaults to :class: | 'non-overlapping-windows' |
hf_filename = 'astromer1.onnx'
class-attribute
hf_repo = 'light-curve/astromer1'
class-attribute
light_curve.embed.Astromer2
Bases: light_curve.embed.astromer._AstromerModel
Astromer 2 embedding model.
Pretrained on 1.5 million MACHO light curves. Accepts single-band photometry and returns a 256-dimensional embedding.
The ONNX model is hosted on HuggingFace at
https://huggingface.co/light-curve/astromer2 (astromer2.onnx).
Three named outputs are available; select with the output parameter:
- "mean" (default): masked mean pooling, shape (batch, 256)
- "max": masked max pooling, shape (batch, 256)
- "sequence": per-timestep features, shape (batch, 200, 256)

Use from_hf to download and load the model directly.
Model license
MIT.
References
Donoso-Oliva et al. (2026), Generalizing across astronomical surveys: Few-shot light curve classification with Astromer 2, A&A 707, A170. https://ui.adsabs.harvard.edu/abs/2026A%26A...707A.170D/abstract
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session | | ONNX inference session for the Astromer 2 model file. | required |
| output | str | Which named output to return: | 'mean' |
| bands | sequence of str or int | Band labels. | None |
| reduction | str, list of str, or Reduction | Windowing strategy. Defaults to :class: | 'non-overlapping-windows' |
hf_filename = 'astromer2.onnx'
class-attribute
hf_repo = 'light-curve/astromer2'
class-attribute
light_curve.embed.ATCAT
Bases: light_curve.embed.model.ExplicitMultiBandModel
ATCAT multiband transformer embedding model.
ATCAT (Astronomical Timeseries CAusal Transformer) is a transformer-based model trained on LSST-like multiband light curves. It accepts flux, flux-error, time, and integer channel-index arrays and produces dense embeddings.
The model expects fluxes calibrated to AB zero-point 27.5 (ELAsTiCC / SNANA
FITS convention). Use mag_zp to convert from a different zero-point at
call time — common values are 31.4 (LSST nJy) and 8.9 (Jy).
Valid model band indices are 0–5, corresponding to LSST u g r i z Y.
Pass a band_groups dict (e.g. {"u": 0, "g": 1, ...}) to use string
band labels instead of integers.
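As a rough illustration of what the zero-point conversion and the band mapping involve, the sketch below applies the standard AB relation m = -2.5 log10(flux) + ZP. It assumes the library rescales in this way; `rescale_flux` is a hypothetical helper, not the library's API, and plain lists stand in for NumPy arrays.

```python
TARGET_ZP = 27.5  # zero-point the model expects, per the text above


def rescale_flux(flux, mag_zp, target_zp=TARGET_ZP):
    """Rescale fluxes from zero-point `mag_zp` to `target_zp`.

    The magnitude m = -2.5 * log10(flux) + zp must be identical in both
    systems, so flux_target = flux * 10 ** (0.4 * (target_zp - mag_zp)).
    """
    scale = 10.0 ** (0.4 * (target_zp - mag_zp))
    return [f * scale for f in flux]


# String band labels mapped to the model's integer band indices 0-5.
band_groups = {"u": 0, "g": 1, "r": 2, "i": 3, "z": 4, "Y": 5}

flux_njy = [1000.0, 2500.0]                        # fluxes in nJy (ZP = 31.4)
flux_model = rescale_flux(flux_njy, mag_zp=31.4)   # same fluxes at ZP = 27.5
band_idx = [band_groups[b] for b in ["g", "r"]]
```

With mag_zp=27.5 the scale factor is exactly 1, which matches the "no rescaling needed" case.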
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session | InferenceSession | An | required |
| output | (last, mean, sequence) | Which model head to return: | "last" |
| band_groups | Mapping, list of Mapping, or None | Band label → model integer mapping(s). See :class: | None |
| allow_extra_bands | bool | If | False |
| reduction | str, list of str, or Reduction | Windowing / subsampling strategy. Defaults to | 'non-overlapping-windows' |
| reduction_kwargs | dict | Extra keyword arguments forwarded to :func: | None |
| mag_zp | float | AB zero-point of the input fluxes. Fluxes are rescaled to ZP = 27.5 (ELAsTiCC / SNANA FITS convention) before inference. Common values: 31.4 (LSST nJy, default), 27.5 (no rescaling needed), 8.9 (Jy). | 31.4 |
Examples:
>>> import numpy as np
>>> import light_curve.embed as lce
>>> model = lce.ATCAT.from_hf(
... output="last",
... band_groups={"u": 0, "g": 1, "r": 2, "i": 3, "z": 4, "Y": 5},
... )
>>> time = np.linspace(0, 200, 100, dtype=np.float32)
>>> flux = np.ones(100, dtype=np.float32)
>>> flux_err = np.full(100, 0.1, dtype=np.float32)
>>> band = np.array(["g", "r"] * 50)
>>> embedding = model(time, flux, flux_err, band)
>>> embedding.shape
(1, 1, 1, 384)
Model license
Modified MIT with a non-military-use restriction (upstream ATCAT license).
References
Tung (2025), ATCAT: Astronomical Timeseries CAusal Transformer, arXiv:2511.00614. https://ui.adsabs.harvard.edu/abs/2025arXiv251100614T/abstract
hf_filename = 'atcat.onnx'
class-attribute
hf_repo = 'light-curve/atcat'
class-attribute
model_outputs = frozenset({'mean', 'last', 'sequence'})
class-attribute
seq_size = 243
class-attribute
valid_model_bands = frozenset({0, 1, 2, 3, 4, 5})
class-attribute
from_hf
classmethod
Load a model from the HuggingFace Hub.
Downloads (and caches) the ONNX model file, creates an
onnxruntime.InferenceSession, and returns a ready-to-use instance.
Only the requested output is computed at inference time — onnxruntime
prunes the unused computation graph automatically.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output | str | Named ONNX output to return. One of: | 'last' |
| use_fp16 | bool | Whether to load the model in float16 precision if supported. Defaults to | False |
| band_groups | sequence of str or int or None | Ordered band labels to embed. | None |
| allow_extra_bands | bool | If | False |
| reduction | str, list of str, or Reduction | Windowing / subsampling strategy. Defaults to | 'non-overlapping-windows' |
| reduction_kwargs | dict or None | Extra keyword arguments forwarded to :func: | None |
| mag_zp | float | AB zero-point of the input fluxes. Fluxes are rescaled to ZP = 27.5 (ELAsTiCC / SNANA FITS convention) before inference. Common values: 31.4 (LSST nJy, default), 27.5 (no rescaling needed), 8.9 (Jy). | 31.4 |
| ort_session_kwargs | dict or None | Additional keyword arguments forwarded to | None |

Returns:
| Type | Description |
|---|---|
| instance of the calling class | Instance with a live ONNX inference session. |

Raises:
| Type | Description |
|---|---|
| ValueError | If |
| ImportError | If |
| ImportError | If no |
predict_tensors
Run the ONNX model on pre-processed tensors and return reduced embeddings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tensors | ATCATInputs | As returned by :meth: | required |

Returns:
| Type | Description |
|---|---|
| np.ndarray, shape ``(n_subsamples, seq_size, embed_dim)`` | Embeddings after applying the time reduction's aggregation. For aggregated models (mean / last) |
preprocess_lc
Preprocess a light curve into ATCAT model input tensors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| time | ArrayLike | Observation times in days (e.g. MJD). | required |
| flux | ArrayLike | AB-calibrated bandflux, zero-point is given by | required |
| flux_err | ArrayLike | Uncertainties on the fluxes, in the same units as | required |
| band | ArrayLike | Passband labels, 0, 1, 2, 3, 4, 5 (LSST ugrizy). | required |

Returns:
| Type | Description |
|---|---|
| ATCATInputs | |
Reduction strategies
light_curve.embed.Beginning
Bases: light_curve.embed.reduction.SingleSubsampleReduction
Select the chronologically first seq_size observations of the light curve.
single_subsample_lc
Return the leading seq_size elements of each array.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | ndarray | 1-D arrays of equal length. | required |
| seq_size | int | Number of observations to keep from the start. | required |

Returns:
| Type | Description |
|---|---|
| tuple of np.ndarray | First |
light_curve.embed.End
Bases: light_curve.embed.reduction.SingleSubsampleReduction
Select the chronologically last seq_size observations of the light curve.
single_subsample_lc
Return the trailing seq_size elements of each array.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | ndarray | 1-D arrays of equal length. | required |
| seq_size | int | Number of observations to keep from the end. | required |

Returns:
| Type | Description |
|---|---|
| tuple of np.ndarray | Last |
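
The Beginning and End strategies both amount to slicing. A minimal sketch, assuming the arrays are already sorted by time and using plain Python lists in place of NumPy arrays; the names `first_n` and `last_n` are hypothetical and are not the library's API:

```python
def first_n(*arrays, seq_size):
    """Keep the first `seq_size` elements of every array."""
    return tuple(a[:seq_size] for a in arrays)


def last_n(*arrays, seq_size):
    """Keep the last `seq_size` elements of every array."""
    # Use an explicit start index: a[-seq_size:] would return the whole
    # array when seq_size == 0, because -0 == 0.
    return tuple(a[max(len(a) - seq_size, 0):] for a in arrays)


time = [0, 1, 2, 3, 4]
mag = [15.0, 15.1, 15.2, 15.3, 15.4]
head_time, head_mag = first_n(time, mag, seq_size=3)
tail_time, tail_mag = last_n(time, mag, seq_size=3)
```

A light curve shorter than seq_size passes through unchanged in both cases.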
light_curve.embed.RandomSubsample
Bases: light_curve.embed.reduction.SingleSubsampleReduction
Draw seq_size observations uniformly at random without replacement.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rng | int, np.random.Generator, or None | Seed or generator for reproducible sampling. | required |
single_subsample_lc
Return a random subsample of at most seq_size observations, in original order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | ndarray | 1-D arrays of equal length. | required |
| seq_size | int | Maximum number of observations to sample. | required |

Returns:
| Type | Description |
|---|---|
| tuple of np.ndarray | |
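
Sampling without replacement while preserving the original order can be done by drawing indices and sorting them. The sketch below is an illustration only: `random_subsample` is a hypothetical function, and the standard library's `random.Random` stands in for `np.random.Generator`.

```python
import random


def random_subsample(*arrays, seq_size, rng=None):
    """Pick at most `seq_size` positions at random, keeping original order."""
    # Accept a seed (or None) or a ready-made generator, as the rng parameter does.
    rng = rng if isinstance(rng, random.Random) else random.Random(rng)
    n = len(arrays[0])
    k = min(seq_size, n)
    # Sample indices without replacement, then sort to restore time order.
    idx = sorted(rng.sample(range(n), k))
    # The same indices are applied to every array so they stay aligned.
    return tuple([a[i] for i in idx] for a in arrays)


time = list(range(10))
mag = [10.0 * t for t in time]
sub_time, sub_mag = random_subsample(time, mag, seq_size=4, rng=0)
```

Passing the same integer seed reproduces the same subsample.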
light_curve.embed.NonOverlappingWindows
Bases: light_curve.embed.reduction.Reduction
Split the light curve into consecutive non-overlapping windows of seq_size observations.
A light curve of length L yields ceil(L / seq_size) windows; the last window
may be shorter than seq_size and is zero-padded. Per-window embeddings are
averaged to produce a single embedding per light curve.
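The windowing step can be sketched as follows. This is a minimal illustration, not the library's implementation: `non_overlapping_windows` and the validity mask it returns are assumptions introduced for the example, and plain lists stand in for NumPy arrays.

```python
import math


def non_overlapping_windows(values, seq_size, pad_value=0.0):
    """Split `values` into ceil(L / seq_size) windows, zero-padding the last one."""
    n_windows = math.ceil(len(values) / seq_size)
    windows, masks = [], []
    for w in range(n_windows):
        chunk = list(values[w * seq_size:(w + 1) * seq_size])
        n_pad = seq_size - len(chunk)
        windows.append(chunk + [pad_value] * n_pad)
        # 1 marks a real observation, 0 marks padding (hypothetical mask).
        masks.append([1] * len(chunk) + [0] * n_pad)
    return windows, masks


windows, masks = non_overlapping_windows([1.0, 2.0, 3.0, 4.0, 5.0], seq_size=2)
```

Here a light curve of length 5 with seq_size=2 gives ceil(5 / 2) = 3 windows, and only the last one contains padding.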
reduce_embeddings
Reduce per-window embeddings to a single representation.
For aggregated outputs (output != "sequence") the window embeddings
are averaged, yielding shape (1, 1, embed_dim).
For output == "sequence" a masked mean is computed across windows
for each timestep position, yielding shape (1, seq_size, embed_dim)
regardless of how many windows the light curve was split into.
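The two aggregation modes can be sketched with nested lists. This is an illustration under assumptions: `reduce_window_embeddings` is a hypothetical function, aggregated embeddings are assumed to arrive with shape (n_windows, 1, embed_dim), and `masks` is an assumed per-window validity mask of shape (n_windows, seq_size) where 0 marks padding.

```python
def reduce_window_embeddings(embeddings, masks, output):
    """Collapse per-window embeddings into one representation."""
    n_windows = len(embeddings)
    embed_dim = len(embeddings[0][0])
    if output != "sequence":
        # Aggregated output: plain mean over windows, shape (1, 1, embed_dim).
        mean = [sum(w[0][d] for w in embeddings) / n_windows for d in range(embed_dim)]
        return [[mean]]
    seq_size = len(embeddings[0])
    reduced = []
    for p in range(seq_size):
        # Only windows holding a real observation at position p contribute.
        valid = [w for w in range(n_windows) if masks[w][p]]
        # max(..., 1) avoids division by zero if position p is padding everywhere.
        reduced.append([
            sum(embeddings[w][p][d] for w in valid) / max(len(valid), 1)
            for d in range(embed_dim)
        ])
    return [reduced]  # shape (1, seq_size, embed_dim)


# Two windows, seq_size=2, embed_dim=1; the second window's last step is padding.
seq_emb = [[[1.0], [2.0]], [[3.0], [99.0]]]
seq_masks = [[1, 1], [1, 0]]
seq_out = reduce_window_embeddings(seq_emb, seq_masks, output="sequence")

# Two windows of aggregated embeddings with embed_dim=2.
agg_emb = [[[1.0, 2.0]], [[3.0, 4.0]]]
agg_out = reduce_window_embeddings(agg_emb, None, output="mean")
```

The padded value 99.0 is ignored in the sequence case, so the second timestep keeps the value from the only window that observed it.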
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | np.ndarray, shape ``(n_windows, seq_size, embed_dim)`` | Per-window embeddings. | required |
| tensors | InputTensors | Preprocessed input tensors; | required |
| output | str | Model output name. Determines aggregation strategy. | required |

Returns:
| Type | Description |
|---|---|
| np.ndarray, shape ``(1, 1, embed_dim)`` or ``(1, seq_size, embed_dim)`` | For mean / max: mean over windows, shape |
subsample_lc
Yield consecutive slices of length seq_size.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | ndarray | 1-D arrays of equal length. | required |
| seq_size | int | Window size. | required |

Returns:
| Type | Description |
|---|---|
| list of tuple of np.ndarray | |
light_curve.embed.MultipleReductions
Bases: light_curve.embed.reduction.Reduction
Apply several SingleSubsampleReduction strategies in parallel.
Each strategy produces one window; embeddings are stacked along the subsample axis rather than aggregated, giving one embedding per strategy.
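The "one window per strategy" idea can be sketched by calling each strategy on the same arrays and collecting the results in order. Everything here is hypothetical and for illustration only: `apply_reductions`, `first_n`, and `last_n` are stand-ins, not the library's classes, and plain lists replace NumPy arrays.

```python
def first_n(*arrays, seq_size):
    """Stand-in for a 'beginning' strategy."""
    return tuple(a[:seq_size] for a in arrays)


def last_n(*arrays, seq_size):
    """Stand-in for an 'end' strategy."""
    return tuple(a[max(len(a) - seq_size, 0):] for a in arrays)


def apply_reductions(strategies, *arrays, seq_size):
    """Return one subsampled window per strategy, in the order given."""
    return [strategy(*arrays, seq_size=seq_size) for strategy in strategies]


time = [0, 1, 2, 3, 4]
mag = [15.0, 15.1, 15.2, 15.3, 15.4]
windows = apply_reductions([first_n, last_n], time, mag, seq_size=2)
```

Each window is then embedded separately, so the model output carries one embedding per strategy along the subsample axis.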
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reductions | list of SingleSubsampleReduction | Ordered list of strategies to apply. | required |

Raises:
| Type | Description |
|---|---|
| ValueError | If any element of |
from_strings
classmethod
Construct from a list of strategy name strings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reductions | list of str | Strategy names recognised by :func: | required |
| **kwargs | | Forwarded to each strategy constructor. If | required |

Returns:
| Type | Description |
|---|---|
| MultipleReductions | Instance wrapping the instantiated strategies. |
reduce_embeddings
Return embeddings unchanged — one per strategy, already stacked.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | array-like | Per-strategy embeddings from the model. | required |
| tensors | InputTensors | Unused; accepted for interface compatibility. | required |
| output | str | Unused; accepted for interface compatibility. | required |

Returns:
| Type | Description |
|---|---|
| array-like | The input unchanged. |
subsample_lc
Apply each strategy and return one window per strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | ndarray | 1-D arrays of equal length. | required |
| seq_size | int | Maximum observations per window. | required |

Returns:
| Type | Description |
|---|---|
| list of tuple of np.ndarray | One element per strategy, each a tuple of subsampled arrays. |
light_curve.embed.SingleSubsampleReduction
Bases: light_curve.embed.reduction.Reduction, abc.ABC
Base for strategies that produce exactly one window per light curve.
reduce_embeddings
Return embeddings unchanged (single window — no aggregation needed).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| embeddings | array-like | Per-window embeddings from the model. | required |
| tensors | InputTensors | Unused; accepted for interface compatibility. | required |
| output | str | Unused; accepted for interface compatibility. | required |

Returns:
| Type | Description |
|---|---|
| array-like | The input unchanged. |
single_subsample_lc
Return one subsampled window of at most seq_size observations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | ndarray | 1-D arrays of equal length. | required |
| seq_size | int | Maximum number of observations to keep. | required |

Returns:
| Type | Description |
|---|---|
| tuple of np.ndarray | Subsampled arrays, each of length |
subsample_lc
Return a single-element list wrapping single_subsample_lc.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *arrays | ndarray | 1-D arrays of equal length. | required |
| seq_size | int | Maximum observations per window. | required |

Returns:
| Type | Description |
|---|---|
| list of tuple of np.ndarray | A one-element list containing the subsampled arrays. |