
Meta — Extractor and Bins

Extractor

Combines multiple feature extractors and evaluates them in a single pass.

light_curve.Extractor


Bins

light_curve.Bins

Bases: light_curve.light_curve_ext._FeatureEvaluator

Sampled time series meta-feature

Bins a time series into intervals of width \(\mathrm{window}\) measured from \(\mathrm{offset}\). The \(j\)-th bin interval is \([j \cdot \mathrm{window} + \mathrm{offset}; (j + 1) \cdot \mathrm{window} + \mathrm{offset})\). The binned time series is defined by $$ t_j^* = (j + \frac12) \cdot \mathrm{window} + \mathrm{offset}, $$ $$ m_j^* = \frac{\sum{m_i / \delta_i^2}}{\sum{\delta_i^{-2}}}, $$ $$ \delta_j^* = \frac{N_j}{\sum{\delta_i^{-2}}}, $$ where \(N_j\) is the number of observations in the \(j\)-th bin and all sums run over the observations inside that bin. Bins takes any other feature evaluators and extracts their features from the binned time series.

  • Depends on: time, magnitude, magnitude error
  • Minimum number of observations: as required by sub-features, but at least 1
  • Number of features: as provided by sub-features
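
The binning formulas above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the helper name `bin_light_curve` is hypothetical, the bin index is taken as `floor((t - offset) / window)`, and only non-empty bins are emitted. The binned error follows the \(\delta_j^*\) formula exactly as written above.

```python
from collections import defaultdict
from math import floor


def bin_light_curve(t, m, sigma, window, offset):
    """Hypothetical helper: bin (t, m, sigma) using the formulas above.

    Returns three lists (binned time, magnitude, error) ordered by bin index.
    Assumption: bins without observations are skipped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    # Per-bin accumulators: [sum of m_i / delta_i^2, sum of delta_i^-2, N_j]
    acc = defaultdict(lambda: [0.0, 0.0, 0])
    for t_i, m_i, s_i in zip(t, m, sigma):
        j = floor((t_i - offset) / window)  # index of the bin containing t_i
        w = 1.0 / (s_i * s_i)               # inverse-variance weight
        acc[j][0] += m_i * w
        acc[j][1] += w
        acc[j][2] += 1
    t_b, m_b, s_b = [], [], []
    for j in sorted(acc):
        sum_mw, sum_w, n = acc[j]
        t_b.append((j + 0.5) * window + offset)  # bin centre t_j^*
        m_b.append(sum_mw / sum_w)               # weighted mean m_j^*
        s_b.append(n / sum_w)                    # delta_j^* as defined above
    return t_b, m_b, s_b
```

With `window=1.0` and `offset=0.0`, observations at times 0.1 and 0.4 fall into bin 0, whose centre is 0.5. Any sub-feature passed to Bins would then be evaluated on the three returned sequences rather than on the original observations.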

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| features | iterable | Features to extract from binned time-series | required |
| window | positive float | Width of binning interval in units of time | required |
| offset | float | Zero time moment | required |
| transform | None | Not supported, apply transformations to individual features | required |

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| names | list of str | Feature names |
| descriptions | list of str | Feature descriptions |

Methods:

Name Description
__call__

Extract features and return them as a numpy array

Parameters

t : numpy.ndarray of np.float32 or np.float64 dtype
    Time moments
m : numpy.ndarray
    Signal in magnitude or fluxes. Refer to the feature description to decide which would work better in your case
sigma : numpy.ndarray, optional
    Observation error, if None it is assumed to be unity
fill_value : float or None, optional
    Value to fill invalid feature values, for example if the number of observations is not enough to find a proper value. None causes an exception for invalid features
sorted : bool or None, optional
    Specifies whether the input arrays are sorted by time moments. True means certainly sorted, False means unsorted. If None is specified, then sorting is checked and an exception is raised for unsorted t
check : bool, optional
    Check all input arrays for NaNs, t and m for infinite values
cast : bool, optional
    Allows non-numpy input and casting of arrays to a common dtype. If False, inputs must be np.ndarray instances with matched dtypes. Casting provides more flexibility with input types at the cost of performance.

Returns

ndarray of np.float32 or np.float64
    Extracted feature array

many

Parallel light curve feature extraction

It is a parallel-executed equivalent of

    import numpy as np

    def many(self, lcs, *, fill_value=None, sorted=None, check=True):
        # Evaluate the features for each light curve and stack the results row-wise
        return np.stack(
            [
                self(
                    *lc,  # unpack the (t, m, sigma) tuple
                    fill_value=fill_value,
                    sorted=sorted,
                    check=check,
                    cast=False,
                )
                for lc in lcs
            ]
        )

Parameters

lcs : list of (t, m, sigma) or Arrow array
    Either a list of light curves packed into three-tuples (all numpy.ndarray of the same dtype), or an Arrow array/chunked array of type List<Struct<...>> where the selected fields share the same float dtype (float32 or float64). Arrow input is auto-detected via the __arrow_c_array__ / __arrow_c_stream__ protocol and enables zero-copy data access from pyarrow, polars, and other Arrow-compatible libraries.
arrow_fields : list of (str or int)
    Required when lcs is an Arrow array. Field names or indices specifying which struct fields to use as t, m, and optionally sigma. Must contain 2 elements [t, m] or 3 elements [t, m, sigma]. Each element may be a field name (str) or a zero-based positional index (int); all elements must be of the same type. Ignored for non-Arrow input.
fill_value : float or None, optional
    Fill invalid values with this value, or raise an exception if None
sorted : bool or None, optional
    Specifies whether the input arrays are sorted by time moments, see the __call__ documentation for details
check : bool, optional
    Check all input arrays for NaNs, t and m for infinite values
n_jobs : int
    Number of tasks to run in parallel. Default is -1, which means run as many jobs as there are CPUs. See the rayon Rust crate documentation for details
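
The constraints on arrow_fields described above can be summarized as a small check. This is a hedged sketch only: `validate_arrow_fields` is a hypothetical helper written for illustration, not part of light_curve, and the library's own validation and error messages may differ. Rejecting bool values and negative indices is an assumption made here, since the text only says indices are zero-based.

```python
def validate_arrow_fields(arrow_fields):
    """Hypothetical checker mirroring the documented arrow_fields rules.

    Returns ("name", fields) or ("index", fields) for valid input.
    """
    fields = list(arrow_fields)
    # Either [t, m] or [t, m, sigma]
    if len(fields) not in (2, 3):
        raise ValueError("arrow_fields must be [t, m] or [t, m, sigma]")
    if all(isinstance(f, str) for f in fields):
        return "name", fields
    # bool is a subclass of int in Python, so it is excluded explicitly (assumption)
    if all(isinstance(f, int) and not isinstance(f, bool) for f in fields):
        if any(f < 0 for f in fields):
            raise ValueError("positional indices are zero-based and must be non-negative")
        return "index", fields
    # Mixed str/int input, or any other element type
    raise TypeError("all elements must be str field names or all must be int indices")
```

For example, `["t", "m", "sigma"]` and `[0, 1]` pass, while `["t", 1]` is rejected because it mixes names and indices.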


JSONDeserializedFeature

light_curve.JSONDeserializedFeature

Bases: light_curve.light_curve_ext._FeatureEvaluator

Feature evaluator deserialized from a JSON string