LightCurve¶
pyvartools.LightCurve is the primary data container for a single photometric time series. Internally it stores data in a pandas DataFrame. The three standard columns — t (time), mag (magnitude), and err (magnitude uncertainty) — are treated specially when present, but all three are optional; any combination of columns is accepted, and pyvartools automatically informs VARTOOLS of the column layout when the light curve is processed. An optional name attribute provides a string label that VARTOOLS uses as the light curve identifier in its output table.
Construction methods¶
LightCurve.from_file(path, format=None, t_col=<unset>, mag_col=<unset>, err_col=<unset>, hdu=1, name='')¶
Load a light curve from disk. The file format is auto-detected from the file extension:
- Files whose names end in .fits, .fit, or .fts (case-insensitive) are read as FITS binary tables using astropy.io.fits. Every column in the table HDU is loaded into the resulting LightCurve under its original FITS column name. The user must explicitly tell from_file which (if any) columns correspond to t, mag, and err via t_col=, mag_col=, err_col=:
  - Pass a string FITS column name to map that column to t/mag/err.
  - Pass None to indicate the LC has no such column (vartools defaults t=NR, mag=0, err=1 at run time).
  - Leaving any of the three unset raises ValueError listing the available FITS columns so you can decide.
- ASCII files (anything not matching the FITS extensions) are whitespace-delimited. If the file has three or more columns, the first three are named t, mag, err and any further columns are named col4, col5, …. If the file has fewer than three columns, the columns are named col1, col2, … (since the semantic meaning cannot be inferred). Compressed ASCII files (.gz, .Z, .bz2) are detected by extension and read transparently. t_col, mag_col, err_col are FITS-specific and have no effect for ASCII inputs.
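The detection and naming rules above can be pictured with a short standalone sketch. This is not pyvartools code: the helper names guess_format and ascii_column_names are hypothetical, and the sketch assumes that only the final file extension is inspected, as the description suggests.

```python
from pathlib import Path

# Extensions the text lists as FITS (compared case-insensitively).
FITS_SUFFIXES = {".fits", ".fit", ".fts"}


def guess_format(path):
    """Return 'fits' or 'ascii' based only on the final file extension."""
    return "fits" if Path(path).suffix.lower() in FITS_SUFFIXES else "ascii"


def ascii_column_names(ncols):
    """Name ASCII columns following the rule described above."""
    if ncols >= 3:
        # First three columns get the standard names, the rest are numbered.
        return ["t", "mag", "err"] + [f"col{i}" for i in range(4, ncols + 1)]
    # Fewer than three columns: no semantic names can be inferred.
    return [f"col{i}" for i in range(1, ncols + 1)]


print(guess_format("EXAMPLES/2.FITS"))  # fits
print(guess_format("EXAMPLES/1.gz"))    # ascii
print(ascii_column_names(5))            # ['t', 'mag', 'err', 'col4', 'col5']
print(ascii_column_names(2))            # ['col1', 'col2']
```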
All columns present in the resulting DataFrame are accessible by name
to any command when a Pipeline run method is called.
name defaults to the stem of path (filename without directory or
extension) if not supplied.
Column layout for disk-based pipeline runs
For ASCII files with three or more columns, from_file always reads columns 1–3 as t, mag, err.
If your ASCII file has a different layout (e.g. time in column 3), load the
file manually or use from_arrays, then pass the correct mapping to
Pipeline.run_file() via its columns parameter rather than here.
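As an illustration of loading a non-standard ASCII layout manually, the sketch below picks three columns by position from whitespace-delimited rows. The helper read_columns and its parameters are hypothetical and not part of pyvartools. The resulting lists could then be handed to from_arrays, which accepts anything that converts to a 1-D NumPy array.

```python
def read_columns(lines, t_pos, mag_pos, err_pos):
    """Pick three 1-based column positions out of whitespace-delimited rows."""
    t, mag, err = [], [], []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        t.append(float(fields[t_pos - 1]))
        mag.append(float(fields[mag_pos - 1]))
        err.append(float(fields[err_pos - 1]))
    return t, mag, err


# Hypothetical layout: mag, err, time (time sits in column 3).
rows = ["# mag err time", "10.01 0.02 2459000.5", "10.03 0.02 2459001.5"]
t, mag, err = read_columns(rows, t_pos=3, mag_pos=1, err_pos=2)
print(t)    # [2459000.5, 2459001.5]
print(mag)  # [10.01, 10.03]
# Next step (not run here): lc = vt.LightCurve.from_arrays(t, mag, err)
```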
import pyvartools as vt
# ASCII: the first three columns are read as t, mag, err
lc = vt.LightCurve.from_file("EXAMPLES/2")
# Gzipped ASCII — handled transparently.
lc_gz = vt.LightCurve.from_file("EXAMPLES/1.gz")
# FITS: explicit column mapping is required. All FITS columns also land
# in the LC's DataFrame under their original names.
lc = vt.LightCurve.from_file(
    "EXAMPLES/2.fits",
    t_col="time", mag_col="mag", err_col="err",
)
# lc._df.columns == ['t', 'mag', 'err', 'time', 'mag', 'err', ...]
# FITS where no t-equivalent column exists — pass None, and vartools
# defaults t=NR at run time.
# lc = vt.LightCurve.from_file(
#     "no_time.fits", t_col=None, mag_col="flux", err_col="flux_err",
# )
LightCurve.from_arrays(t, mag, err, aux=None, name='')¶
Construct a LightCurve directly from NumPy arrays (or anything that converts
to a 1-D NumPy array). All arrays that are supplied must have the same length.
t, mag, and err are all optional — pass None (or omit them) when VARTOOLS will generate or ignore those vectors internally. aux is an optional dict for any additional named columns. All columns present in the resulting DataFrame are accessible by name to any command when a Pipeline run method is called.
import numpy as np
t = np.linspace(0, 30, 300)
mag = 10.0 + 0.1 * np.sin(2 * np.pi * t / 2.3)
err = np.full(300, 0.01)
airmass = 1.0 + 0.5 * np.abs(np.sin(np.pi * t / 15.0))
# Standard three columns plus an extra
lc = vt.LightCurve.from_arrays(t, mag, err, aux={"airmass": airmass}, name="my_star")
# Only t and mag — no uncertainty column
lc2 = vt.LightCurve.from_arrays(t=t, mag=mag)
# Only auxiliary data — no standard columns at all
phase_arr = (t % 2.3) / 2.3 # phase-fold at 2.3 d period
flux_arr = 1.0 - 0.01 * np.sin(2 * np.pi * phase_arr)
lc3 = vt.LightCurve.from_arrays(aux={"phase": phase_arr, "flux": flux_arr})
LightCurve.from_files(paths, name='', lcnum_col='lcnum', sort=True, **read_kwargs)¶
Read several light-curve files and combine them into a single LightCurve. Each file is loaded via from_file, the resulting frames are concatenated, and an integer lcnum_col is filled in (0 for the first file, 1 for the second, …). The combined frame is time-sorted by default.
from_files is appropriate when a single combined LC needs to be fed to Pipeline.run(lc) or to any non-batch entry point. (For combining files inside the pipeline run itself, see Pipeline.run_combinelc() / run_combinelcs().)
lc = vt.LightCurve.from_files(["EXAMPLES/2", "EXAMPLES/3"])
print(list(lc._df.columns)) # ['t', 'mag', 'err', 'lcnum']
print(lc._df["lcnum"].max()) # 1
# Preserve file order instead of time-sorting:
lc2 = vt.LightCurve.from_files(["EXAMPLES/2", "EXAMPLES/3"], sort=False)
# Rename the source-file column:
lc3 = vt.LightCurve.from_files(["EXAMPLES/2", "EXAMPLES/3"], lcnum_col="segment")
For FITS inputs, forward the required t_col= / mag_col= / err_col=
keyword arguments to from_file (FITS has no defaults — see the
from_file description
for the rationale):
# lc = vt.LightCurve.from_files(
#     ["sectorA.fits", "sectorB.fits"],
#     t_col="BJD_TDB", mag_col="MAG", err_col="ERR_MAG",
# )
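To make the combining step of from_files concrete, here is a minimal sketch of concatenating, tagging, and time-sorting using plain Python rows in place of DataFrames. The function combine is hypothetical and only mirrors the behaviour described above. How the library orders points with identical times is not stated in this page, so the stable sort used here is an assumption.

```python
def combine(frames, lcnum_col="lcnum", sort=True):
    """Concatenate row lists, tag each row with its source index, optionally sort by time."""
    combined = []
    for lcnum, rows in enumerate(frames):
        for row in rows:
            # Copy the row and record which input (0, 1, ...) it came from.
            combined.append({**row, lcnum_col: lcnum})
    if sort:
        combined.sort(key=lambda r: r["t"])  # list.sort is stable
    return combined


first = [{"t": 1.0, "mag": 10.0}, {"t": 3.0, "mag": 10.2}]
second = [{"t": 2.0, "mag": 10.1}]

merged = combine([first, second])
print([(r["t"], r["lcnum"]) for r in merged])    # [(1.0, 0), (2.0, 1), (3.0, 0)]

unsorted = combine([first, second], sort=False, lcnum_col="segment")
print([(r["t"], r["segment"]) for r in unsorted])  # [(1.0, 0), (3.0, 0), (2.0, 1)]
```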
LightCurve.from_dataframe(df, name='')¶
Wrap an existing pd.DataFrame. Any DataFrame is accepted — columns named t,
mag, and err are treated as the standard VARTOOLS vectors when present;
all other columns are preserved as auxiliary variables. None of the three
standard columns are required.
import pandas as pd
df = pd.read_csv("EXAMPLES/example.csv") # columns: t, mag, err
lc = vt.LightCurve.from_dataframe(df, name="example")
LightCurve.from_timeseries(ts, mag_col='mag', err_col='err', name='')¶
Build a LightCurve from an astropy.timeseries.TimeSeries object. The time
column is extracted from ts.time and converted to numeric days (BJD or the
native scale of the Time object). mag_col and err_col select the
magnitude and uncertainty columns by name.
import numpy as np
from astropy.timeseries import TimeSeries
from astropy.time import Time
import astropy.units as u
# Build a TimeSeries from arrays (e.g. after reading a mission file with astropy)
times = Time(2459000 + np.linspace(0, 30, 300), format="jd", scale="tdb")
ts = TimeSeries(time=times)
ts["sap_flux"] = (10.5 + 0.08 * np.sin(2 * np.pi * np.linspace(0, 30, 300) / 2.3)) * u.mag
ts["sap_flux_err"] = np.full(300, 0.005) * u.mag
lc = vt.LightCurve.from_timeseries(ts, mag_col="sap_flux", err_col="sap_flux_err")
Properties¶
| Property | Type | Description |
|---|---|---|
| .t | np.ndarray or None | Time values (days), or None if the column is absent. |
| .mag | np.ndarray or None | Magnitude (or flux) values, or None if absent. |
| .err | np.ndarray or None | Per-point magnitude uncertainties, or None if absent. |
| .name | str | String label for this light curve. |
| .scalars | dict[str, float] | Per-star scalar variables (see below). |
| .cols | list[str] | All column names, including auxiliary columns (see below). |
| .shape | tuple[int, int] | (n_observations, n_columns); mirrors DataFrame.shape. |
| .fitsheader | astropy.io.fits.Header or None | Preserved FITS metadata (see below). |
All three array properties return None when the corresponding column is not
present in the underlying DataFrame. To modify data, create a new LightCurve
from the modified arrays.
Auxiliary columns¶
Extra columns beyond the standard t / mag / err (e.g. airmass,
pixel coordinates, quality flags, per-point reference mags) are stored
alongside the standard three in the same backing DataFrame. Access
them by name with lc[col]:
import numpy as np
import pyvartools as vt
t = np.arange(100, dtype=float)
mag = 10.0 + 0.01 * np.sin(t / 5)
err = np.full(100, 0.01)
airmass = 1.5 + 0.001 * t
lc = vt.LightCurve.from_arrays(
    t, mag, err,
    aux={"airmass": airmass},
    name="star1",
)
# List all columns
print(lc.cols) # ['t', 'mag', 'err', 'airmass']
# Test for membership
print("airmass" in lc) # True
print("missing" in lc) # False
# Read a column — returns a numpy array (no copy when the dtype is
# numpy-compatible). `lc['t']` is equivalent to `lc.t` for the
# three standard columns.
print(lc["airmass"][:3]) # [1.5 1.501 1.502]
Indexing with a missing name raises KeyError; indexing with a
non-string raises TypeError. For row-level access, use
lc.to_dataframe() (which returns a copy).
When the LightCurve is passed to a Pipeline, pyvartools automatically
builds a -inputlcformat flag listing every column name, so any
vartools command (for example expr, changevariable, linfit) can
reference the auxiliary column by its name:
# The 'airmass' column is visible to -expr because it's in the DataFrame.
result = lc.expr("detrended = mag - 0.05*airmass")
print("detrended" in result.lc.to_dataframe().columns) # True
.scalars¶
The canonical per-star scalar store. A dictionary of scalar variables
associated with this light curve, populated automatically by pyvartools
when a run captures its output LC (capture_lc=True) and used as input
to the next segment in a chain.
For single-LC runs, read the scalars directly as result.lc.scalars;
for batches, BatchResult.lcscalars collates every LC's dict into a
DataFrame. The values live here, not in the Result, so to change
what flows into subsequent chain segments mutate result.lc.scalars
directly (or build a new LightCurve with a different scalars=
kwarg).
Its primary role is to carry state across chained pyvartools commands:
when a Result.LS(...).expr(...) chain runs, the prior segment's
output columns (and any user-defined scalars) are attached to the next
segment's input LightCurve via this dict, so downstream analytic
expressions can reference them by name.
Keys are the raw vartools variable names (for OUTCOLUMN values carried
forward this is the name with the _N suffix, e.g.
"LS_Period_1_0"; for user-defined scalars it is the bare name, e.g.
"myvar"). Values are Python float / int scalars.
# Typical flow — users rarely touch .scalars directly; it's populated
# by the chain machinery and harvested from -printallscalars output.
r1 = lc.LS(0.5, 10.0, 0.1) # r1.lc.scalars is empty on first run
r2 = r1.expr("doubled=2*LS_Period_1_0", vartype="scalar")
# Inside the .expr() call, pyvartools attaches r1's output values to the
# input LightCurve's .scalars and injects them into the new vartools run
# via `-expr const`. See the Chaining API docs for details.
r2.lc.scalars["doubled"] # == 2 * r1.vars["LS_Period_1_0"]
You can also construct a LightCurve with an explicit scalars dict — useful
for manual chaining or for seeding values in tests:
lc = vt.LightCurve.from_arrays(
    t=t, mag=mag, err=err,
    scalars={"P0": 1.234, "offset": 0.05},
)
# Subsequent commands referencing "P0" or "offset" in expressions will
# resolve to these values.
.fitsheader¶
The FITS header metadata carried by this light curve, or None if it
did not come from a FITS file. Stored as an astropy.io.fits.Header
instance — a proper FITS header, not a plain dict, so it supports
COMMENT / HISTORY cards, keyword comments, and correct card
ordering.
When LightCurve.from_file() reads a FITS file, the primary HDU's
header is merged with the data-extension HDU's header (extension wins
on conflict), structural keywords are filtered out, and the result is
stored here. When .to_file() writes the light curve back to FITS,
these preserved keywords are re-emitted onto the output's primary
HDU. The data extension gets freshly-derived structural
keywords from the current DataFrame — so the column layout
matches what the LightCurve actually contains.
Keywords that are stripped on read and write (they must be redetermined from the current data):
- Global structure: SIMPLE, BITPIX, EXTEND, XTENSION, PCOUNT, GCOUNT, TFIELDS, NAXIS, NAXISn, EXTNAME, EXTVER, END.
- Per-column: TTYPEn, TFORMn, TDISPn, TSCALn, TZEROn, TNULLn, TBCOLn, TUNITn, TCTYPn, TCRPXn, TCRVLn, TCDLTn, TCUNIn, TCROTn, TDIMn.
Everything else round-trips faithfully, including observational
metadata (TELESCOP, OBJECT, DATE-OBS, …) and COMMENT /
HISTORY cards.
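The merge-then-filter behaviour can be sketched with plain dicts standing in for FITS headers. This is only an illustration under that simplification: the names is_structural and merge_headers are hypothetical, and a dict cannot represent COMMENT / HISTORY cards or keyword comments the way astropy.io.fits.Header does.

```python
import re

# Keyword lists copied from the description above.
GLOBAL_KEYS = {"SIMPLE", "BITPIX", "EXTEND", "XTENSION", "PCOUNT", "GCOUNT",
               "TFIELDS", "NAXIS", "EXTNAME", "EXTVER", "END"}
PER_COLUMN = ("TTYPE", "TFORM", "TDISP", "TSCAL", "TZERO", "TNULL", "TBCOL",
              "TUNIT", "TCTYP", "TCRPX", "TCRVL", "TCDLT", "TCUNI", "TCROT",
              "TDIM")
# NAXISn and the per-column keywords carry a numeric suffix.
INDEXED = re.compile(r"^(NAXIS|" + "|".join(PER_COLUMN) + r")\d+$")


def is_structural(key):
    """True for keywords that describe table layout rather than the observation."""
    key = key.upper()
    return key in GLOBAL_KEYS or bool(INDEXED.match(key))


def merge_headers(primary, extension):
    """Merge two header dicts (extension wins), then drop structural keywords."""
    merged = {**primary, **extension}
    return {k: v for k, v in merged.items() if not is_structural(k)}


primary = {"SIMPLE": True, "NAXIS": 0, "TELESCOP": "HATPI", "OBJECT": "old"}
extension = {"XTENSION": "BINTABLE", "NAXIS1": 24, "TTYPE1": "time",
             "OBJECT": "V1234 Cyg"}
print(merge_headers(primary, extension))
# {'TELESCOP': 'HATPI', 'OBJECT': 'V1234 Cyg'}
```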
import os, tempfile
import pyvartools as vt
from astropy.io import fits
# EXAMPLES/10.fits has columns named "time", "mag", "err".
lc = vt.LightCurve.from_file("EXAMPLES/10.fits",
                             t_col="time", mag_col="mag", err_col="err")
# Edit / annotate like a regular astropy Header
lc.fitsheader["TELESCOP"] = "HATPI"
lc.fitsheader["OBJECT"] = "V1234 Cyg"
lc.fitsheader.add_history("Filtered and detrended with pyvartools")
# Write it out — observational keys go on the primary HDU,
# column structure is regenerated on the extension HDU.
with tempfile.TemporaryDirectory() as td:
    out = os.path.join(td, "annotated.fits")
    lc.to_file(out)
    # Verify the round-trip preserved everything that matters.
    # This must happen inside the block, before the directory is removed.
    with fits.open(out) as hdul:
        print(hdul[0].header["TELESCOP"])  # HATPI
        print(hdul[0].header["OBJECT"])    # V1234 Cyg
        print(hdul[1].header["TFIELDS"])   # 3 (re-derived, not preserved)
You can also attach a header to a LightCurve built from arrays:
import numpy as np
from astropy.io import fits
hdr = fits.Header()
hdr["TELESCOP"] = "HATPI"
hdr["OBJECT"] = "Synthetic"
t = np.linspace(0, 30, 300)
lc = vt.LightCurve.from_arrays(
    t, 10.0 + 0.05 * np.sin(t), np.full(300, 0.02),
    fitsheader=hdr,
)
Passing a plain dict is accepted too — it is converted to a
Header internally.
Conversion methods¶
.to_dataframe() → pd.DataFrame¶
Return the full underlying DataFrame, including any auxiliary columns. The
index is a default integer index; the three core columns, when present, are
named t, mag, and err.
.to_arrays() → (t, mag, err)¶
Return the three core arrays as a tuple of NumPy arrays. Equivalent to
(lc.t, lc.mag, lc.err) but convenient for unpacking; as with those
properties, the entry for an absent column is None.
.to_timeseries() → astropy.timeseries.TimeSeries¶
Convert to an astropy TimeSeries. The t column is converted to an
astropy.time.Time object (scale 'tdb', format 'jd'). Magnitude and
uncertainty are stored as columns named mag and err. Auxiliary columns are
included as additional columns.
import tempfile
ts = lc.to_timeseries()
with tempfile.NamedTemporaryFile(suffix=".ecsv", delete=False) as f:
    ts.write(f.name, format="ascii.ecsv", overwrite=True)
.to_file(path, format=None) — serialise to disk¶
Write the light curve to an ASCII or FITS file. format is
auto-detected from the file extension when omitted (.fits/.fit/.fts
→ FITS; everything else → ASCII). ASCII output is whitespace-separated
with 10 decimal places of precision and no header row, matching the
vartools default. FITS output requires astropy.
import tempfile, pyvartools as vt
lc = vt.LightCurve.from_file("EXAMPLES/2")
with tempfile.TemporaryDirectory() as td:
    lc.to_file(f"{td}/out.txt") # ASCII (auto by extension)
    lc.to_file(f"{td}/out.dat", format="ascii") # ASCII (explicit)
    lc.to_file(f"{td}/out.fits") # FITS (auto by extension)
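For a feel of what the ASCII output looks like, the sketch below formats equal-length columns as whitespace-separated rows with no header line. It is not the library's writer: format_rows is a hypothetical helper, and both the single-space separator and the fixed-point reading of "10 decimal places" (the .10f format) are assumptions.

```python
def format_rows(columns):
    """Format equal-length columns as whitespace-separated text, no header row."""
    lines = []
    for row in zip(*columns):
        # Assumed: fixed-point notation with 10 digits after the decimal point.
        lines.append(" ".join(f"{value:.10f}" for value in row))
    return "\n".join(lines) + "\n"


t = [0.0, 0.5]
mag = [10.0, 10.125]
err = [0.01, 0.01]
text = format_rows([t, mag, err])
print(text, end="")
# 0.0000000000 10.0000000000 0.0100000000
# 0.5000000000 10.1250000000 0.0100000000
```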
Utility methods¶
.plot(ax=None, **kwargs) → matplotlib.axes.Axes¶
Quick-look plot of the light curve. Uses ax.errorbar when an err
column is present, otherwise falls back to ax.plot. The y-axis is
inverted automatically to match astronomical magnitude convention; the
light curve's name is used as the plot title when set. Any keyword
arguments are forwarded to the underlying errorbar / plot call, so
defaults like fmt, markersize, color, etc. can be overridden.
Raises ValueError if the light curve has no t or mag column.
Requires matplotlib.
import matplotlib
matplotlib.use("Agg") # headless backend — skip for interactive use
import matplotlib.pyplot as plt
import pyvartools as vt
lc = vt.LightCurve.from_file("EXAMPLES/2")
# Default quick-look with errorbars
ax = lc.plot()
# Override styling and draw on existing axes (here a 6x3-inch figure, colour "C0")
fig, ax = plt.subplots(figsize=(6, 3))
lc.plot(ax=ax, color="C0", markersize=2)
Special methods¶
| Operation | Result |
|---|---|
| len(lc) | Number of data points. |
| lc[col] | Column as a numpy array (KeyError if missing, TypeError for non-string keys). |
| col in lc | True if col is a column name. |
| repr(lc) | Summary string, e.g. LightCurve(name='my_star', n=300, cols=['t', 'mag', 'err']). |
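The operations in this table correspond to Python's standard container protocol. The toy class below shows one way such behaviour can be wired up. ColumnStore is a hypothetical stand-in that holds plain lists, not the actual LightCurve implementation.

```python
class ColumnStore:
    """Toy stand-in for the column-access protocol; not pyvartools code."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def __len__(self):
        # Number of data points = length of any column (0 when there are none).
        first = next(iter(self._columns.values()), [])
        return len(first)

    def __contains__(self, col):
        return col in self._columns

    def __getitem__(self, col):
        if not isinstance(col, str):
            raise TypeError(f"column key must be a string, got {type(col).__name__}")
        return self._columns[col]  # a missing name raises KeyError

    def __repr__(self):
        return f"ColumnStore(n={len(self)}, cols={list(self._columns)})"


store = ColumnStore({"t": [0.0, 1.0], "mag": [10.0, 10.1]})
print(len(store), "mag" in store, "err" in store)  # 2 True False
print(store)  # ColumnStore(n=2, cols=['t', 'mag'])
```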
Full example¶
import numpy as np
import pyvartools as vt
# From file (auto-detects ASCII)
lc = vt.LightCurve.from_file("EXAMPLES/2")
# From FITS — t_col/mag_col/err_col are required (no defaults). Pass the
# matching FITS column names, or None for any LC that doesn't have that
# column.
lc = vt.LightCurve.from_file("EXAMPLES/example.fits",
t_col="BJD", mag_col="Mag", err_col="Err")
# From arrays
t = np.linspace(0, 30, 300)
mag = 10.0 + 0.1*np.sin(2*np.pi*t/2.3)
err = np.full(300, 0.01)
lc = vt.LightCurve.from_arrays(t, mag, err, name="my_star")
print(lc) # LightCurve(name='my_star', n=300, cols=['t', 'mag', 'err'])
print(len(lc)) # 300
# Round-trip through a DataFrame
df = lc.to_dataframe()
lc2 = vt.LightCurve.from_dataframe(df, name="my_star")