Control Data Flow
Save / restore light-curve state, emit output columns or files, and take branching decisions inside a pipeline.
savelc / restorelc — Light-curve state snapshots
Syntax
Description
Checkpoint and restore the in-memory light-curve state. savelc saves a snapshot of the current light curve state; restorelc restores a previous snapshot. Useful for running multiple analysis branches on the same underlying data without re-reading the file from disk, and for undoing a destructive transformation.
Each savelc call is numbered in the order it appears (1, 2, 3, …). restorelc(savenumber=N) restores the N-th save point. Conditional constructs (-if / -elif / -else / -fi) are ignored by savelc and restorelc — they always execute.
CLI equivalent: -savelc / -restorelc.
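The numbering rule can be pictured with a small stand-alone model. This is only a sketch of the semantics described above: `SnapshotStore` is a hypothetical class written for illustration, not part of pyvartools, and a plain dict stands in for the light-curve state.

```python
import copy


class SnapshotStore:
    """Toy model of numbered save points (hypothetical, not pyvartools code)."""

    def __init__(self):
        self._snapshots = []

    def save(self, state):
        # Save points are numbered in order of appearance, starting at 1.
        self._snapshots.append(copy.deepcopy(state))
        return len(self._snapshots)

    def restore(self, savenumber):
        # savenumber is 1-based, mirroring restorelc(savenumber=N).
        if not 1 <= savenumber <= len(self._snapshots):
            raise IndexError(f"no save point {savenumber}")
        return copy.deepcopy(self._snapshots[savenumber - 1])


store = SnapshotStore()
lc = {"mag": [10.0, 10.1, 25.0, 9.9]}

first = store.save(lc)                           # save point 1: raw data
lc["mag"] = [m for m in lc["mag"] if m < 20.0]   # stand-in for a destructive clip
second = store.save(lc)                          # save point 2: clipped data

raw = store.restore(1)       # the raw data again, outlier included
clipped = store.restore(2)   # the clipped data
```

Restoring returns a copy, so later transformations do not alter the stored snapshot and the same save point can be restored more than once.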
Parameters
| Parameter | Type | Description |
|---|---|---|
| savenumber | int | (restorelc only.) Index of the savelc snapshot to restore (1-based). |
| vars | str or list[str] or None | (restorelc only.) Restore only the named variables instead of the full light curve. |
Output
savelc and restorelc produce no output of their own — they only mutate the in-memory light-curve state seen by subsequent commands.
Examples
import pyvartools as vt

lcs = [vt.LightCurve.from_file(f"EXAMPLES/{i}") for i in range(1, 11)]
# Run LS and BLS with different clipping levels; restore between them
pipe = (vt.Pipeline()
        .savelc()                    # save point 1: unclipped data
        .clip(5.0)
        .savelc()                    # save point 2: after clip(5.0)
        .LS(0.1, 100.0, 0.1, npeaks=3, clip=5.0, clipiter=1)
        .aov(0.1, 100.0, 0.1, 0.01, npeaks=1, clip=5.0, clipiter=1)
        .restorelc(savenumber=1)     # back to the unclipped data
        .clip(10.0)
        .BLS(0.1, 20.0, rmin=0.01, rmax=0.1, nbins=200, nfreq=10000, npeaks=1)
        .restorelc(savenumber=2)     # back to the clip(5.0) state
        .changeerror())
batch = pipe.run_batch(lcs, nthreads=4)
print(batch.vars)
o — Output light curve
Syntax
cmd.o(outname=None, outdir=None, nameformat=None, columnformat=None,
      allcols=False, fits=False, noclobber=False, copyheader=False,
      namecommand=None, namefromlist=None, changesuffix=None,
      delimiter=None, logcommandline=False, gzip=False, bzip2=False,
      capture=False, key="o")
Description
Write the current light curve to a file, capture it back into a Result object, or both. Useful for saving intermediate state mid-pipeline (e.g. after sigma-clipping or filtering) and for the final output of a processed batch.
The CLI -o keyword takes a single positional argument that is interpreted as a filename in single-LC mode (vartools -i ...) and as a directory in list mode (vartools -l ...). pyvartools splits this dual semantics into two explicit kwargs:
- outname= is used when the pipeline runs through Pipeline.run or Pipeline.run_file (single-LC mode);
- outdir= is used when it runs through Pipeline.run_filelist, Pipeline.run_batch, or Pipeline.run_combinelcs (list mode).
A pipeline that supplies both can be reused in either mode; if the wrong one is supplied for the run method invoked, a clear RuntimeError is raised at run time.
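The selection rule can be sketched as a small stand-alone function. `resolve_output_target` is a hypothetical helper written for illustration; it is not a pyvartools API and the real error message may differ. It assumes that supplying neither kwarg is allowed (capture-only use) and that only a mismatch is an error.

```python
SINGLE_LC_METHODS = {"run", "run_file"}
LIST_METHODS = {"run_filelist", "run_batch", "run_combinelcs"}


def resolve_output_target(run_method, outname=None, outdir=None):
    """Return the kwarg that applies to run_method (hypothetical helper)."""
    if run_method in SINGLE_LC_METHODS:
        wanted, other, label = outname, outdir, "outname"
    elif run_method in LIST_METHODS:
        wanted, other, label = outdir, outname, "outdir"
    else:
        raise ValueError(f"unknown run method: {run_method}")
    # Only the kwarg for the other mode was given: report it at run time.
    if wanted is None and other is not None:
        raise RuntimeError(f"{run_method} needs {label}= for file output")
    return wanted


# A pipeline that supplies both kwargs works in either mode.
single_target = resolve_output_target("run", outname="2.clipped", outdir="OUTDIR1")
batch_target = resolve_output_target("run_batch", outname="2.clipped", outdir="OUTDIR1")
```

Here `single_target` is the filename and `batch_target` is the directory, which is why supplying both keeps one pipeline reusable across run methods.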
cmd.o can be used in three modes:
- Write to disk only (outname= or outdir= set, capture=False): the LC is saved to disk.
- Capture only (capture=True, no outname/outdir): the LC is captured into result.files[key]. In library mode this is purely in-memory (vartools snapshots the LC variables into a buffer keyed by key; no temporary file is ever written). In subprocess fallback mode the LC is written to a mode-appropriate temporary path and read back; pyvartools cleans the temp file up.
- Write and capture (capture=True plus outname= or outdir=): both saved to disk and captured.
When capture=True and no explicit columnformat is given, pyvartools passes the allcols flag to -o so the captured DataFrame contains every LC-vector variable registered up to that point in the pipeline (matching the library-mode fast path). If columnformat is given, the captured DataFrame uses the variable names listed in it.
CLI equivalent: -o.
Parameters
| Parameter | Type | Description |
|---|---|---|
| outname | str or None | Output filename for single-LC runs (Pipeline.run / run_file). Use "-" for stdout. |
| outdir | str or None | Output directory for list/batch runs (Pipeline.run_filelist / run_batch / run_combinelcs). Per-LC filenames are constructed inside it. Combine with outname=PerLC([...]) to override the per-LC basename from Python; see Per-LC output filenames. |
| nameformat | str or None | Format string for output filenames in list mode, e.g. "file_%s_%05d.txt" (%s = LC basename, %d = sequence number). Ignored in single-LC mode. |
| columnformat | str or None | Output column spec, e.g. "t:%17.9f,mag:%9.5f,err:%9.5f". |
| allcols | bool | Write every light-curve-vector variable defined by commands before this cmd.o in the pipeline, with a type-appropriate default printf format and a # name1 name2 … header line for ASCII output. Mutually exclusive with columnformat. Handy when a prior command has created new vectors (e.g. cmd.Phase(..., phasevar="ph"), cmd.linfit(..., modelvar="m")) that you want to capture without listing each one. Default False. |
| fits | bool | Write output in FITS binary table format. |
| noclobber | bool | Do not overwrite an existing output file. |
| copyheader | bool | Copy the FITS header from the input file to the output file. |
| namecommand | str or None | Shell command used to generate the output filename dynamically (list mode only). |
| namefromlist | bool, str, or None | Derive output filename from the input list (list mode only). True uses the default column; a string specifies a per-LC variable name to read the basename from. To supply per-LC output names from Python rather than from a column already on the list file, pass the names via perlc_vars on run_batch() or LightCurveBatch.run(); see Per-LC values from Python. |
| changesuffix | tuple[str, str] or None | After the default basename has been built, strip a trailing old_suffix (if present) and append new_suffix. Either may be empty. Applied before any fits / gzip / bzip2 suffix. Mutually exclusive with nameformat / namecommand / namefromlist. E.g. changesuffix=(".fits", ".txt") rewrites foo.fits → foo.txt. List-mode only. |
| delimiter | str or None | Column delimiter character for the output file (default: whitespace). |
| logcommandline | bool | Record the full pipeline invocation into the output file header (for provenance). |
| gzip / bzip2 | bool | Compress the output. gzip and bzip2 are mutually exclusive. The corresponding .gz / .bz2 extension is appended if not already present, and the data are piped through the gzip or bzip2 external program (must be on PATH). Combined with fits=True, gzip=True produces a gzip-compressed FITS file via cfitsio's native .fits.gz driver; bzip2=True cannot be combined with fits=True. Compression cannot be combined with stdout (outname="-") when fits=True. Compressed inputs (.gz, .Z, .bz2 for ASCII; .fits.gz, .fits.fz, .fits.Z, .fits.bz2 for FITS) are auto-detected and decompressed on read. |
| capture | bool | If True, capture the written light curve into result.files[key]. For single-LC runs this is a LightCurve; for batch runs it is a list of LightCurve objects. When neither outname nor outdir is supplied, the output goes to a mode-appropriate temporary path that is cleaned up automatically. Default False. |
| key | str | Key under which the captured LC(s) appear in result.files. Default "o". Use a unique key when the pipeline contains more than one cmd.o(capture=True). |
Caveat — FITS input, ASCII output. In batch mode, when the input list contains FITS light curves and the output is ASCII (fits=False), the default output filename follows the input basename — so kplr.fits is written as ASCII to a file also named kplr.fits. The .fits suffix is then misleading: the file holds plain text. Use changesuffix to rewrite it:
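Following the parameter table, the fix for the kplr.fits case is to pass changesuffix=(".fits", ".txt") to cmd.o together with outdir=. The renaming rule itself can be sketched in plain Python; `change_suffix` below is a hypothetical stand-alone function that mirrors the documented behaviour (strip the old suffix if present, then append the new one) and is not the pyvartools implementation.

```python
def change_suffix(basename, old_suffix, new_suffix):
    """Mimic the documented changesuffix rule on one basename (illustrative)."""
    # Strip the trailing old suffix only when it is actually present.
    if old_suffix and basename.endswith(old_suffix):
        basename = basename[: -len(old_suffix)]
    # The new suffix is appended whether or not anything was stripped.
    return basename + new_suffix


renamed = change_suffix("kplr.fits", ".fits", ".txt")   # FITS input, ASCII output
no_match = change_suffix("kplr", ".fits", ".txt")       # nothing to strip
stripped = change_suffix("kplr.fits", ".fits", "")      # empty new suffix
```

Because either suffix may be empty, the same rule covers pure stripping and pure appending.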
For arbitrary renaming (multi-part suffixes, inserting tags, etc.), use namecommand with a sed command.
Output
cmd.o writes one light-curve file per LC and adds no columns to the per-LC statistics table.
When capture=True:
| File key | Description |
|---|---|
| result.files[key] | Captured light curve. For single-LC runs, a LightCurve; for batch runs, a list of LightCurve objects (one per input LC, None if the file is missing). The default key is "o"; set key="..." to disambiguate multiple capturing cmd.o calls in the same pipeline. |
Examples
import pyvartools as vt
from pyvartools import commands as cmd

lc = vt.LightCurve.from_file("EXAMPLES/2")

# Capture the intermediate light curve state (no file left on disk)
pipe = (vt.Pipeline()
        .clip(5.0)
        .o(capture=True, key="clipped")
        .LS(0.1, 10.0, 0.1, npeaks=1))
result = pipe.run(lc)
clipped_lc = result.files["clipped"]   # LightCurve after sigma-clipping
print(result.vars["LS_Period_1_2"])    # clip=0, o=1, LS=2

# Write to disk AND capture
result2 = (vt.Pipeline()
           .clip(5.0)
           .o(outname="EXAMPLES/OUTDIR1/2.clipped", capture=True, key="clipped")).run(lc)
# File written to disk and also available as result2.files["clipped"]

# Multiple intermediate snapshots
pipe3 = (vt.Pipeline()
         .clip(5.0)
         .o(capture=True, key="after_clip")
         .medianfilter(0.05)
         .o(capture=True, key="after_filter"))
result3 = pipe3.run(lc)
after_clip = result3.files["after_clip"]
after_filter = result3.files["after_filter"]
# Batch: result.files[key] is a list of LightCurves, one per input LC
# (here key="phased"; the default key is "o")
lcs = [vt.LightCurve.from_file(f"EXAMPLES/{i}") for i in range(1, 11)]
batch = (vt.Pipeline()
         .LS(0.1, 100.0, 0.1, npeaks=1)
         .expr("phase=t")
         .changevariable("t", "phase")
         .Phase(period="ls")
         .o(capture=True, key="phased")).run_batch(lcs)
phased_lcs = batch.files["phased"]   # list of 10 LightCurve objects

# Write to a named directory with a custom nameformat. Both run_filelist
# (paths) and run_batch (in-memory LightCurves) work: the per-LC output
# basenames come from the input filename for run_filelist, or from each
# LC's .name attribute for run_batch.
(vt.Pipeline()
 .LS(0.1, 100.0, 0.1, npeaks=1)
 .Phase(period="ls")
 .o(outdir="EXAMPLES/OUTDIR1",
    nameformat="file_%s_%05d_simout.txt",
    columnformat="t:%11.5f,mag:%7.4f,err:%7.4f")
 ).run_filelist([f"EXAMPLES/{i}" for i in range(1, 11)])
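To preview what a nameformat string expands to, Python's own printf-style formatting gives a reasonable approximation. This is a sketch under two assumptions that the text above does not confirm: that the sequence number starts at 1, and that the format contains exactly one %s followed by one %d. `expand_nameformat` is a hypothetical helper, not a pyvartools function.

```python
def expand_nameformat(nameformat, basename, seqnum):
    """Approximate a nameformat expansion with Python % formatting.

    Assumes one %s (LC basename) followed by one %d-style field
    (sequence number), as in the example above.
    """
    return nameformat % (basename, seqnum)


basenames = [str(i) for i in range(1, 4)]   # EXAMPLES/1, EXAMPLES/2, EXAMPLES/3
preview = [
    expand_nameformat("file_%s_%05d_simout.txt", name, seq)
    for seq, name in enumerate(basenames, start=1)   # assumed 1-based numbering
]
```

Under these assumptions the first light curve would be written as file_1_00001_simout.txt inside the chosen outdir.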
Performance: library mode
Most cmd.o configurations run in pyvartools' in-process library
mode when libvartoolspipeline is installed, skipping the per-call
subprocess fork:
- outname=PATH (single-LC) and outdir=DIR (batch): library mode writes the file directly from inside the C call; the output file is byte-identical to subprocess mode.
- capture=True with no path: library mode handles this entirely in memory. No temporary file is written, no temp directory is allocated. Multiple cmd.o(capture=True) snapshots at distinct points in one pipeline are produced in a single library call. Per-call cost is typically ~2 ms vs ~50 ms for subprocess.
capture=True combined with outname=/outdir= (write and capture
both) also runs in library mode via the C-side capture_id keyword:
the file is written and the post-write LC arrays are pulled into
result.files[key] in one library call.
Pipelines mixing cmd.o(...) with auxiliary save_*=True outputs
(e.g. save_periodogram) also run in library mode: the C-side
writers fopen/fwrite into a per-Pipeline tmpdir during the
in-process call, and pyvartools reads the files back with the
existing parsers. Performance is between full library mode and
subprocess (still does the disk I/O for the side-output files);
point TMPDIR at /dev/shm to make those writes RAM-only.
Per-LC output filenames
Batch runs (run_batch, LightCurveBatch.run(), run_filelist with
in-memory LCs) default to <outdir>/<lc.name> for each LC's output
file. To pick the per-LC basenames from Python instead — without
needing a list file or perlc_vars plumbing — wrap a list of strings
in PerLC and pass it as outname:
import os, tempfile
import pyvartools as vt
from pyvartools import commands as cmd
lcs = [vt.LightCurve.from_file(f"EXAMPLES/{i}") for i in range(1, 4)]
outdir = tempfile.mkdtemp(prefix="cmd_o_perlc_outname_")
names = [f"star_{i:03d}.lc" for i in range(1, 4)]
batch = vt.Pipeline([
    cmd.clip(5.0),
    cmd.o(outdir=outdir, outname=vt.PerLC(names), allcols=True),
]).run_batch(lcs)
assert sorted(os.listdir(outdir)) == sorted(names)
The PerLC list must have one entry per light curve in the batch.
Both outdir= and the PerLC outname= are required together;
outname=PerLC([...]) without outdir= raises a clear ValueError
at run-batch entry. The same pipeline works in subprocess mode
(VARTOOLS_USE_LIBRARY=0) and produces byte-identical output files.
Multiple cmd.o(outname=PerLC(...)) instances in a single pipeline
each map to their own outdir with their own per-LC names — pyvartools
allocates a distinct synthetic inlist variable per cmd.o. Combining
with capture=True also works: the file is written under
<outdir>/<name_i> and the LC is captured into result.files[key].
The auto-rewrite (which translates outname=PerLC([...]) into a
synthetic namefromlist + per-call inlist update) leaves the user's
cmd.o instance unchanged after the run via try/finally, so a
Pipeline reused across multiple calls behaves identically each time.
print_cols — Emit user-computed variables to the output table
Syntax
Description
Include the values of one or more user-computed variables (e.g. results of expr commands, per-LC variables from perlc_vars, or carried-forward scalars) as additional columns in the per-LC statistics table. This is the primary way to surface user-defined scalars in the final results table. Light-curve vectors are reduced to their first element.
The Python class is named print_cols because print is a Python built-in.
CLI equivalent: -print.
Parameters
| Parameter | Type | Description |
|---|---|---|
| variables | str or list[str] | Variable names to print. Pass a comma-separated string or a list of strings. |
| columnnames | str or list[str] or None | Override the auto-generated column names. Default names are Print_<var>_<idx>_<cmd>. |
| fmt | str or list[str] or None | Printf-style format specifiers for each column (e.g. "%.6f"). |
Output
Per command index N:
| Column | Description |
|---|---|
| Print_<var>_<j>_N | Value of the j-th requested variable (0-based). Replaced by the entry in columnnames when supplied. The trailing _N suffix is omitted when -columnsuffix is in effect for the pipeline. |
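The naming pattern can be spelled out with a short stand-alone function. `default_print_columns` is a hypothetical helper that only reproduces the pattern stated in the table above; it does not call pyvartools and ignores the -columnsuffix case.

```python
def default_print_columns(variables, cmd_index, columnnames=None):
    """Build column names following Print_<var>_<j>_N (illustrative only)."""
    if isinstance(variables, str):
        variables = [v.strip() for v in variables.split(",")]
    if columnnames is not None:
        # User-supplied names replace the generated ones.
        if isinstance(columnnames, str):
            columnnames = [c.strip() for c in columnnames.split(",")]
        return list(columnnames)
    # j is the 0-based position of the variable; N is the command index.
    return [f"Print_{var}_{j}_{cmd_index}" for j, var in enumerate(variables)]


# print_cols as the second command in a pipeline (command index 1).
cols = default_print_columns("name,x,y", cmd_index=1)
custom = default_print_columns("name,x,y", cmd_index=1, columnnames="id,xpos,ypos")
```

Looking up a printed value in the results table then means indexing by one of these generated names, or by the override passed in columnnames.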
Examples
Print five variables to the output statistics table — the LC name, x/y coordinates from the input list, the RMS_0 output column, and the first value of mag (since mag is a per-LC vector, print_cols emits its first element).
import pyvartools as vt

batch = (vt.Pipeline()
         .rms()
         .print_cols("name,x,y,RMS_0,mag",
                     fmt="%20s,%.2f,%.2f,%.3f,%.3f")
         ).run_filelist("EXAMPLES/lc_list_tfa_sr_bin",
                        perlc_vars={
                            "name": vt.PerLCColumn(col=1, type="string"),
                            "x": vt.PerLCColumn(col=2),
                            "y": vt.PerLCColumn(col=3),
                        })
addfitskeyword — Add a FITS keyword
Syntax
Description
Add a FITS header keyword to any subsequently output FITS-format light curve. Use this together with cmd.o(..., fits=True) to attach metadata derived from list-file columns, expressions, or fixed constants.
value may be a Python scalar (auto-wrapped with the "fix" keyword) or a full vartools token string such as "var myvar". dtype is one of "TDOUBLE", "TINT", "TLONG", or "TSTRING".
CLI equivalent: -addfitskeyword.
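The auto-wrapping of value can be illustrated with a few lines of plain Python. `to_value_token` is a hypothetical function that follows the rule described above (bare scalars get a "fix" prefix, strings pass through unchanged); how pyvartools formats the number internally is an assumption here.

```python
def to_value_token(value):
    """Turn a Python value into a vartools-style value token (illustrative)."""
    if isinstance(value, str):
        # Strings are taken as complete tokens, e.g. "var myvar" or "fix 3.14".
        return value
    if isinstance(value, (int, float)):
        # Bare scalars are treated as fixed constants.
        return f"fix {value}"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


fixed_float = to_value_token(3.14)
fixed_int = to_value_token(7)
from_variable = to_value_token("var myvar")
```

Passing the string form explicitly is the way to reference a variable, since a bare Python scalar can only ever become a fixed constant.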
Parameters
| Parameter | Type | Description |
|---|---|---|
| keyword | str | FITS keyword name (max 8 characters). When combinelc is set, may contain %d, replaced by each unique LC-number value. |
| dtype | str | Data type: "TDOUBLE", "TINT", "TLONG", or "TSTRING". |
| value | scalar or str | Keyword value. Bare scalars (int/float) are auto-prefixed with "fix"; pass a string like "var myvar" or "fix 3.14" to control the form explicitly. |
| comment | str or None | Optional FITS comment string. |
| hdu | str or None | "primary" (default) to place the keyword in the primary header, or "extension" for the binary-table extension header. |
| mode | str or None | "append" to add a duplicate keyword if one already exists, or "update" (default) to overwrite. |
| combinelc | str or None | Variable name holding the LC number for combined-LC mode; emits one keyword per unique value. |
Output
addfitskeyword produces no output of its own — it only attaches metadata to the next FITS file written by cmd.o(..., fits=True).
Examples
Convert ASCII EXAMPLES/1 to a FITS LC at EXAMPLES/1.tmpout.fits, attaching a string keyword TMPKEY taken from the input-list variable x (which evaluates to "HELLO" for this list line).
import pyvartools as vt

# A two-column list file ("EXAMPLES/lc_list_addfitskey" contains
# "EXAMPLES/1 HELLO") supplies the per-LC value of x via perlc_vars.
batch = (vt.Pipeline()
         .addfitskeyword("TMPKEY", "TSTRING", value="var x",
                         comment="a comment")
         .o(outdir="EXAMPLES/", nameformat="%s.tmpout.fits", fits=True)
         ).run_filelist("EXAMPLES/lc_list_addfitskey",
                        perlc_vars={"x": vt.PerLCColumn(col=2, type="string")})
changevariable — Reassign standard column roles
Syntax
Description
Reassign the internal role of one of the four standard columns — time ("t"), magnitude ("mag"), uncertainty ("err"), or image identifier ("id") — to a different named light-curve variable. Subsequent commands then use the new assignment. The original variable still exists; to restore it, issue cmd.changevariable("mag", "mag").
Useful when reading FITS files with non-standard column mappings, or for temporarily swapping the time axis to a derived variable such as phase before running a phase-folded operation.
CLI equivalent: -changevariable.
Parameters
| Parameter | Type | Description |
|---|---|---|
| column | str | Which built-in role to reassign. One of "t", "mag", "err", or "id". |
| var | str | Name of the existing light-curve column to promote to that role. |
Output
changevariable produces no output of its own — it only changes which named variable subsequent commands treat as the time/magnitude/error/id column.
Examples
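import pyvartools as vt

pipe = (vt.Pipeline()
        .LS(0.1, 100.0, 0.1, npeaks=1)
        .expr("phase=t")                 # copy time into a new variable
        .changevariable("t", "phase")    # treat phase as the time axis
        .Phase(period="ls")
        .changevariable("t", "t"))       # restore the original time axis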
ifcmd / elifcmd / elsecmd / ficmd — Conditional execution
Syntax
Description
Build a conditional block: commands placed between ifcmd(...) / elifcmd(...) / elsecmd() and the closing ficmd() execute only when the corresponding condition is satisfied. A block must be closed with ficmd(). Nested blocks are supported.
If condition evaluates to 0 (cast to integer), commands inside that branch are skipped; any non-zero integer value causes the branch to execute. Conditions may reference any per-LC scalar produced by an earlier command in the pipeline (e.g. RMS_0, Log10_LS_Prob_1_0).
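The integer cast has one consequence worth spelling out: a condition whose value lies strictly between -1 and 1 counts as false. The sketch below models the rule in plain Python, assuming the cast truncates toward zero as a C-style cast does; `branch_taken` is a hypothetical function and does not evaluate vartools expressions.

```python
def branch_taken(value):
    """Model the documented rule: cast to integer, non-zero means execute."""
    # int() truncates toward zero, assumed here to match the vartools cast.
    return int(value) != 0


comparison_true = branch_taken(1.0)    # e.g. a comparison that evaluated to 1
comparison_false = branch_taken(0.0)   # e.g. a comparison that evaluated to 0
small_fraction = branch_taken(0.5)     # truncates to 0, so the branch is skipped
negative_value = branch_taken(-2.7)    # truncates to -2, so the branch runs
```

Under this assumption it is safer to write an explicit comparison in the condition than to rely on the truthiness of a fractional quantity.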
The classes are named ifcmd / elifcmd / elsecmd / ficmd because if, elif, and else are Python reserved words; ficmd follows the same naming pattern for consistency.
CLI equivalent: -if / -elif / -else / -fi.
Caution
Conditional constructs are ignored by commands that process all light curves simultaneously (SYSREM, findblends) and by savelc / restorelc — those always execute.
Parameters
| Parameter | Type | Description |
|---|---|---|
| condition | str | (ifcmd and elifcmd only.) Analytic expression evaluated to decide whether to execute the branch. May reference any variable computed by a preceding command, light-curve vectors, or scalar constants. The branch is taken when the expression evaluates to a non-zero integer. |
Output
The conditional wrappers produce no output of their own; the columns and files written by the pipeline are determined by the commands inside each branch.
Examples
Run an LS period search and apply a harmonic filter only to light curves with strong periodicity.
import pyvartools as vt

batch = (vt.Pipeline()
         .LS(0.5, 20.0, 4.0, npeaks=1)
         .ifcmd("Log10_LS_Prob_1_0 < -10.0")
         .harmonicfilter("ls", nharm=2, nsubharm=0,
                         save_model="EXAMPLES/OUTDIR1")
         .ficmd()
         ).run_filelist("EXAMPLES/lc_list")
Apply different statistics depending on the values returned by rms. Each branch runs in the same vartools invocation, so output-column suffixes keep their positional indices across branches.
import pyvartools as vt

lcs = [vt.LightCurve.from_file(f"EXAMPLES/{i}") for i in range(1, 11)]
pipe = (vt.Pipeline()
        .rms()
        .ifcmd("RMS_0>10*Expected_RMS_0")
        .stats("mag", "stddev")
        .elifcmd("Npoints_0>3000")
        .stats("mag", "kurtosis")
        .elsecmd()
        .rms()
        .ficmd())
batch = pipe.run_batch(lcs)