Getting Started

This page walks through a few simple VARTOOLS examples, showing both the CLI and Python (pyvartools) approaches side by side. All examples assume that the VARTOOLS source directory is the working directory, and that the EXAMPLES/ directory is present.

Installation

If you have not yet installed VARTOOLS, see the Installation page before continuing.

1. A first VARTOOLS run

The simplest example is the calculation of the root-mean-square (RMS) scatter of a light curve.
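
For orientation, the quantity being computed can be sketched in a few lines of NumPy. This is an illustration of RMS scatter about the mean on made-up magnitudes, not the VARTOOLS implementation; in particular, dividing by N rather than N - 1 is an assumption of this sketch, and the helper name rms_scatter is hypothetical.

```python
import numpy as np

def rms_scatter(mag):
    """Return (mean, RMS scatter about the mean) for an array of magnitudes."""
    mag = np.asarray(mag, dtype=float)
    mean = mag.mean()
    # Population form (divide by N); an assumption made for this sketch.
    rms = np.sqrt(np.mean((mag - mean) ** 2))
    return mean, rms

# Hypothetical magnitudes, not taken from EXAMPLES/2.
mean_mag, rms = rms_scatter([10.0, 10.2, 9.8, 10.0])
print(f"Mean_Mag = {mean_mag:.5f}, RMS = {rms:.5f}")
```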

CLI

vartools -i EXAMPLES/2 -rms -oneline
# Output:
# Name        = EXAMPLES/2
# Mean_Mag_0  = 10.12178
# RMS_0       =  0.05012
# ...

-i EXAMPLES/2 — read a single light curve from the file EXAMPLES/2.
-rms — compute the RMS and mean magnitude.
-oneline — print results in a human readable format, with one statistic per line, per light curve.

Python

import pyvartools as vt

result = vt.rms("EXAMPLES/2")
print(result.vars["RMS_0"])
print(result.varobjs.rms.RMS)

# Alternatively, the light curve can first be loaded into a light curve
# object, and the rms can be called on the light curve object.

lc = vt.LightCurve.from_file("EXAMPLES/2")
result = lc.rms()

# Or, if your light curve is stored in another object, like a
# numpy 2-d array, you can pass that instead.  
import numpy as np
lcarray = np.loadtxt('EXAMPLES/2')
result = vt.rms(lcarray)

When a VARTOOLS command is called directly from vt, the first argument can be the name of a file to process (a string), a VARTOOLS LightCurve object, a NumPy array, a pandas DataFrame, or a tuple of time, magnitude, and error values.

result.vars is a dictionary mapping output column names to their values. Individual statistics are accessible by key.

result.varobjs offers structured, attribute-style access to the same statistics, grouped by the command that produced them (for example, result.varobjs.rms.RMS).

Light curve file format

By default, VARTOOLS expects plain-text files with columns: BJD magnitude magnitude_error. Comment lines beginning with # are ignored. See the CLI reference for other supported formats and FITS input.
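
As a quick illustration of this default layout, the sketch below writes a tiny three-column file and reads it back with NumPy. The file name example_lc.txt and all values are hypothetical; the only behavior relied on is that np.loadtxt skips lines starting with # by default.

```python
import os
import tempfile
import numpy as np

# Hypothetical three-column light curve: time (BJD), magnitude, magnitude error.
text = """# BJD mag mag_err
2455000.10 10.12 0.01
2455000.20 10.15 0.01
2455000.30 10.09 0.02
"""

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example_lc.txt")
    with open(path, "w") as fh:
        fh.write(text)
    # The '#' header line is treated as a comment and ignored.
    time, mag, err = np.loadtxt(path, unpack=True)

print(time.shape, mag.mean(), err.max())
```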

Output column index

The trailing _0 in RMS_0 is the command index: this RMS calculation is the first command in the pipeline, so its 0-based index is 0. See Output structure below for full details.

2. Chaining Commands

Commands are processed in sequence, so filtering can be combined with analysis in a single call. Here, sigma clipping is applied before the Lomb-Scargle period search.

CLI

vartools -i EXAMPLES/2 -clip 5.0 1 -LS 0.5 5.0 0.1 1 0 -oneline

-clip 5.0 1 — iteratively remove points more than 5σ from the mean; repeat until convergence (1).
-LS 0.5 5.0 0.1 1 0 — Lomb-Scargle search from 0.5 to 5.0 days with a sub-sampling of 0.1 times the natural frequency stepsize (1/T, where T is the total time baseline of the light curve); return the top 1 peak; do not output the periodogram.
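
As a worked example of the sub-sampling argument, the frequency step is the sub-sample factor times 1/T. The baseline T = 30 days below is a hypothetical value, and the count is a rough estimate of the grid size rather than the exact number of frequencies VARTOOLS evaluates.

```python
# Arguments taken from the -LS call above.
min_period, max_period, subsample = 0.5, 5.0, 0.1

# Hypothetical time baseline in days (not measured from EXAMPLES/2).
T = 30.0

f_min = 1.0 / max_period   # lowest frequency searched, cycles/day
f_max = 1.0 / min_period   # highest frequency searched, cycles/day
df = subsample / T         # frequency step: 0.1 * (1/T)

# Approximate number of trial frequencies on the grid.
n_freq = int(round((f_max - f_min) / df)) + 1
print(f"df = {df:.6f} cycles/day, about {n_freq} trial frequencies")
```

A smaller sub-sample factor gives a finer grid and a proportionally longer search.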

Python

import pyvartools as vt

lc = vt.LightCurve.from_file("EXAMPLES/2")

result = lc.clip(sigclip=5.0).LS(0.5, 5.0, 0.1)
best_period = result.varobjs.LS.Period_1

Chained commands are executed left to right. Each command receives the (possibly modified) light curve produced by the previous one.
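
To make concrete what a clipping step hands on to the next command, here is a minimal NumPy sketch of iterative sigma clipping about the mean. It is an illustration on synthetic data, not the VARTOOLS implementation; the function name sigma_clip and its iterate argument are hypothetical.

```python
import numpy as np

def sigma_clip(mag, sigclip=5.0, iterate=True):
    """Return a boolean mask of the points kept after clipping about the mean."""
    mag = np.asarray(mag, dtype=float)
    keep = np.ones(mag.size, dtype=bool)
    while True:
        mean = mag[keep].mean()
        std = mag[keep].std()
        new_keep = keep & (np.abs(mag - mean) <= sigclip * std)
        # Stop when nothing more was removed, or after one pass if not iterating.
        if new_keep.sum() == keep.sum() or not iterate:
            return new_keep
        keep = new_keep

# Synthetic data: nearly constant magnitudes plus one strong outlier.
rng = np.random.default_rng(0)
mag = 10.0 + 0.01 * rng.standard_normal(200)
mag[17] = 12.0

keep = sigma_clip(mag, sigclip=5.0)
clipped = mag[keep]  # this is what the next command in the chain would see
print(mag.size - clipped.size, "points removed")
```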

The trailing _1 in LS.Period_1 indicates that this is the period of the top peak found in the periodogram. If searching for multiple peaks, additional period values are accessible.

3. Batch Processing

VARTOOLS is designed to process large sets of light curves efficiently. Pass a list file with -l on the CLI, or construct a Pipeline and call run_batch or run_filelist in Python.

CLI

vartools -l EXAMPLES/lc_list -rms -clip 5.0 0 -rms

EXAMPLES/lc_list is a plain-text file containing one light curve path per line. The output contains one line per light curve, with the Name column identifying each file.
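
If you need to build such a list file yourself, a few lines of standard-library Python suffice. The sketch below is an assumption-laden illustration: the helper write_list_file and the file name my_lc_list are hypothetical, and the demonstration runs on a throwaway directory of empty placeholder files rather than on real light curves.

```python
import glob
import os
import tempfile

def write_list_file(pattern, list_path):
    """Write all paths matching `pattern`, sorted, one per line."""
    paths = sorted(glob.glob(pattern))
    with open(list_path, "w") as fh:
        for p in paths:
            fh.write(p + "\n")
    return paths

# Demonstrate on a temporary directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmpdir:
    for name in ("1", "2", "3", "README"):
        open(os.path.join(tmpdir, name), "w").close()
    list_path = os.path.join(tmpdir, "my_lc_list")
    paths = write_list_file(os.path.join(tmpdir, "[0-9]"), list_path)
    with open(list_path) as fh:
        lines = fh.read().splitlines()

print(len(lines), "light curves listed")
```

Sorting the paths keeps the order of the output rows reproducible from run to run, since glob does not guarantee any ordering.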

Python

import pyvartools as vt
from pyvartools import commands as cmd
import glob

lcs = [vt.LightCurve.from_file(f) for f in glob.glob("EXAMPLES/[0-9]")]
pipeline = vt.Pipeline().rms().clip(sigclip=5.0).rms()
batch = pipeline.run_batch(lcs)
print(batch.vars)

# Or process files directly from disk (no Python I/O):
batch = pipeline.run_filelist("EXAMPLES/lc_list")

For efficient batch processing in Python, construct a Pipeline object — chain the commands you want to run as method calls on vt.Pipeline() (each returns the pipeline itself, so they read left-to-right), then pass it a list of light curves via run_batch or a list of paths via run_filelist.

run_filelist passes the file list directly to the vartools binary, which can be faster for large sets because Python does not load each light curve into memory.

Parallelism

For very large batches on the CLI, use the -parallel option. The Python run_filelist method accepts an nthreads argument for parallel processing.

4. Output structure

Column naming convention

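Output statistics produced by VARTOOLS follow the general pattern below. Fields that do not apply to a given statistic are omitted, as in RMS_0, which carries only a name and a command index.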

CommandName_StatName_PeakNum_CmdIndex

| Field | Meaning |
| --- | --- |
| `CommandName` | The command that produced the statistic (e.g. `LS`, `RMS`, `BLS`) |
| `StatName` | The specific quantity (e.g. `Period`, `Power`, `SDE`) |
| `PeakNum` | For commands that return multiple peaks, the 1-based peak number |
| `CmdIndex` | The 0-based index of this command in the pipeline |

Examples:

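| Column | Meaning |
| --- | --- |
| `RMS_0` | RMS from an `-rms` command that is the first command in the pipeline (index 0) |
| `LS_Period_1_0` | Period of the top peak from an `-LS` command that is the first command in the pipeline (index 0) |
| `LS_Period_1_1` | Period of the top peak from an `-LS` command that is the second command in the pipeline (index 1) |
| `LS_Period_2_0` | Period of the second-best peak from an `-LS` command at index 0 |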
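
The convention above can be unpacked programmatically. The sketch below is a hypothetical helper, not part of pyvartools: it assumes the last underscore-separated token is always CmdIndex and that a second trailing integer, if present, is PeakNum. A statistic whose own name ends in a number would be misread, so treat it as a starting point only.

```python
def split_column(name):
    """Split an output column name into (base_name, peak_num, cmd_index).

    Assumes the last '_'-separated token is CmdIndex and that a second
    trailing integer token, if present, is PeakNum.
    """
    tokens = name.split("_")
    cmd_index = int(tokens.pop())
    peak_num = None
    if len(tokens) > 1 and tokens[-1].isdigit():
        peak_num = int(tokens.pop())
    return "_".join(tokens), peak_num, cmd_index

for col in ("RMS_0", "LS_Period_1_0", "LS_Period_2_0", "Mean_Mag_0"):
    print(col, "->", split_column(col))
```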

Parsing `-oneline` output

With -oneline, the VARTOOLS CLI prints each output statistic on its own line.

vartools -i EXAMPLES/1 -rms -oneline

This output is often easier to read than the default columnar table when many statistics are computed.
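
If you capture this output as text, it can be turned into a dictionary with a small parser. The sketch below assumes each line has the form Key = value, as in the output shown in the first example, and tolerates a leading "# ". The function parse_oneline is hypothetical, and the sample text is copied from that first example rather than produced by running the command above.

```python
def parse_oneline(text):
    """Parse 'Key = value' lines into a dict, converting numbers to float."""
    stats = {}
    for line in text.splitlines():
        line = line.lstrip("# ").strip()
        if "=" not in line:
            continue  # skip blank lines and anything that is not key = value
        key, value = (part.strip() for part in line.split("=", 1))
        try:
            stats[key] = float(value)
        except ValueError:
            stats[key] = value  # non-numeric entries, e.g. the Name column
    return stats

sample = """\
Name        = EXAMPLES/2
Mean_Mag_0  = 10.12178
RMS_0       =  0.05012
"""
stats = parse_oneline(sample)
print(stats["Name"], stats["RMS_0"])
```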

5. Key Concepts

Pipeline model. VARTOOLS reads each light curve, applies commands left to right, and emits statistics. The light curve produced by one command becomes the input to the next.
Statistics vs. transformations. Some commands compute statistics about the light curve without changing it (-rms, -LS, -BLS, -stats, …). Others modify the in-memory light curve so subsequent commands see a different input (-clip, -fluxtomag, -harmonicfilter, …). A typical pipeline is a few transformations followed by one or more statistics commands.
Two output channels. The statistics table goes to stdout (or to the file given by -redirectstats). On the CLI, the (possibly modified) light curves are not written unless explicitly requested through -o. In pyvartools, the light curve is captured when capture_lc=True, which is the default.
Batch vs. single light curve processing. For the CLI use the -i input flag to process a single light curve, or the -l input flag to process a list of light curves. For pyvartools processing a single light curve produces a Result object, whereas processing a collection of light curves produces a BatchResult object.
pyvartools Result anatomy. A Result has .vars (the statistics row from this run), .lc (the captured light curve, when capture_lc=True), and .files (any auxiliary files a command wrote, e.g. periodograms or model files). BatchResult has the batch-shaped counterparts .vars (DataFrame, one row per LC), .lcs (list of LightCurve), and .files (dict of lists).
Parallel processing. Parallel processing of a batch of light curves can be achieved using the -parallel option for the CLI, or the nthreads keyword for pyvartools.
-readall flag. In most applications each light curve is processed independently by VARTOOLS in its own thread. The -readall option loads every light curve into memory first, then applies each command to the whole collection before moving on to the next. A handful of commands (-SYSREM, -ensemblerescalesig) operate on the full collection of light curves and can only run in this mode.
-oneline flag. Prints output in a human-readable one-statistic-per-line format instead of the default columnar table. Most useful when processing a single light curve or when debugging a pipeline interactively.
The Name column. The first column of every VARTOOLS output row identifies the light curve it came from. It serves as the primary key when joining VARTOOLS output with other catalogs.
Discovering commands. vartools -listcommands lists every command; vartools -help <cmd> prints full syntax for one command; vartools -example <cmd> shows a runnable example. See also Output structure for how statistics columns are named.
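
As an illustration of using the Name column as a join key, the sketch below merges a statistics table with an external catalog using pandas. Both tables, including the Vmag column and all values, are hypothetical stand-ins; with pyvartools the statistics table would be the DataFrame held in BatchResult.vars.

```python
import pandas as pd

# Hypothetical statistics table: one row per light curve.
stats = pd.DataFrame({
    "Name": ["EXAMPLES/1", "EXAMPLES/2", "EXAMPLES/3"],
    "RMS_0": [0.15, 0.05, 0.02],
})

# Hypothetical external catalog keyed on the same names.
catalog = pd.DataFrame({
    "Name": ["EXAMPLES/1", "EXAMPLES/2"],
    "Vmag": [10.1, 10.3],
})

# A left join keeps every light curve; missing catalog entries become NaN.
merged = stats.merge(catalog, on="Name", how="left")
print(merged)
```

The join only works if both tables spell the names identically, so it may be necessary to strip directory prefixes or file extensions first.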
