Getting Started
This page walks through a few simple VARTOOLS examples, showing both the CLI and Python (pyvartools) approaches side by side. All examples assume that the VARTOOLS source directory is the working directory, and that the EXAMPLES/ directory is present.
Installation
If you have not yet installed VARTOOLS, see the Installation page before continuing.
1. A first VARTOOLS run
The simplest example is the calculation of the root-mean-square (RMS) scatter of a light curve.
```bash
vartools -i EXAMPLES/2 -rms -oneline
# Output:
# Name = EXAMPLES/2
# Mean_Mag_0 = 10.12178
# RMS_0 = 0.05012
# ...
```
- `-i EXAMPLES/2`: read a single light curve from the file EXAMPLES/2.
- `-rms`: compute the RMS and mean magnitude.
- `-oneline`: print results in a human-readable format, with one statistic per line, per light curve.
```python
import numpy as np
import pyvartools as vt

result = vt.rms("EXAMPLES/2")
print(result.vars["RMS_0"])
print(result.varobjs.rms.RMS)

# Alternatively, load the light curve into a LightCurve object first,
# then call rms on that object.
lc = vt.LightCurve.from_file("EXAMPLES/2")
result = lc.rms()

# Or, if your light curve is stored in another object, such as a
# 2-D NumPy array, you can pass that instead.
lcarray = np.loadtxt("EXAMPLES/2")
result = vt.rms(lcarray)
```
When calling a VARTOOLS command directly from vt, the first argument can be a string naming the file to process, a VARTOOLS LightCurve object, a NumPy array, a pandas DataFrame, or a tuple of time, magnitude, and error values.
result.vars is a dictionary mapping output column names to their values. Individual statistics are accessible by key.
result.varobjs gives structured access to the same statistics, grouped by the command that produced them (for example, result.varobjs.rms.RMS).
Light curve file format
By default, VARTOOLS expects plain-text files with columns: BJD magnitude magnitude_error. Comment lines beginning with # are ignored. See the CLI reference for other supported formats and FITS input.
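To make the default file layout and the RMS statistic concrete, here is a minimal NumPy sketch that reads a three-column light curve and computes the mean magnitude and the scatter about it. The data are synthetic (not taken from EXAMPLES/2), and the divide-by-N normalization is an assumption; VARTOOLS may normalize differently.

```python
import io

import numpy as np

# Synthetic three-column light curve: time, magnitude, magnitude error.
# These values are made up for illustration only.
text = """\
# BJD magnitude magnitude_error
2450000.1 10.10 0.01
2450000.2 10.20 0.01
2450000.3 10.00 0.01
2450000.4 10.10 0.01
"""

# np.loadtxt skips lines that start with '#' by default.
t, mag, err = np.loadtxt(io.StringIO(text), unpack=True)

mean_mag = mag.mean()
# Scatter about the mean, dividing by N (an assumption, see above).
rms = np.sqrt(np.mean((mag - mean_mag) ** 2))

print(f"Mean_Mag = {mean_mag:.5f}")
print(f"RMS      = {rms:.5f}")
```

Replacing the `io.StringIO` object with a file path reads a light curve from disk in the same way.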
Output column index
The trailing _0 in RMS_0 is the command index: this RMS calculation is the first command in the pipeline (0-based index 0). See Output structure below for full details.
2. Chaining Commands
Commands are processed in sequence, so filtering can be combined with analysis in a single call. Here, sigma clipping is applied before the Lomb-Scargle period search.
- `-clip 5.0 1`: iteratively remove points more than 5σ from the mean; the trailing 1 turns on iteration, so clipping repeats until convergence.
- `-LS 0.5 5.0 0.1 1 0`: Lomb-Scargle search over periods from 0.5 to 5.0 days, with a frequency step of 0.1 times the natural frequency step size (1/T, where T is the total time baseline of the light curve); return the top 1 peak; do not output the periodogram.
```python
import pyvartools as vt

lc = vt.LightCurve.from_file("EXAMPLES/2")
result = lc.clip(sigclip=5.0).LS(0.5, 5.0, 0.1)
best_period = result.varobjs.LS.Period_1
```
Commands in the list are executed left to right. Each command receives the (possibly modified) light curve produced by the previous one.
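As an illustration of what an iterative clipping step does to the light curve before the next command sees it, the following NumPy sketch clips about the mean until no more points are removed. It is not the VARTOOLS implementation; the function name `sigma_clip` and the synthetic data are hypothetical.

```python
import numpy as np


def sigma_clip(mag, nsigma=5.0, iterate=True):
    """Return a boolean mask of the points kept after clipping about the mean.

    Hypothetical helper for illustration; not part of VARTOOLS or pyvartools.
    """
    mag = np.asarray(mag, dtype=float)
    keep = np.ones(mag.size, dtype=bool)
    while True:
        mean = mag[keep].mean()
        std = mag[keep].std()
        new_keep = keep & (np.abs(mag - mean) <= nsigma * std)
        converged = new_keep.sum() == keep.sum()
        keep = new_keep
        # Stop after one pass if iteration is off, or once nothing changes.
        if converged or not iterate:
            return keep


# Synthetic magnitudes: 50 well-behaved points plus one strong outlier.
mag = np.array([10.01, 9.99] * 25 + [12.0])
keep = sigma_clip(mag, nsigma=5.0)
clipped = mag[keep]

print("points before:", mag.size)
print("points after: ", clipped.size)
```

A statistic computed on `clipped` plays the role of a command that runs after the clipping step in the pipeline.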
The trailing _1 in LS.Period_1 indicates that this is the period of the top peak found in the periodogram. If searching for multiple peaks, additional period values are accessible.
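The sub-sampling argument described above sets the frequency step relative to 1/T. The sketch below builds such a grid for a synthetic set of observation times. Only the step size (sub-sample factor divided by T) comes from the text; the helper name `ls_frequency_grid` and the exact layout of the grid are assumptions.

```python
import numpy as np


def ls_frequency_grid(t, min_period, max_period, subsample):
    """Frequencies from 1/max_period to 1/min_period in steps of subsample/T.

    Hypothetical helper for illustration; VARTOOLS may lay out its grid
    differently.
    """
    t = np.asarray(t, dtype=float)
    baseline = t.max() - t.min()   # T, the total time baseline
    df = subsample / baseline      # frequency step
    f_min = 1.0 / max_period
    f_max = 1.0 / min_period
    n = int(np.floor((f_max - f_min) / df)) + 1
    return f_min + df * np.arange(n)


# Synthetic observation times spanning 100 days.
t = np.linspace(0.0, 100.0, 500)
freqs = ls_frequency_grid(t, min_period=0.5, max_period=5.0, subsample=0.1)
periods = 1.0 / freqs

print("frequency step:", freqs[1] - freqs[0])
print("number of trial frequencies:", freqs.size)
```

A smaller sub-sample factor gives a finer grid and therefore more trial frequencies for the same period range.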
3. Batch Processing
VARTOOLS is designed to process large sets of light curves efficiently. Pass a list file with -l on the CLI, or construct a 'Pipeline' and use run_batch / run_filelist in Python.
EXAMPLES/lc_list is a plain-text file containing one light curve path per line. The output contains one line per light curve, with the Name column identifying each file.
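A list file in this format can be produced with a few lines of standard-library Python. The sketch below works in a temporary directory with empty stand-in files, so it does not touch EXAMPLES/; the file name `my_lc_list` is hypothetical.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty stand-in light curve files named 1, 2, 3.
    for name in ("1", "2", "3"):
        open(os.path.join(tmp, name), "w").close()

    # Collect single-digit file names, sorted for a reproducible order.
    paths = sorted(glob.glob(os.path.join(tmp, "[0-9]")))

    # Write one light curve path per line.
    list_path = os.path.join(tmp, "my_lc_list")
    with open(list_path, "w") as fh:
        fh.write("\n".join(paths) + "\n")

    with open(list_path) as fh:
        listed = fh.read().splitlines()

print(len(listed), "paths written")
```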
```python
import glob

import pyvartools as vt
from pyvartools import commands as cmd

lcs = [vt.LightCurve.from_file(f) for f in glob.glob("EXAMPLES/[0-9]")]
pipeline = vt.Pipeline().rms().clip(sigclip=5.0).rms()
batch = pipeline.run_batch(lcs)
print(batch.vars)

# Or process files directly from disk (no Python I/O):
batch = pipeline.run_filelist("EXAMPLES/lc_list")
```
For efficient batch processing in Python, construct a Pipeline object — chain the commands you want to run as method calls on vt.Pipeline() (each returns the pipeline itself, so they read left-to-right), then pass it a list of light curves via run_batch or a list of paths via run_filelist.
run_filelist passes the file list directly to the vartools binary, which can be faster for large sets because Python does not load each light curve into memory.
Parallelism
For very large batches on the CLI, use the -parallel option. The Python run_filelist method accepts an nthreads argument for multiprocessing.
4. Output structure
Column naming convention
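The name of every output statistic produced by VARTOOLS is built from the fields below, joined by underscores, with CmdIndex last (for example, LS_Period_1_0). Not every field appears in every name: RMS_0, for instance, has no PeakNum.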
| Field | Meaning |
|---|---|
| `CommandName` | The command that produced the statistic (e.g. LS, RMS, BLS) |
| `StatName` | The specific quantity (e.g. Period, Power, SDE) |
| `PeakNum` | For commands that return multiple peaks, the 1-based peak number |
| `CmdIndex` | The 0-based index of this command in the pipeline |
Examples:
| Column | Meaning |
|---|---|
| `RMS_0` | RMS from the first -rms command (index 0) |
| `LS_Period_1_0` | Period of the top peak from the first -LS command |
| `LS_Period_1_1` | Period of the top peak from the second command (index 1) |
| `LS_Period_2_0` | Period of the second-best peak from the first -LS command |
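Column names of this shape can be split programmatically. The sketch below treats the trailing integer as the command index and an optional integer before it as the peak number. The helper `split_column` is hypothetical, and the approach is a heuristic: a statistic whose own name ends in an underscore followed by digits would be misread as having a peak number.

```python
import re

# base name, optional peak number, then the command index at the end.
_COLUMN_RE = re.compile(r"^(?P<base>.+?)(?:_(?P<peak>\d+))?_(?P<cmd>\d+)$")


def split_column(name):
    """Split an output column name into base name, peak number, command index.

    Hypothetical helper for illustration. Returns None for names without a
    trailing command index, such as "Name".
    """
    m = _COLUMN_RE.match(name)
    if m is None:
        return None
    peak = m.group("peak")
    return {
        "base": m.group("base"),
        "peak": int(peak) if peak is not None else None,
        "cmd_index": int(m.group("cmd")),
    }


for col in ("RMS_0", "Mean_Mag_0", "LS_Period_1_0", "LS_Period_2_0", "Name"):
    print(col, "->", split_column(col))
```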
Parsing -oneline output
With -oneline, the VARTOOLS CLI prints each output statistic on its own line.
This produces output that can be more human-readable when there are many statistics computed.
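A minimal sketch of parsing this format into a dictionary follows. The sample lines are the ones shown in section 1, without the leading "# " used there for display. It assumes each line has the form "key = value" and that the text covers a single light curve; how consecutive light curves are separated in real output is not handled here. The helper `parse_oneline` is hypothetical.

```python
sample = """\
Name = EXAMPLES/2
Mean_Mag_0 = 10.12178
RMS_0 = 0.05012
"""


def parse_oneline(text):
    """Parse "key = value" lines into a dict (hypothetical helper).

    Numeric values become floats; Name and anything non-numeric stay strings.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank or unrecognized lines
        key, value = key.strip(), value.strip()
        if key == "Name":
            stats[key] = value
            continue
        try:
            stats[key] = float(value)
        except ValueError:
            stats[key] = value
    return stats


stats = parse_oneline(sample)
print(stats)
```

Stripping whitespace around both the key and the value keeps the parser tolerant of any column padding in the real output.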
5. Key Concepts
- Pipeline model. VARTOOLS reads each light curve, applies commands left to right, and emits statistics. The light curve produced by one command becomes the input to the next.
- Statistics vs. transformations. Some commands compute statistics about the light curve without changing it (`-rms`, `-LS`, `-BLS`, `-stats`, …). Others modify the in-memory light curve so subsequent commands see a different input (`-clip`, `-fluxtomag`, `-harmonicfilter`, …). A typical pipeline is a few transformations followed by one or more statistics commands.
- Two output channels. The statistics table goes to stdout (or to the file given by `-redirectstats`). The (possibly modified) light curves are not written unless explicitly requested through `-o` (CLI) or `capture_lc=True` (pyvartools; `capture_lc=True` is the default option in this case).
- Batch vs. single light curve processing. For the CLI, use the `-i` input flag to process a single light curve, or the `-l` input flag to process a list of light curves. For pyvartools, processing a single light curve produces a `Result` object, whereas processing a collection of light curves produces a `BatchResult` object.
- pyvartools `Result` anatomy. A `Result` has `.vars` (the statistics row from this run), `.lc` (the captured light curve, when `capture_lc=True`), and `.files` (any auxiliary files a command wrote, e.g. periodograms or model files). `BatchResult` has the batch-shaped counterparts: `.vars` (DataFrame, one row per LC), `.lcs` (list of `LightCurve`), and `.files` (dict of lists).
- Parallel processing. Parallel processing of a batch of light curves can be achieved using the `-parallel` option for the CLI, or the `nthreads` keyword for pyvartools.
- `-readall` flag. In most applications each light curve is processed independently by VARTOOLS in its own thread. The `-readall` option loads every light curve into memory first, then applies each command to the whole collection before moving on to the next. A handful of commands (`-SYSREM`, `-ensemblerescalesig`) operate on the full collection of light curves and can only run in this mode.
- `-oneline` flag. Prints output in a human-readable one-statistic-per-line format instead of the default columnar table. Most useful when processing a single light curve or when debugging a pipeline interactively.
- The `Name` column. The first column of every VARTOOLS output row identifies the light curve it came from. It serves as the primary key when joining VARTOOLS output with other catalogs.
- Discovering commands. `vartools -listcommands` lists every command; `vartools -help <cmd>` prints full syntax for one command; `vartools -example <cmd>` shows a runnable example. See also Output structure for how statistics columns are named.
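To illustrate the use of the Name column as a join key, the sketch below matches statistics rows against an external catalog using plain dictionaries. All values, and the catalog fields `ra_deg` and `dec_deg`, are made up for illustration; they are not VARTOOLS output.

```python
# Hypothetical statistics rows, e.g. parsed from VARTOOLS output.
stats_rows = [
    {"Name": "EXAMPLES/1", "RMS_0": 0.021},
    {"Name": "EXAMPLES/2", "RMS_0": 0.050},
]

# Hypothetical external catalog keyed by the same light curve name.
catalog = {
    "EXAMPLES/2": {"ra_deg": 150.1, "dec_deg": 2.2},
    "EXAMPLES/3": {"ra_deg": 150.4, "dec_deg": 2.5},
}

# Inner join on Name: keep only light curves present in both sources.
joined = [
    {**row, **catalog[row["Name"]]}
    for row in stats_rows
    if row["Name"] in catalog
]

for row in joined:
    print(row)
```

Because the join key is the path exactly as it was passed to VARTOOLS, the catalog has to use the same path form (relative or absolute) for names to match.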