Contents:
The present document is not meant to be read end-to-end, but rather to be
used as a reference for those studying the web pages to understand the
quality of a given set of SDSS data. We have a summary of the
photometric quality
of each of the hundreds of thousands of 10'x13' image frames in the
SDSS data; this can be found here,
and is described in detail here. To a very large
extent, the rest of this document, and the overall QA, is an expansion
and explanation of this summary quality.
There are two types of quality assurance carried out for each SDSS
imaging run: that produced by each pipeline (astrom, psp, frames,
and nfcalib; see the EDR paper for descriptions of each of these
pipelines, as well as the paper by Pier et al. (2003) for the astrometry) to diagnose immediate
problems with running the pipeline, and summary quality assurance on
the outputs, which is run after the data are completely reduced and
calibrated. (If you are interested in the detailed QA out of the
pipelines, not diagnosed here, you can find links to it at the index of QA
plots.) The
latter, summary quality assurance, comes in two flavors:
The bulk of this document is devoted to a detailed description of the
Single-run QA and Overlap QA web pages. A description may also be
found in Ivezic et
al. 2004. But we start with a
description of overall assessments of SDSS data quality.
Each field has four principal colors determined from the locus of
stars in color-color space; these principal
colors are normalized to average to zero over the survey. The median value
of the principal colors in each camera column gives a measure of the
quality of the photometric calibration of that camera column.
In DR3, there are only ?? runs which are flagged as BAD by these
criteria, as the table at the top of the page shows. There are
separate criteria on the quality of each field
based on the seeing and the quality of the PSF determination, as
explained in detail below; the table also
indicates the fraction of fields in each run of different quality by
these criteria.
Based on these quantities, we assign a field quality for each field
as follows:
Note that it is a bug that only the s color is used in the
above; this will be fixed in a later release.
In DR3, 58% of fields are EXCELLENT, 26% are GOOD, 13% are
ACCEPTABLE, and only 3% are BAD.
Note that the top of the web page indicates the versions of the
astrometric and photometric pipelines used to generate the data that
went into this analysis.
This web page gives statistics of a wide variety of internal
consistency statistics run on the data, with links to specific
postscript figures (and in some cases, tables of numbers) giving the
gory details. The tests carried out reflect specific concerns and
problems we have encountered with the data and pipelines, and
therefore allow us to confirm that, for a given set of data, those
problems do not appear. This will become clear with the detailed descriptions of the
specific tests carried out. An SDSS run is divided into six scanlines
or camera columns, and each camera column is divided into a long
series of slightly overlapping fields of 10'x13'.
Each field, in turn, is made up of five frames, one in each of the
SDSS filters, u,g,r,i,z.
There is a condensed version of the QA for the run described below, but understanding it requires familiarity
with the basic tests of runQA, which we now describe.
The tests that are carried out on each run fall into six broad categories, which
are listed in the table of contents in the beginning of the web page.
Each of these colors has been offset to be zero over the full SDSS
survey. They have been normalized such that the error in the
principal colors is comparable to the error in a single band, assuming
the errors in each band are the same. These colors are measured for
every star brighter than r=19
(which are not flagged as having problematic photometry). The
statistics are plotted in a series of figures, one for each principal
color: the main s color plot, and similarly for w, x, and y.
They show the median and
rms statistics in each bin, as a function of field number, for each
camera column. Here one can see at a glance the extent to which the
stellar locus is in agreement with the SDSS global average (the lines
in the top panel stay close to zero), and whether there are specific
fields or groups of fields that deviate. One can also see whether the
width of the stellar locus stays constant. Also shown are the number
of stars used in each bin for the principal color determination (this
is usually not flat, as the density of stars changes with Galactic
latitude). Finally, the rms statistic in bins does not measure
the number of extreme outliers from the stellar locus, so also shown
is the number of stars more than 2 sigma from the median s, w, x, and
y values, on the blue side and red side, respectively.
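The per-field statistics described above (median, rms, and the counts of stars more than 2 sigma blueward and redward of the median) can be sketched as follows. This is illustrative only, not the actual runQA code; the data are synthetic.

```python
import numpy as np

def principal_color_stats(field, pc, nsig=2.0):
    """Per-field median and rms of a principal color, plus the number of
    stars more than nsig*rms blueward and redward of the median."""
    out = {}
    for f in np.unique(field):
        vals = pc[field == f]
        med, rms = np.median(vals), np.std(vals)
        blue = int(np.sum(vals < med - nsig * rms))   # blue-side outliers
        red = int(np.sum(vals > med + nsig * rms))    # red-side outliers
        out[int(f)] = (med, rms, blue, red, len(vals))
    return out

# Toy s-color measurements for 50 stars in each of three fields
rng = np.random.default_rng(0)
field = np.repeat([1, 2, 3], 50)
pc = rng.normal(0.0, 0.01, size=150)
stats = principal_color_stats(field, pc)
```

The number of stars per bin varies with Galactic latitude in real data, which is why the runQA plots also show the bin counts.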
In addition, these statistics are all
made available in the Field Quality Table, and also in links off
the flat-field error page linked at the bottom of the runQA page, and
are summarized in a series of tables that look like this:
Color-color and color-magnitude diagrams are given in links to each camera column,
showing exactly the stars that go into the definition of each primary
color.
The quantities listed, for each camera column, include:
Too large a value of any of these quantities indicates a problem. If
any of these values is above a threshold, as indicated, it is shown in
red, so problems can be recognized at a
glance. If a problem is indicated, often the best way to proceed is
to look at the principal color plots. For example, in run 1478, the
Median x color is about -0.013 (i.e., 13 millimags) in most columns, a
bit above the nominal threshold of 0.01. The x
color plot does indeed show an offset in the x color of about 1%,
which seems to be a real calibration problem.
Thus the variation of the sky brightness with time has been taken out;
these curves should be quite flat (as indeed they are). In addition,
the median of each of these curves has also been taken out, so
each should hug zero.
Summary statistics are then given of the quantities in that figure,
including medians (i.e., the quantity taken out of the graphs
above) in the table Phot. Zeropoint Corr. for
Flat Sky, rms (Sky: RMS), and max
deviation (Sky: Max Dev) statistics. Note that each
of these tables links to the same figure for each of the camera
columns.
As this example shows, this is a very powerful statistic, and checks
for consistency of the internal photometric calibration and the
flat-fields at the level of much better
than a percent. It is limited, however, by the presence of scattered
light, especially in the u band. An interesting scientific question
is whether the sky
brightness is expected to differ at this level over the 2.5 degree
field of the camera.
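The underlying operation is simple: the across-column median of the sky level (the time variation) is removed, and then each column's own median is removed as well, so flat curves hug zero. A toy version (the real code differs in detail):

```python
import numpy as np

def sky_deviations(sky):
    """sky: (n_fields, 6) sky brightness in mag/arcsec^2 for the six
    camera columns. Remove the across-column median (time variation),
    then each column's own median, so flat curves hug zero."""
    dev = sky - np.median(sky, axis=1, keepdims=True)
    dev -= np.median(dev, axis=0, keepdims=True)
    return dev

# Toy run: a common drift in sky brightness plus small per-column offsets
n_fields = 100
t = np.linspace(0.0, 1.0, n_fields)[:, None]
offsets = np.array([0.0, 0.01, -0.02, 0.005, 0.0, 0.015])
dev = sky_deviations(21.0 + 0.3 * t + offsets)
```

Because both the trend and the offsets are removed, the toy deviations here are identically zero; in real data the residuals measure the relative calibration of the columns.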
In particular, the PSF magnitudes of stars (i.e., that determined from
a direct fit of the PSF model to the brightness, aperture corrected to
a large aperture; see the EDR paper) should agree with the
large-aperture (here, a diameter of 7.5 arcsec) magnitude itself.
This comparison can be carried out only for bright stars, for which
the noise from the sky is negligible. This is carried out for each
column of data for each filter, and plots (click here
for an example) are given showing the
difference between aperture and PSF magnitude (for objects believed to
be stars, brighter than 19th magnitude, and with estimated PSF
error less than 0.05, in each band considered). Each
star is shown as a point, with a running median per field shown as a
red line.
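The running median shown as the red line amounts to the following sketch (variable names and selection are illustrative, not the actual pipeline code):

```python
import numpy as np

def psf_minus_aper_by_field(field, m_psf, m_aper):
    """Per-field median of m(aper) - m(psf); this should hug zero if the
    PSF model and aperture corrections are consistent."""
    diff = m_aper - m_psf
    fields = np.unique(field)
    med = np.array([np.median(diff[field == f]) for f in fields])
    return fields, med

# Toy sample of bright stars over 10 fields: 1% scatter, no offset
rng = np.random.default_rng(1)
field = rng.integers(0, 10, size=500)
m_psf = rng.uniform(15.0, 19.0, size=500)
m_aper = m_psf + rng.normal(0.0, 0.01, size=500)
fields, med = psf_minus_aper_by_field(field, m_psf, m_aper)
```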
The Postage Stamp Pipeline, which determines the PSF in the first
place, indicates for each field the quality of its determination of
the PSF. There are certainly fields in which the PSF is known to be
poorly determined (usually in regions where the PSF is changing
rapidly with time; note the bottom panels give the r-band seeing and
its derivative); changes in seeing of 0.2 arcsec/field are definitely
bad! These fields are flagged as having suspect PSF, and are
indicated as such with cyan or magenta lines in the plot. These fields
are not included in the summary statistics now described.
These summary statistics consist of:
In run 1478, the median PSF-aperture magnitude hugs zero impressively
well, with essentially no offset, and an rms scatter of 0.01 mag in u
and z, and substantially better in g, r, and i. Two chips (out of
30!), the u-band in columns 2 and 5, are flagged red in their median
offset. As an aperture correction is made to make the PSF and
aperture magnitudes agree in the mean, this test is more a consistency
check than anything else.
In addition to the PSF, every object is fit to exponential or de
Vaucouleurs profile. The better-fitting of these two yields a
so-called `model magnitude'. This quantity is aperture-corrected to
force the model and PSF magnitudes of stars to agree. Of course, for
stars, the model scale sizes are close to zero; this now works
properly with the latest version of the photometric pipeline, as
described in detail in the SDSS
DR2 paper. The consistency of model and PSF magnitudes for stars
is tested
in the next series of plots and tables, m(mod)-m(psf): Max Dev/Med Off/RMS
.
In particular, the plots
under band-to-band astrometric accuracy show the measured
distribution of this effective positional uncertainty (in the row and
column directions), shown both linearly (upper panels) and
logarithmically (middle panels; note the change in scale on the
x-axis). The median and sigma (as measured from the interquartile
range) are given for each panel. Note the pesky periodic peaks in the
upper histograms; those are an artifact of round-off errors in the way
the data are stored.
The distribution is roughly Gaussian. One wants to look at outliers
in this plot, for example to look for asteroids in the outer solar
system. The T3 and T5
statistics shown in the figure give the number of 3-sigma and 5-sigma
outliers relative to the number expected in a purely Gaussian
distribution, respectively. T3 is typically 5-10, while T5 is
meaningless for all but the very longest runs. But note that the
distribution cuts off completely at +/- 100 milliarcsec; the
internal astrometry is very clean!
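The T3 and T5 statistics can be computed as in this sketch, where the expected Gaussian count uses the two-sided tail probability (the function name and data are illustrative):

```python
import math
import numpy as np

def outlier_ratio(x, nsigma):
    """Number of |x - mean| > nsigma*sigma outliers, divided by the
    number expected for a pure Gaussian (two-sided tail)."""
    x = np.asarray(x, dtype=float)
    sigma = np.std(x)
    observed = np.sum(np.abs(x - np.mean(x)) > nsigma * sigma)
    # erfc(n/sqrt(2)) = two-sided probability of a > n-sigma deviation
    expected = len(x) * math.erfc(nsigma / math.sqrt(2.0))
    return observed / expected

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=100_000)
t3 = outlier_ratio(x, 3.0)   # ~1 for a pure Gaussian; 5-10 is typical in the data
```

For a 5-sigma cut the Gaussian expectation is under one object per typical run, which is why T5 is meaningful only for the longest runs.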
If the astrometric errors are estimated correctly, the distribution of
the measured errors divided by the errors estimated by
the imaging pipelines will
be a Gaussian with variance unity. This is tested in the lower panel,
Relative Astrometry by Field
(for blue and red stars separately). The distributions are not too
far off from Gaussian, and indeed have a variance significantly less
than unity, implying that the estimated astrometric errors are biased
somewhat high.
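The test amounts to checking the variance of the "pull" distribution; a minimal sketch, with numbers chosen to mimic errors estimated somewhat high:

```python
import numpy as np

def pull_variance(measured, estimated_err):
    """Variance of measured/estimated offsets: ~1 if the pipeline errors
    are right, < 1 if they are biased high (overestimated)."""
    return float(np.var(measured / estimated_err))

# Simulate errors estimated 20% too large: true scatter 25 mas,
# pipeline estimate 30 mas
rng = np.random.default_rng(3)
estimated = np.full(50_000, 30.0)
measured = rng.normal(0.0, 25.0, size=50_000)
v = pull_variance(measured, estimated)
```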
All these statistics are summarized in the table shown, for row
and column statistics separately. All quantities are shown in units
of milliarcseconds (except for the chi-squared statistics, which of
course are unitless).
One can then study these statistics as a function of field. The
plot
(here for column 1) shows the internal astrometric offsets for each
star brighter than r=19, together with median per field. The
statistics for these quantities (median over the fields, maximum, and
rms) are listed in the table. The maximum over a field is a few tens
of milliarcsec, with an rms of only a few milliarcsec.
The SDSS images are from a drift-scan, thus the flat field is a
one-dimensional vector for each chip, as a function of pixel column.
If this flat field is in error, the error will propagate into all
derived quantities, especially the photometry. The Principal Colors,
described above, can be used to check the flat
fields. The stellar principal colors are determined for each camera
column as a function of pixel column (in bins of 32 pixels).
Systematics as a function of pixel column are an indication of
flat-field problems. There are four principal colors, and five
flat-fields; the equations are closed by including constraints from
the PT. The results of solving for the
possible corrections in each of the u, g, r, i, and z flat-fields are
shown in a series of plots (such as this
one). Again, these tend to be noisy for a single run, especially
in u (where we've been having the most trouble with the flats). The
largest problems in the flats tend to be at their edges. This
information is used a posteriori to improve the SDSS flat-fields
themselves.
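The binning step can be sketched as follows (32-pixel bins, as above; the injected linear tilt stands in for a flat-field error, and the noise level is illustrative):

```python
import numpy as np

def color_vs_pixel_column(colc, s, ncols=2048, binsize=32):
    """Median principal color in bins of `binsize` pixel columns;
    a trend with column indicates a flat-field error."""
    edges = np.arange(0, ncols + 1, binsize)
    idx = np.digitize(colc, edges) - 1
    meds = np.array([np.median(s[idx == i]) for i in range(len(edges) - 1)])
    return edges[:-1] + binsize / 2.0, meds

# Inject a linear tilt in s with pixel column, standing in for a
# flat-field error, on top of 0.02 mag of per-star noise
rng = np.random.default_rng(4)
colc = rng.uniform(0, 2048, size=20_000)
s = rng.normal(0.0, 0.02, size=20_000) + 0.03 * (colc / 2048.0 - 0.5)
centers, meds = color_vs_pixel_column(colc, s)
```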
Note that the links off these tables are dummies.
Following that is a table with mean statistics over the run of various
additional quantities associated with the PSF determination and the
relative astrometry between bands. These of course are taken simply
from the detailed QA we've already seen.
For each such pair of runs, there are in principle a total of 6x6 = 36
pairs of overlapping scanlines. One gets this maximum number in the
extreme case of one run perpendicular to another (only done under
special circumstances); in the more common case of two strips of a
stripe, one will get 11 overlaps (see, for example, the matching of
2190 with 1478). The various tables described below have an entry for
each of these scanline overlaps. The quantities in tables are as
follows:
The first table on the QA page summarizes the number tabulated in the
first column of plots, namely the mean offset. There are two entries
highlighted as worrisome here, namely the u-band offsets in the
col3-col2 overlap and the col6-col6 overlap. In this case, these
discrepancies are indeed substantially larger than we would like. The
overlap here is of course using stars at the edges of the chips, which
tend to be the ones most affected by problems in the flat fields
(which is particularly difficult to tie down in u) and in the PSF
(also difficult in u, given the paucity of relevant stars).
The next question to ask is whether these offsets are uniform as a
function of time (i.e., along the run). The next table and series of plots give statistics
on the offset as a function of field number. Look for example at
this
plot; for each band, each point is a star as a function of the
field number of run 1 (here 2190), and the line is the median on
4-field bins, with summary statistics (median, sigma, and maximum
offset) listed on the right. The offset tabulated in the u-band is
definitely real, although it is based on a relatively small number of
stars. Note the bottom panel, which shows the r-band PSF width for
the two runs in question; photometric excursions are occasionally
associated with regions of poor seeing (or more accurately, high
seeing derivative, where the PSF becomes difficult to follow).
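The four-field binned medians and their summary statistics, used here and in the several tables that follow, amount to the following (a sketch with synthetic data, not the actual overlap-QA code):

```python
import numpy as np

def binned_offsets(field, dmag, binwidth=4):
    """Median magnitude offset in bins of `binwidth` fields, plus the
    summary statistics quoted in the plots (median, sigma, max offset)."""
    bins = field // binwidth
    meds = np.array([np.median(dmag[bins == b]) for b in np.unique(bins)])
    return meds, float(np.median(meds)), float(np.std(meds)), float(np.max(np.abs(meds)))

# Toy overlap: 40 fields, 20 matched stars per field, with a constant
# 0.02 mag calibration offset between the two runs
rng = np.random.default_rng(5)
field = np.repeat(np.arange(40), 20)
dmag = rng.normal(0.02, 0.03, size=field.size)
meds, med, sig, mx = binned_offsets(field, dmag)
```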
The table labelled Photometric Zeropoint - Max Dev. vs. Field
tabulates the maximum offset between the four-field-binned offset as
shown in the line in the plot, in each band, and in each overlap.
These statistics have been derived excluding fields with known
problems (most commonly, problems in determining the PSF), but a
smattering of red values is not terribly uncommon, and may represent
problems confined to a frame or a few frames. More important is the
mean offset in the overlap, which was included in the first table on
this web page.
The next table gives the rms value in the four-field-binned offsets;
it is the scatter around the mean value. In this example, there is
not a single red entry; the offsets that do exist are all pretty
constant all the way across (less than 2%). Note that the
corresponding figures are the same as in the previous table.
The quantities we've just looked at have all been based on PSF
magnitudes, and thus test the modelling of the PSF, as well as
flat-field and calibration problems. To try to isolate the latter
effects, we now look at 7'' radius aperture magnitudes, in
particular, the offset as a function of field number. The plots are
exactly analogous to those we just discussed for PSF magnitudes. This
comparison is
restricted to bright stars, and is somewhat noisy; it is only useful
for looking for gross errors. Note that only the sigma of the
4-field-binned offset is included in the table
Median Scatter with Aperture magnitudes; all relevant quantities are
of course included in the figures.
As hinted at above, errors in the flat-field can give rise to offsets
in the photometry in adjacent runs. One can explore this by examining
the offset in PSF magnitudes as a function of CCD column.
This is the quantity explored in the table headed Flatfield test: Max Dev. vs. Colc. The figures
show this for each star, and binned in 64 pixel bins. The serious
discrepancies are mostly in the u band (which again is most difficult
to flat-field). In practice, most adjacent runs don't overlap very
much, so this isn't as informative as one might hope...
Some of the errors in magnitudes that are explored in the overlap
tests thus far may be correlated between bands. An interesting
question is how colors are affected. Model magnitudes in
particular are designed specifically as an optimal measure of color;
this is explored in the tables entitled Color
Scatter: Stars - mod, RMS and Color
Scatter: Galaxies - mod, RMS, for stars and galaxies
respectively. The plots show the familiar differences as a function
of field, for u-g, g-r, r-i, i-z, and g-i, and the tables show the RMS
values over all objects in each case (the figures further tabulate
statistics on the binned quantities with field). Note that there are
several entries of exactly zero in u-g, which are caused by the
absence of any galaxies bright enough in u and g to carry out the
comparison. There are red values in i-z in all overlaps in this case;
they show a somewhat larger scatter than the 0.04 magnitude threshold
here; we'll see in a moment that this threshold is probably set too low. Note that the binned quantities
are quite a bit better behaved.
Next follow further statistics on the repeatability of photometry in
various quantities. We had already seen plots of the repeatability of
PSF magnitudes; the rms values coming from those plots are tabulated
in the table Photom. Acc. for Stars (m(psf)<19)
. One can make the same comparison for model magnitudes of
stars (Photom. Acc. for Stars (m(model)<19)), and
aperture magnitudes of stars
(Photom. Acc. for Stars (m(apt)<19)), Petrosian
magnitudes of galaxies (Photom. Acc. for Galaxies (m(petro)<19)), and model
magnitudes for galaxies (Photom. Acc. for Galaxies (m(model)<19)). In the
last of these, note that the intrinsic scatter of i and z model
magnitudes is quite large (and consistent within several tens of
percent with the estimated errors), thus the red entries in i-z for
the model colors are probably using an unrealistically small
threshold.
We have seen that we can test the estimate of the errors out of the
pipeline with chi^2 statistics. The results are tabulated for
PSF magnitudes of all stars (delmag/err (stars, psf mag)
), PSF magnitudes of stars brighter than 19th magnitude (delmag/err (stars, m(psf)<19)), model magnitudes
of stars (delmag/err (stars, mod mag)), and Petrosian
magnitudes of galaxies (delmag/err (galaxies, pet mag)). For the PSF and
Petrosian magnitudes, the errors are seen to be correct to better than
20% (much better in some cases); the model magnitude errors are
somewhat worse.
We now look at comparison of astrometry in the overlaps. The Overall astrometry table lists statistics along
(mu) and perpendicular (nu) to the scan direction. The plots show
the distribution of differences of positions of stars (black) and
galaxies (blue), (both brighter than 20th magnitude in r) with median values and
effective sigmas (divided by root two) written on the side.
The width of the stellar distribution is smaller, not surprisingly,
and is of the order of 40 milli-arcsec in each coordinate. These
values (the median offset, and the widths for galaxies and stars), are
given separately for mu and nu in the tables.
The next table, Astrometry by field, looks
for trends in the astrometric offsets with field number. The plots
show the difference in mu and nu positions of stars brighter than
r=20 as a function of field number, together with the four-field
binned median. Statistics are given on this running median: maximum offset,
sigma, and overall median, all in units of arcseconds; these are
included in the table. Structure in this figure is believed to be
due largely to anomalous refraction, which is coherent across the
camera. For those runs reduced against the UCAC astrometric catalog,
the density of astrometric standards on the sky is enough to take out
most of this structure, and the worst deviations on a four-field
median are still no worse than 100 milli-arcseconds. The figures
also give the seeing in the two runs for comparison, as well as the
numbers of stars used in the comparison.
A final comparison is listed under the table labelled Miscellaneous, and includes a comparison of
star-galaxy separation in the overlap. The figures (the lower two
panels of the figures showing the astrometric scatter) show on a
log-scale, the numbers of objects whose classification (star/galaxy)
is consistent and inconsistent in the overlap. The right-hand panel
shows on a linear scale the relative number of stars, galaxies, and
mismatches; one can read from this the r magnitude, e.g., where 90% of
the objects are classified correctly. This quantity is shown in the
figure, and tabulated in the table, together with the 95%
classification level as well.
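The way the 90% level can be read off is sketched below; the binning and the toy failure model are illustrative, not the actual runQA procedure:

```python
import numpy as np

def consistency_magnitude(r_mag, agree, level=0.90, binwidth=0.5):
    """Faintest magnitude bin edge at which the fraction of objects whose
    star/galaxy classification agrees between the two runs is >= level.
    (A sketch: the real code may interpolate rather than take bin edges.)"""
    edges = np.arange(14.0, 22.0 + binwidth, binwidth)
    idx = np.digitize(r_mag, edges) - 1
    faintest = None
    for i in range(len(edges) - 1):
        sel = idx == i
        if np.any(sel) and np.mean(agree[sel]) >= level:
            faintest = float(edges[i + 1])
    return faintest

# Toy classifier: always right for bright objects, degrading faintward
rng = np.random.default_rng(6)
r = rng.uniform(14, 22, size=20_000)
p_agree = np.clip(1.0 - 0.12 * np.maximum(r - 19.0, 0.0), 0.0, 1.0)
agree = rng.random(r.size) < p_agree
m90 = consistency_magnitude(r, agree)
```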
The quantity called completeness looks for the magnitude at which the
total counts appear to turn over. This is a bit misleading, because
the right way to calculate completeness is to compare with a deeper
imaging catalog. This shows a completeness substantially fainter than
the numbers shown here, of order 22.5 for the magnitude in which 50%
of the objects are recovered.
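A toy illustration of the turnover measure, and of why it is only a proxy: in this simulated catalog the counts turn over well before the true 50% detection limit. The efficiency model and numbers are invented for illustration.

```python
import numpy as np

def turnover_magnitude(mags, binwidth=0.25):
    """Magnitude at which the differential number counts peak; a rough
    proxy for depth, not a true completeness measurement."""
    edges = np.arange(np.floor(mags.min()),
                      np.ceil(mags.max()) + binwidth, binwidth)
    counts, _ = np.histogram(mags, bins=edges)
    i = int(np.argmax(counts))
    return 0.5 * (edges[i] + edges[i + 1])

# Toy catalog: counts rising toward faint magnitudes, times a detection
# efficiency that falls to 50% at m = 21.5
rng = np.random.default_rng(7)
m = 16.0 + 7.0 * np.sqrt(rng.random(300_000))
detected = rng.random(m.size) < 1.0 / (1.0 + np.exp((m - 21.5) / 0.3))
t = turnover_magnitude(m[detected])
```

Here the counts peak several tenths of a magnitude brighter than the 50% detection limit, just as the text cautions.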
The final column of the table is a measure of the slope of the galaxy
counts; it is the ratio of the number of galaxies between 19-20, and
between 18-20, in r. This should be between 0.7 and 0.9, but will be
affected by large-scale structure.
Finally, note a link at the bottom to a summary
table, which turns each of the myriad tables described above into
a single number per band. This does allow one to see at a glance the
qualitative overall sense of the comparison. We recommend that this
be used only qualitatively; to get a real feel for what's going on,
one needs to look at the full tables, and figures where appropriate.
Overview of Quality Assurance
In order to assess the overall quality of the SDSS imaging data, each
SDSS imaging run goes through a lengthy series of tests to confirm its
overall photometric and astrometric quality. This document describes
these tests in detail, in particular the detailed web pages that are
made summarizing the results. These quality assurance tests have
developed over the course of the survey, and have several goals:
Summary QA quantities
The SDSS imaging data are taken as a series of runs (typically
indicated by a four-digit number). Each run is made up of six
parallel scanlines, and each scanline is divided up further
into 10'x13' frames. Global photometric quality assurance is
done both on individual frames, and on each run. These global
quantities are based on the detailed quality assurance tests carried
out in Single Run QA.
Summary Run quality assurance
This is summarized in the third table in
a summary of the SDSS run quality assurance results. This
includes links to the single-run QA for each of the imaging runs.
The following is a description of the method used to assign an overall
quality to the run.
Summary Field quality assurance
At a finer level, the quality of each of the hundreds of thousands of
fields in the SDSS imaging data is assessed, and is summarized in a very
large ascii file. The following description of the assignment of
quality flag is taken straight from the SDSS DR3 paper.
This flag is based on five attributes:
Single-run Quality Assurance
Single-run QA is the first place to look to understand in detail the
quality for a given run. This explanation of the web page will use an
example run (chosen more or less at random), run 1478,
rerun 40, which covers a part of strip 12 S. It can be reached from
the summary
web site; clicking on the run number gets us to
http://www-sdss.fnal.gov/sdssdp/opdb/1478/40/qa/all-runQA-1478-40.html.
Determining the overall quality of each field
The table lists the fractions of fields in each camera column with
each quality classification as defined above. In the run in question, the vast
majority of
fields are good or excellent. Tables of relevant numbers are given
for each camera column in links in the first column of the table, and
(in even more excruciating detail) in the link labelled Here is the QAwizhard's Field Quality Table.
The Principal Colors of the Stellar Locus
The distribution of ordinary stars in SDSS color-color space follows a
tight locus with distinctive features that can be used as measures of
the photometric calibration. In particular, one can use various
regions of these stellar loci that appear straight to define a series
of principal colors, whose median values should be essentially
constant over the survey (although they are weakly
metallicity-dependent). See Helmi et al. (2003, ApJ, 586, 195) for
an initial description of these principal colors. These colors are
defined after correcting for foreground
reddening a la Schlegel, Finkbeiner, and Davis (1998); to a very good
approximation, at the flux limits of the SDSS, all stars are beyond
the absorbing dust. There are four such
principal colors defined:
The x color

Quantity:   Median   Max|PC|  Rms    Width  MaxWidth  Nall    NblueTail  NredTail
col. 1      -0.013*  0.026    0.009  0.041  1.28      30      1          0
col. 2      -0.008   0.028    0.011  0.042  1.30      28      1          0
col. 3      -0.015*  0.039    0.012  0.041  1.37      30      1          0
col. 4      -0.013*  0.032    0.011  0.041  1.41      28      1          0
col. 5      -0.014*  0.024    0.012  0.042  1.33      29      1          1
col. 6      -0.012*  0.017    0.008  0.040  1.38      30      1          0
mean value  -0.012*  0.028    0.011  0.041  1.345     29.167  1.000      0.167

*Red entries are triggered by
 entry <    -0.01    0        0.0    0.0    0         0       0          0
 entry >     0.01    0.04     0.02   0.08   1.75      100     30         30
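For concreteness, the s color (the first of the four) is, as quoted in Ivezic et al. (2004), s = -0.249u + 0.794g - 0.555r + 0.234; treat the numbers here as illustrative, since the runQA outputs are the authoritative values. Note that the quadrature sum of the coefficients is very nearly 1, which is exactly the normalization property described above:

```python
import numpy as np

# Coefficients of the s principal color as quoted in Ivezic et al. (2004);
# illustrative here; the runQA outputs are the authoritative values.
CS = np.array([-0.249, 0.794, -0.555])   # multiplies (u, g, r)
S0 = 0.234

def s_color(u, g, r):
    """The s principal color for (dereddened) u, g, r magnitudes."""
    return CS[0] * u + CS[1] * g + CS[2] * r + S0

# Normalization: if each band has error sigma, the error in s is sigma
# times the quadrature sum of the coefficients, which is ~1
norm = float(np.sqrt(np.sum(CS ** 2)))
```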
Background (Sky) Brightness Analysis
Another check of the uniformity of the photometric calibration, from
one camera column to another, is the uniformity of the sky brightness.
The sky brightness, measured on a frame-by-frame basis, changes with
time; here we calculate for each filter a
running median as a function of field over the six camera columns.
The plot http://www-sdss.fnal.gov/sdssdp/opdb/1478/40/qa/runQA-1478-40-sky.ps
shows the deviations of each camera column from this median, in
units of magnitudes per square arcsecond ??.
Note that the bandpasses of the z chips differ slightly, one
from another; this effect is corrected for before taking the medians.
PSF Photometry
Errors in the determination of the Point Spread Function (PSF) have
been one of the most pernicious in the development of the photometric
pipeline. The following tests allow us to determine how well the PSF
is calculated, by comparing different measures of the brightnesses of
stars.
Internal Tests of Astrometry
The astrometric calibration is applied to the r-band, and then
propagated to each of the other bands in turn. One can ask, on a
star-by-star basis, whether the calibrated positions are consistent.
This is shown in the last two sets of plots and tables. The
photometric pipeline measures a "velocity" for each object. This is
done with asteroids in mind: a main-belt asteroid has an apparent
proper motion of a few arcseconds over the five-minute interval
between the r-band image and the g-band image. Thus the astrometric
position of the asteroid varies linearly with filter (in the order,
riuzg). The pipeline fits a line to the positions of
every object; for all real stars, this line has a slope
consistent with zero. The scatter in this slope can be expressed as a
relative positional uncertainty between bands in arcseconds; these
tables collect the statistics for bright stars (r<19.0).
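The slope fit can be sketched as follows; the exposure times are in units of the inter-filter interval, and the positional numbers are invented for illustration:

```python
import numpy as np

# Filters in the order in which they cross the sky (r first, g last),
# with exposure times in units of the inter-filter interval
FILTER_ORDER = ["r", "i", "u", "z", "g"]
T = np.arange(5, dtype=float)

def position_slope(positions):
    """Slope of a linear fit of position (arcsec) vs. exposure order:
    consistent with zero for stars, nonzero for moving objects."""
    slope, _ = np.polyfit(T, positions, 1)
    return slope

jitter = np.array([0.01, -0.02, 0.015, 0.0, -0.005])  # astrometric noise
star = jitter                        # no real motion
asteroid = 0.5 * T + jitter          # 0.5 arcsec per filter interval
```

The scatter in the fitted slope, converted back to arcseconds, is the effective band-to-band positional uncertainty tabulated for bright stars.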
Checks of the flat fields
Note that the results of these tests are given on a separate
web page, as they are less useful on a run-by-run basis. Also given
on this web page are links to the data files containing detailed
information on the principal colors.
Summary Table
At both the top and the bottom of the runQA page, there is a link to a
summary
table, which condenses all this information further. The first
table is a repeat of the
Phot. Zeropoint Corr. for Flat Sky, giving estimates from the sky levels
of any photometric offsets of each chip. A second such estimate is
given from the principal colors in the table
entitled Zeropoint Corrections to Fix Principal
Colors. Here, the four principal colors for each chip
(median'ed over the run) are solved for the five passbands, under the
(arbitrary) assumption that the offsets in g, r, and i (the three
highest S/N bands) add to zero. Remember, the principal colors are
defined to have zero mean over the full survey, and are normalized to
have errors comparable to the error in a single band. Because the
u-band enters into the principal colors only through s, where it has a
coefficient of 1/4, this process tends to "push" the offsets to u;
there will often be offsets in u that are somewhat above spec as a
consequence.
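Schematically, the solve is a small constrained least-squares problem: four principal-color equations plus the g+r+i=0 constraint close the system for the five per-band zeropoints. The coefficient matrix below is a placeholder with roughly the right structure, NOT the actual principal-color definitions:

```python
import numpy as np

# Placeholder principal-color coefficient matrix (rows s, w, x, y;
# columns u, g, r, i, z). These numbers are stand-ins, not the real
# definitions, which live in the runQA code / Ivezic et al. (2004).
A = np.array([
    [-0.25,  0.79, -0.55,  0.00,  0.00],   # s: u, g, r
    [ 0.00, -0.23,  0.79, -0.57,  0.00],   # w: g, r, i
    [ 0.00,  0.71, -0.71,  0.00,  0.00],   # x: g, r
    [ 0.00,  0.00, -0.27,  0.80, -0.53],   # y: r, i, z
])
constraint = np.array([0.0, 1.0, 1.0, 1.0, 0.0])  # g + r + i offsets sum to 0

def solve_zeropoints(pc_offsets):
    """Solve the 4 median principal-color offsets plus the g+r+i=0
    constraint for the 5 per-band zeropoint corrections."""
    M = np.vstack([A, constraint])
    b = np.append(np.asarray(pc_offsets, dtype=float), 0.0)
    dz, *_ = np.linalg.lstsq(M, b, rcond=None)
    return dz

# A pure u-band zeropoint error appears only in s, and the solve
# pushes the whole offset back onto u, as the text describes
dz = solve_zeropoints(A @ np.array([0.04, 0.0, 0.0, 0.0, 0.0]))
```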
Overlap QA
There is another summary of QA web pages, called the index of QA
plots. It includes links to the single-run QA, as well as the QA
for the various pipelines. Here we will use it to look at the
overlap QA.
Any given run is likely to overlap adjacent
runs (the paired strip of the stripe, the adjacent stripe, and/or
multiple runs covering a given area of sky, as happens in the Southern
stripe). For run 1478, there are two overlapping runs: 2190
and
2247.
Comparing results in the overlap allows us to make a direct
comparison of the photometric outputs, including the errors.
Comparison of PSF magnitudes
There is a series of statistics calculated on the distribution of
differences of PSF magnitudes of stars calculated on the pair of
runs. Bringing up any of the plots clickable on the first column
tells the full story. For example, look at the plot here.
There are five rows of plots, for u, g, r, i, and z respectively. In
each row, there are four plots. They are:
Michael Strauss, August 27, 2004, revised December 31, 2004