APOGEE Radial Velocities

Calculation of Radial Velocities

The APOGEE radial velocities (RV) are derived in several steps:

  1. As each visit is reduced, an RV estimate is determined by cross-correlating the visit spectrum against a grid of synthetic spectra. This provides an “estimated RV” for the visit, which is stored in the apVisit files. These velocities are currently not used in subsequent reduction steps, but are saved (in fields named ESTVHELIO, etc.) for potential future use.
  2. Radial velocities for each visit are re-derived when the visit spectra are combined. This is done in three steps:
    1. For each visit, a relative radial velocity is iteratively calculated using the combined spectrum as the spectral template.
    2. An absolute radial velocity is calculated by comparing the combined spectrum against a grid of synthetic spectra spanning a large range of stellar parameters.
    3. The relative radial velocities for each visit and the absolute radial velocity are then used to calculate absolute velocities for all visit spectra.

This scheme was employed because RVs derived using a template made from the combined spectrum (i.e., of the star itself) should be more precise than RVs derived from a grid of synthetic spectra, none of which match the observed spectra perfectly. This procedure allows us to create a high-quality combined spectrum without knowing what type of object we are dealing with. However, the absolute RV is a critical science product, and the final combined spectrum must be on the rest wavelength scale so that it can be properly compared to the large grid of synthetic spectra in the abundance pipeline (ASPCAP) and used for kinematic studies. This requires deriving the absolute radial velocity of the combined spectrum against a grid of synthetic spectra (the “RV mini-grid”).
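As a rough illustration of the last step in the list above, the absolute velocity of each visit can be thought of as its relative velocity plus the absolute velocity of the combined spectrum. The function name and the numbers below are hypothetical, and simple addition is an assumption that is adequate only for velocities much smaller than the speed of light; it is not taken from the pipeline code.

```python
def absolute_visit_rvs(relative_rvs_kms, absolute_rv_kms):
    """Add the combined-spectrum absolute RV to each visit's relative RV (km/s).

    Assumption: velocities are small compared with c, so they add linearly.
    """
    return [v_rel + absolute_rv_kms for v_rel in relative_rvs_kms]


# Hypothetical numbers for a star with three visits.
relative = [-0.12, 0.03, 0.09]   # km/s, each visit vs. the combined spectrum
absolute = 45.60                 # km/s, combined spectrum vs. synthetic template
visit_rvs = absolute_visit_rvs(relative, absolute)
```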

Preparing the Spectra

The spectra are “prepared” for cross-correlation by applying the following processes:

  1. Pixel masking. Pixels marked as “bad” in the mask array (usually because of bad pixels on the detector array) and pixels that have sky lines in the sky array are masked out for the rest of the RV determination.
  2. Continuum normalization. Each of the three chip spectra is normalized separately. The chip spectrum is separated into 40 chunks (covering approximately 14 ångstroms each) and the 95th percentile value is calculated for each chunk. A robust third-order polynomial is then fit to the chunk 95th percentile values. Finally, the spectrum is normalized (divided) by the polynomial fit.

The RV template spectra (observed combined or synthetic spectrum) are prepared in the same way as each of the visit spectra.
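The two preparation steps can be sketched for a single chip spectrum as follows. This is a minimal sketch under stated assumptions, not the pipeline code: the function name `normalize_chip` and its arguments are hypothetical, the chunks are defined in pixels rather than ångstroms, and the "robust" fit is approximated by an ordinary least-squares polynomial with a few rounds of 3-sigma clipping.

```python
import numpy as np


def normalize_chip(flux, bad=None, n_chunks=40, percentile=95, order=3, n_clip=3):
    """Continuum-normalize one chip spectrum, ignoring masked pixels."""
    flux = np.asarray(flux, dtype=float)
    n = flux.size
    good = np.ones(n, dtype=bool) if bad is None else ~np.asarray(bad, dtype=bool)
    u = 2.0 * np.arange(n) / (n - 1) - 1.0   # pixel coordinate rescaled to [-1, 1]

    # One (position, 95th percentile) point per chunk, using unmasked pixels only.
    xs, ys = [], []
    for idx in np.array_split(np.arange(n), n_chunks):
        keep = idx[good[idx]]
        if keep.size:                        # skip chunks that are fully masked
            xs.append(u[keep].mean())
            ys.append(np.percentile(flux[keep], percentile))
    xs, ys = np.array(xs), np.array(ys)

    # Stand-in for a robust fit: least squares with iterative 3-sigma clipping.
    use = np.ones(xs.size, dtype=bool)
    for _ in range(n_clip):
        coeffs = np.polyfit(xs[use], ys[use], order)
        resid = ys - np.polyval(coeffs, xs)
        sigma = resid[use].std()
        new_use = np.abs(resid) <= 3 * sigma
        if sigma == 0 or new_use.sum() <= order or np.array_equal(new_use, use):
            break
        use = new_use
    coeffs = np.polyfit(xs[use], ys[use], order)

    continuum = np.polyval(coeffs, u)
    return flux / continuum, continuum


# Synthetic test spectrum: sloped continuum, three absorption lines, one bad pixel.
pix = np.arange(2048)
true_cont = 100.0 * (1 + 0.1 * pix / 2048)
depth = 0.5 * np.exp(-0.5 * ((pix[:, None] - np.array([300, 900, 1500])) / 3.0) ** 2)
flux = true_cont * (1 - depth.sum(axis=1))
bad = np.zeros(pix.size, dtype=bool)
flux[100], bad[100] = 1e6, True
norm, cont = normalize_chip(flux, bad=bad)
```

Because the bad pixel is excluded before the percentiles are computed, it does not pull the continuum fit upward, which is the reason masking comes first.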


After the spectra are “prepared” by continuum normalization, radial velocities are always determined by cross-correlating a given spectrum against a template spectrum. Both spectra are on the same logarithmic wavelength scale, so a Doppler shift corresponds to a constant shift in the x-dimension. A Gaussian is fit to the peak of the cross-correlation function to determine the best spectral shift more accurately. Finally, the shift and its uncertainty are converted to velocity units.
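The cross-correlation, peak fit, and conversion to velocity can be sketched as below. Several details are assumptions made for illustration: the function name `ccf_velocity`, the log10 wavelength step of 6e-6 per pixel, the use of `np.roll` (which wraps around at the array edges), and fitting the Gaussian as a parabola in the logarithm of the five points around the peak. The uncertainty estimate is not covered.

```python
import numpy as np

C_KMS = 299792.458   # speed of light in km/s
DLOGLAM = 6e-6       # assumed log10(wavelength) step per pixel


def ccf_velocity(spec, template, max_lag=50, dloglam=DLOGLAM):
    """Return (pixel shift, velocity in km/s) of spec relative to template."""
    s = spec - np.mean(spec)
    t = template - np.mean(template)
    lags = np.arange(-max_lag, max_lag + 1)
    # np.roll wraps around at the edges; acceptable here for small lags.
    ccf = np.array([np.sum(s * np.roll(t, lag)) for lag in lags])

    k = int(np.argmax(ccf))
    lo, hi = max(k - 2, 0), min(k + 3, lags.size)
    x, y = lags[lo:hi], ccf[lo:hi]
    shift = float(lags[k])                     # fall back to the integer peak
    if x.size >= 3 and np.all(y > 0):
        # A Gaussian is a parabola in log space: ln y = a x^2 + b x + c.
        a, b, _ = np.polyfit(x, np.log(y), 2)
        if a < 0:
            shift = -b / (2 * a)

    # On a log10 wavelength grid, a shift of n pixels scales wavelength by 10**(n * dloglam).
    velocity = C_KMS * (10 ** (shift * dloglam) - 1)
    return shift, velocity


# Synthetic check: a template with random absorption lines, shifted by 3 pixels.
rng = np.random.default_rng(0)
pix = np.arange(2000)
centers = rng.uniform(50, 1950, 30)
template = 1 - 0.4 * np.exp(-0.5 * ((pix[:, None] - centers) / 3.0) ** 2).sum(axis=1)
spec = np.roll(template, 3)
shift, velocity = ccf_velocity(spec, template)
```

With the assumed pixel step, one pixel corresponds to roughly 4.1 km/s, so sub-pixel interpolation of the peak is what makes precisions well below a pixel possible.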

Relative Radial Velocities

The relative radial velocities are determined by using the combined spectrum as the RV template spectrum. This is done iteratively. The relative RVs are determined first, and then the combined spectrum is created using the relative RVs to shift the visit spectra to a common (mean) velocity wavelength scale. For the first iteration, when no combined spectrum exists yet, the highest S/N visit spectrum is used as the template; low S/N spectra are smoothed in the first two iterations. For all subsequent iterations, the combined spectrum is used as the template. Each iteration finds small shifts in the shifted and resampled visit spectra compared to the combined spectrum, and makes adjustments until the values converge.
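The iteration described above can be sketched with integer-pixel shifts. This is a simplified sketch, not the pipeline algorithm: the function names are hypothetical, shifts are restricted to whole pixels, `np.roll` stands in for proper shifting and resampling, the combination is an S/N-weighted mean (an assumption), and the smoothing of low S/N spectra is omitted.

```python
import numpy as np


def best_lag(spec, template, max_lag=20):
    """Integer-pixel lag that maximizes the cross-correlation."""
    s = spec - spec.mean()
    t = template - template.mean()
    lags = np.arange(-max_lag, max_lag + 1)
    ccf = [np.sum(s * np.roll(t, lag)) for lag in lags]
    return int(lags[int(np.argmax(ccf))])


def iterate_relative_shifts(visits, snr, max_iter=10):
    """Iteratively align visit spectra; return (shifts, combined spectrum)."""
    visits = np.asarray(visits, dtype=float)
    snr = np.asarray(snr, dtype=float)
    shifts = np.zeros(len(visits), dtype=int)
    template = visits[np.argmax(snr)]        # first pass: highest-S/N visit
    for _ in range(max_iter):
        # Measure residual shifts of the currently aligned visits vs. the template.
        aligned = [np.roll(v, -s) for v, s in zip(visits, shifts)]
        resid = np.array([best_lag(a, template) for a in aligned])
        shifts += resid
        # Rebuild the combined spectrum with the updated shifts.
        aligned = [np.roll(v, -s) for v, s in zip(visits, shifts)]
        combined = np.average(aligned, axis=0, weights=snr)
        if not resid.any():                  # converged: no further adjustments
            break
        template = combined                  # later passes: combined spectrum
    return shifts, combined


# Three noise-free visits of the same synthetic star at different pixel shifts.
rng = np.random.default_rng(1)
pix = np.arange(500)
centers = rng.uniform(20, 480, 15)
base = 1 - 0.4 * np.exp(-0.5 * ((pix[:, None] - centers) / 2.5) ** 2).sum(axis=1)
visits = [np.roll(base, 0), np.roll(base, 2), np.roll(base, -3)]
shifts, combined = iterate_relative_shifts(visits, snr=[50, 100, 30])
```

In this sketch the shifts end up relative to the highest-S/N visit, since that visit seeds the first template.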

Absolute Radial Velocities

After accounting for the relative RVs in the visit spectra to create the combined spectrum, the latter is still at the mean RV of the star, which must be removed. The combined spectrum is cross-correlated against each synthetic spectrum in the “RV mini-grid”. The synthetic spectra have a resolution of 23,500, and are on the same logarithmically spaced wavelength scale as the APOGEE combined spectra. For each synthetic spectrum, the best RV and the χ² of the fit to the observed spectrum are derived. The template spectrum yielding the lowest χ² is chosen as the best-fitting spectrum, and cross-correlation with this spectrum provides the absolute RV.
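The template selection can be sketched as a loop over the grid that records the best shift and χ² for each template and keeps the minimum. The function names, the integer-pixel shifts, and the circular FFT-based cross-correlation are illustrative assumptions; the three "templates" here are random line lists, not spectra from the actual RV mini-grid.

```python
import numpy as np


def circular_lag(spec, template):
    """Integer lag of the circular cross-correlation peak, computed with FFTs."""
    n = spec.size
    s = spec - spec.mean()
    t = template - template.mean()
    ccf = np.fft.irfft(np.fft.rfft(s) * np.conj(np.fft.rfft(t)), n=n)
    k = int(np.argmax(ccf))
    return k if k <= n // 2 else k - n       # map to a signed lag


def best_template(spec, err, templates):
    """Return (index, lag, chi2) of the template with the lowest chi-squared."""
    best = None
    for i, tmpl in enumerate(templates):
        lag = circular_lag(spec, tmpl)
        chi2 = float(np.sum(((spec - np.roll(tmpl, lag)) / err) ** 2))
        if best is None or chi2 < best[2]:
            best = (i, lag, chi2)
    return best


# Hypothetical three-template grid; the "observed" spectrum is template 1 shifted by 4 pixels.
rng = np.random.default_rng(2)
pix = np.arange(1000)
templates = []
for _ in range(3):
    centers = rng.uniform(30, 970, 20)
    templates.append(1 - 0.4 * np.exp(-0.5 * ((pix[:, None] - centers) / 3.0) ** 2).sum(axis=1))
observed = np.roll(templates[1], 4)
index, lag, chi2 = best_template(observed, np.full(pix.size, 0.01), templates)
```

Since no interpolation is performed within the grid, the result is always one of the grid spectra, which is why the stored template parameters are only coarse.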

The RV mini-grid is composed of 538 synthetic spectra that span a large range of stellar parameters:

2700 < Teff < 30,000 K
0.0 < log g < 5.0
-2.5 < [Fe/H] < +0.5

Note that the step sizes and ranges for log g and [Fe/H] vary with effective temperature, and that no interpolation is performed within the grid. A number of spectra with both high carbon and high alpha elements are included to help serve as templates for carbon-rich and oxygen-rich stars.

While the parameters for the RV template are stored in the summary data files and the FITS headers (RV_TEFF, RV_LOGG, RV_FEH, etc.), these values are not the best estimates of the stellar parameters of the stars, as the ASPCAP stellar parameter and abundance pipeline provides much more sophisticated results.

Synthetic Radial Velocities

After the best-fitting synthetic template is determined, each individual visit spectrum is cross-correlated against this template to derive “synthetic radial velocities.” We prefer the relative velocities derived from the cross-correlation of each visit with the combined spectrum (as described above) because the combined spectrum should be a better template match, one that does not depend on the accuracy or completeness of the synthetic library; however, this may not hold for objects with low S/N. The synthetic RVs provide a velocity check for objects where there is a good library match. The scatter across multiple observations between the two types of RVs is stored in SYNTHSCATTER. When this value is larger than 1 km/s, the SUSPECT_RV_COMBINATION bit is set in the APOGEE_STARFLAG bitmask.
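The consistency check can be illustrated as follows. Two things here are assumptions rather than statements about the pipeline: the scatter is taken to be the sample standard deviation of the per-visit differences between the two RV types, and the bit position used for SUSPECT_RV_COMBINATION (16) should be verified against the APOGEE_STARFLAG bitmask documentation. The function names and input values are hypothetical.

```python
import numpy as np

# Assumed bit position; verify against the APOGEE_STARFLAG documentation.
SUSPECT_RV_COMBINATION_BIT = 16


def synth_scatter(vhelio_kms, synthvhelio_kms):
    """Scatter (km/s) of the per-visit differences between the two RV types."""
    diff = np.asarray(vhelio_kms, float) - np.asarray(synthvhelio_kms, float)
    return float(np.std(diff, ddof=1)) if diff.size > 1 else 0.0


def flag_suspect_rv(starflag, scatter_kms, threshold_kms=1.0,
                    bit=SUSPECT_RV_COMBINATION_BIT):
    """Set the suspect-combination bit when the scatter exceeds the threshold."""
    if scatter_kms > threshold_kms:
        starflag |= 1 << bit
    return starflag


# Hypothetical visit velocities (km/s) for two stars.
consistent = synth_scatter([10.0, 10.2, 9.9], [10.05, 10.15, 9.95])
discrepant = synth_scatter([10.0, 10.2, 9.9], [10.1, 13.0, 9.0])
flag_ok = flag_suspect_rv(0, consistent)
flag_bad = flag_suspect_rv(0, discrepant)
```

When reading catalog values, the same bit test, `starflag & (1 << bit) != 0`, identifies stars for which the two RV types disagree.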

Barycentric Correction

Radial velocities in APOGEE are reported with respect to the center of mass of the Solar System (the barycenter). The individual exposures are corrected for the motion of the Earth along the line of sight to the star during each observation. This is called the “barycentric correction”, and it can be calculated very accurately (to m/s levels). When these corrections are applied to the absolute RVs from above, we obtain the RV with respect to the barycenter, or Vhelio for short.
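For readers who want to see how a barycentric correction combines with a measured velocity, the sketch below uses the standard first-order composition, in which a small cross term is added to the simple sum. The function name and the numbers are hypothetical, and the sketch illustrates the arithmetic only; it is not a description of where or how the pipeline applies the correction.

```python
C_KMS = 299792.458   # speed of light in km/s


def apply_barycentric_correction(v_measured_kms, v_bary_corr_kms):
    """Combine a measured RV with a barycentric correction (both in km/s).

    The cross term accounts for the fact that Doppler factors multiply
    rather than add; it matters only at the m/s level.
    """
    return (v_measured_kms + v_bary_corr_kms
            + v_measured_kms * v_bary_corr_kms / C_KMS)


# Hypothetical values: 100 km/s measured, +25 km/s correction.
v_bary = apply_barycentric_correction(100.0, 25.0)
cross_term_ms = 1000.0 * (v_bary - 125.0)   # size of the cross term in m/s
```

For these values the cross term is several m/s, which is small compared with the typical RV scatter but not negligible relative to the accuracy of the correction itself.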

Radial Velocity Uncertainties

The RV uncertainty depends on the S/N, the resolution, and the information in the spectral lines themselves. A spectrum with many deep, narrow lines (such as in cool and metal-rich stars) will have a much more precise RV than a spectrum with a few shallow, broad lines (such as in hot stars). We can easily estimate the RV uncertainty in the APOGEE spectra by looking at the RV scatter for stars with multiple visits. The histogram of the RV scatter peaks at ~70 m/s (much less than our original survey target of 500 m/s), but it has a long tail at larger scatter. Much of this tail is due to real variability from stellar binaries. The observed scatter is stored in the VSCATTER parameter for each star, and it is probably the best indicator of whether a star is a binary (for stars with multiple visits). If VSCATTER > 1 km/s (i.e., much larger than the typical uncertainties), then the star is likely a binary. Note, however, that for stars with a single visit, VSCATTER will be set to zero.
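A catalog-level selection based on this criterion might look like the sketch below. The function name is hypothetical, and a per-star visit count (called `nvisits` here) is assumed to be available so that single-visit stars, whose VSCATTER is zero by construction, are excluded explicitly rather than silently treated as non-variable.

```python
import numpy as np


def binary_candidates(vscatter_kms, nvisits, threshold_kms=1.0):
    """Boolean mask of likely binaries: multiple visits and large RV scatter."""
    vscatter = np.asarray(vscatter_kms, dtype=float)
    nvisits = np.asarray(nvisits)
    return (nvisits > 1) & (vscatter > threshold_kms)


# Hypothetical catalog rows: VSCATTER in km/s and number of visits.
vscatter = [0.07, 3.2, 0.0, 1.5]
nvisits = [3, 5, 1, 1]
mask = binary_candidates(vscatter, nvisits)
```

The threshold is a parameter so that it can be tightened or relaxed, for example for hot stars whose intrinsic RV precision is poorer.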

Saved Quantities

The barycentric radial velocities, derived by cross-correlating each visit with the combined spectrum and then correcting by the absolute velocity from cross-correlation of the combined spectrum with a synthetic spectrum, are stored in VHELIO for each visit spectrum; an estimated error is stored in VERR.

For the combined spectrum, a signal-to-noise weighted average is stored in VHELIO_AVG and the scatter around this average is stored in VSCATTER. A S/N weighted pipeline-reported error is stored in VERR, and the median visit RV error is stored in VERR_MED; note, however, that these tend to be small, and VSCATTER may represent a better estimate of the true precision. Equivalent velocities, scatter, and errors derived by cross-correlation of each visit with the best-fitting synthetic spectrum are stored in SYNTHVHELIO, SYNTHVERR, SYNTHVHELIO_AVG, SYNTHVERR, and SYNTHVERR_MED, respectively. The scatter between the two different RVs is stored in SYNTHSCATTER.
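The per-star summary quantities can be illustrated with the sketch below. The exact estimators are assumptions, since the text does not specify them: the weights are taken to be the visit S/N, the scatter is the sample standard deviation about the weighted average, and the weighted error is the propagated error of the weighted mean. The function name and input values are hypothetical.

```python
import numpy as np


def combine_visit_rvs(vhelio_kms, verr_kms, snr):
    """Summarize per-visit RVs into per-star quantities (all in km/s)."""
    v = np.asarray(vhelio_kms, dtype=float)
    e = np.asarray(verr_kms, dtype=float)
    w = np.asarray(snr, dtype=float)          # assumed weights: visit S/N

    avg = np.sum(w * v) / np.sum(w)
    # Scatter about the weighted average; zero for a single visit.
    scatter = np.sqrt(np.sum((v - avg) ** 2) / (v.size - 1)) if v.size > 1 else 0.0
    # Propagated error of the weighted mean.
    err = np.sqrt(np.sum((w * e) ** 2)) / np.sum(w)
    return {
        "VHELIO_AVG": float(avg),
        "VSCATTER": float(scatter),
        "VERR": float(err),
        "VERR_MED": float(np.median(e)),
    }


# Hypothetical star with two visits of equal S/N.
summary = combine_visit_rvs([10.0, 12.0], [0.1, 0.1], [100.0, 100.0])
```

In this example the scatter is far larger than either error estimate, the situation in which VSCATTER is the more informative quantity.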