Using APOGEE Spectra

There are several important aspects relevant to all APOGEE spectra that you should be aware of as you examine and use the spectra. See the Table of Contents for important APOGEE parameters and aspects discussed on this page.

Types of APOGEE Spectra

Several different types of APOGEE spectra are available:

  • Combined spectra from all visits to a star are available in apStar files.
  • Individual visit spectra, one for each visit to each star, are available in apVisit files.
  • Pseudo-continuum normalized spectra that are used in the derivation of stellar parameters are available in aspcapStar files.

The construction of these files is described in other locations as linked above.

The combined spectra (apStar files) may be the most useful of these. These combined spectra are generated by resampling individual visits onto a common, logarithmically spaced wavelength scale (log λ_(i+1) - log λ_i = 6E-6, with a common starting wavelength of 15100.802 Ångstroms), after removing each visit’s derived radial velocity; thus the resulting spectra are in rest, vacuum wavelengths. Data from the entire APOGEE wavelength range (which includes some gaps; see below) are included in a single array. The wavelength scale is recorded in the header in standard FITS cards; thus, standard software should allow, e.g., straightforward plotting of flux vs. wavelength.
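As an illustration of the grid described above, the following minimal sketch rebuilds the wavelength array from the two numbers quoted in the text. It assumes the logarithm is base 10, which is consistent with the quoted starting wavelength (10^4.179 ≈ 15100.80). The function name and the pixel count are hypothetical; in practice both the starting value and the step should be read from the FITS header cards rather than hard-coded.

```python
import math

LOG_STEP = 6e-6          # log10 spacing between adjacent pixels (from the text)
START_WAVE = 15100.802   # starting vacuum wavelength in Angstroms (from the text)


def apstar_wavelengths(n_pix):
    """Return rest-frame vacuum wavelengths (Angstroms) for n_pix pixels.

    n_pix is supplied by the caller; the real array length should be taken
    from the file itself.
    """
    log_start = math.log10(START_WAVE)
    return [10.0 ** (log_start + LOG_STEP * i) for i in range(n_pix)]


# Small illustrative pixel count, not the real array length.
waves = apstar_wavelengths(5)
```

Because the spacing is constant in log wavelength, each pixel corresponds to the same velocity step, c × ln(10) × 6E-6, or roughly 4.14 km/s.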

The apStar files also include the individual visit spectra, resampled and shifted to rest wavelength.

The individual visit spectra are also available before resampling in apVisit files. Note that while these have wavelength calibration information, the native wavelength scale is not an evenly spaced linear or logarithmic scale. The wavelength information is included as a separate wavelength array, and also in a table that gives the parameters of the function used to fit the pixel-wavelength relation, but it is not stored in a form that standard software can use directly to plot flux against wavelength. For the apVisit files, spectra from each of the three chips are stored in different rows in the image extensions.

The spectra can be downloaded from the Science Archive Server (SAS) as described on the data access page.

Data Quality Flags

Information about the data quality of APOGEE spectra is encoded in several different bitmasks that are included with the spectra.

  • At the individual pixel level, the visit and combined spectra include a mask array in HDU3 in the apVisit and apStar files, with bits set according to the APOGEE_PIXMASK bitmask. This bitmask flags both bad pixels and “warning” pixels; data in bad pixels are definitely unreliable, while data in “warning” pixels may be unreliable.
  • At the individual visit level, information is included in an APOGEE_STARFLAG bitmask that is recorded in the FITS header as card STARFLAG.
  • At the combined spectrum level, there is a bitwise OR and a bitwise AND of the APOGEE_PIXMASK bitmasks from the individual visits that is output in HDU3 in the apStar files (rows 1 and 2), as well as a bitwise OR and a bitwise AND of the APOGEE_STARFLAG bitmasks from the individual visits that are recorded in the apStar headers in the STARFLAG and ANDFLAG cards.
  • If you inspect the bitmasks you will see that data in some locations may not be reliable; some of the specific reasons for this are discussed below. In many cases unreliable pixels may come in “chunks” of contiguous pixels; this can arise even if only a single pixel has bad data, because the combination of dithered spectra and spectra from multiple visits requires that values in the final spectra have contributions from multiple pixels in the raw input spectra. If a bad pixel is expected to have a significant contribution to any pixel in the final spectrum, then that final pixel will be flagged.
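The bitwise OR and AND combinations described above can be sketched as follows. This is only an illustration of the bit arithmetic: the helper names and mask values are made up, and the sketch ignores the resampling that the real pipeline performs before combining visits. Bit positions for specific flags are not given on this page and should be taken from the bitmask documentation.

```python
from functools import reduce
import operator


def bit_is_set(mask_value, bit):
    """Return True if the given bit position is set in an integer mask value."""
    return (mask_value >> bit) & 1 == 1


def combine_visit_masks(visit_masks):
    """Combine per-visit masks pixel by pixel.

    visit_masks: list of equal-length lists of ints, one list per visit.
    Returns (or_mask, and_mask): a bit is set in or_mask if it is set in ANY
    visit, and in and_mask only if it is set in ALL visits.
    """
    columns = list(zip(*visit_masks))
    or_mask = [reduce(operator.or_, column) for column in columns]
    and_mask = [reduce(operator.and_, column) for column in columns]
    return or_mask, and_mask


# Synthetic masks for two visits and three pixels (values are made up).
visit_masks = [
    [0b001, 0b000, 0b110],
    [0b001, 0b010, 0b100],
]
or_mask, and_mask = combine_visit_masks(visit_masks)
```

The OR mask is the conservative choice when rejecting pixels, while the AND mask identifies pixels that were flagged in every visit.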

Vacuum Wavelengths

The wavelength calibration of the APOGEE data is done using vacuum wavelengths. However, the wavelengths of atomic transitions are usually quoted at standard temperature and pressure (S.T.P.); this is how the CRC Handbook of Chemistry and Physics lists them for transitions redward of 2000 Ångstroms. Thus, recognizing spectral lines associated with specific atomic transitions may require converting the SDSS data to the equivalent values at S.T.P.  For APOGEE data, we have used the conversion from Ciddor (Applied Optics, Vol 35, p 1566, 1996) to convert between vacuum and air wavelengths. For a vacuum wavelength (VAC) in Ångstroms, convert to air wavelength (AIR) using the equation:

AIR = VAC / (1.0 + 5.792105E-2/(238.0185E0 - (1.E4/VAC)^2) + 1.67917E-3/(57.362E0 - (1.E4/VAC)^2))
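A direct transcription of this expression into code might look like the sketch below. The function name is an assumption for illustration; the coefficients are copied from the equation above, and the input must be in Ångstroms.

```python
def vacuum_to_air(vac):
    """Convert a vacuum wavelength in Angstroms to the air wavelength.

    Uses the expression quoted above; 1e4 / vac is the wavenumber in
    inverse microns when vac is in Angstroms.
    """
    sigma2 = (1.0e4 / vac) ** 2
    refractive_index = (
        1.0
        + 5.792105e-2 / (238.0185 - sigma2)
        + 1.67917e-3 / (57.362 - sigma2)
    )
    return vac / refractive_index


# Example in the H band: the air wavelength is a few Angstroms shorter.
air = vacuum_to_air(16000.0)
```

At H-band wavelengths the shift is between 4 and 5 Ångstroms, which is far larger than a pixel, so it matters when matching line lists quoted in air.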

Wavelength Coverage and Detector Gaps

The APOGEE spectra are recorded onto three different detectors (“chips”). While the overall coverage ranges from 1.514 to 1.696 microns, there are small gaps between the detectors, leading to gaps in the wavelength coverage. While all of the APOGEE spectra lie in the infrared H band, we sometimes refer to the chips as the “blue”, “green”, and “red” chips, going from the shorter wavelengths to longer wavelengths. Data products refer to the separate chips as chips “a”, “b”, and “c”, in the order in which they are read out. As it turns out, the “red” chip is the first one to read out, so this nomenclature is in reverse wavelength order. The following table explains the terminology.

chip  name     start wavelength  end wavelength  central dispersion
a     “red”    1.647 μm          1.696 μm        -0.236 Å/pix
b     “green”  1.585 μm          1.644 μm        -0.283 Å/pix
c     “blue”   1.514 μm          1.581 μm        -0.326 Å/pix

Note that the starting and ending wavelengths vary slightly from fiber to fiber because of variations of their placement along the instrument pseudo-slit. The dispersion varies with wavelength, and, to a lesser extent, with fiber.
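For quick checks of whether a line of interest falls on a detector or in a gap, the nominal ranges in the table can be turned into a small lookup. This is a rough sketch with hypothetical names; because the true boundaries shift slightly from fiber to fiber, wavelengths near a chip edge should be checked against the actual spectrum.

```python
# Nominal chip boundaries in microns, copied from the table above.
CHIP_RANGES = {
    "a": ("red", 1.647, 1.696),
    "b": ("green", 1.585, 1.644),
    "c": ("blue", 1.514, 1.581),
}


def chip_for_wavelength(wave_um):
    """Return (chip letter, colour name), or None for a gap or uncovered wavelength."""
    for chip, (name, start, end) in CHIP_RANGES.items():
        if start <= wave_um <= end:
            return chip, name
    return None


on_green = chip_for_wavelength(1.60)    # falls on chip "b"
in_gap = chip_for_wavelength(1.583)     # falls between "blue" and "green"
```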

Imperfect Subtraction of Night Sky Lines

The night sky lines (i.e., “airglow”), primarily from OH emission in the Earth’s atmosphere, can be extremely bright. In the current version of the pipeline a combination of the emission from sky fibers at sky positions near those of each target is subtracted from the spectra of the targets. However, this subtraction is almost always imperfect because (1) the sky spectra need to be shifted in wavelength to match the object spectra, because of variations of locations of the fibers along the pseudo-slit, and (2) the line spread function (LSF) of different fibers varies because of changes in image quality across the spectrograph camera field-of-view. Because the night sky lines are so bright, even small fractional variations due to these issues can cause the subtraction to be very noticeably imperfect; thus most sky lines are either under- or over-subtracted.

We note that, even if the airglow subtraction were perfect, the area of the spectrum “under” the sky lines would be of significantly lower signal-to-noise, because of the large Poisson contribution from the bright lines. Partly because of this, we have not yet put significant effort into improving the sky subtraction. Additional work along these lines may be done for subsequent data releases.

The imperfect sky subtraction does have the unfortunate result of making the APOGEE spectra appear a bit “ugly” to a quick, casual inspection. The APOGEE data products (e.g., apVisit and apStar files) include a record of the sky spectrum that was subtracted, and it is possible to use this as a guide to recognizing pixels that are likely to be affected by imperfect sky subtraction.
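One simple way to use the recorded sky spectrum as such a guide is to flag pixels where the subtracted sky is much brighter than its typical level. The sketch below is an assumption-laden illustration, not the pipeline's method: the function name, the factor-of-five threshold, and the sample values are all hypothetical, and it presumes a positive median sky level.

```python
from statistics import median


def likely_sky_affected(sky, threshold=5.0):
    """Flag pixels where the subtracted sky exceeds threshold x the median sky.

    sky: list of sky flux values, one per pixel.
    threshold: hypothetical multiplier; tune it for the data at hand.
    """
    level = median(sky)
    return [value > threshold * level for value in sky]


# Made-up sky spectrum with two bright airglow lines.
sky = [1.0, 1.2, 0.9, 40.0, 1.1, 1.0, 25.0, 1.0]
flags = likely_sky_affected(sky)
```

Pixels flagged this way can then be down-weighted or excluded when fitting, alongside any pixels already marked in the mask arrays.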

Error Arrays

All APOGEE spectra include an array of uncertainties (“errors”) for each pixel. These uncertainties are initially calculated from the raw pixel data based on the inherent properties of the detectors (gain and readout noise). These raw errors are propagated into subsequent data products.

However, in downstream spectral products, data in any given pixel may have been derived from some combination of pixels in the raw data, and data from any individual raw pixel may contribute to more than one pixel in the combined spectra, leading to correlated errors between pixels. This can occur in visit spectra because these are combined from two separate dithered observations. If dithers are spaced by exactly 0.5 pixels, then the combined spectrum simply interleaves the two dithered exposures, but if the dithers are slightly imperfect (as they generally are), any pixel in the combined well-sampled spectrum will have contributions from multiple raw pixels. For the visit-combined apStar spectra, the pixels definitely have contributions from multiple raw pixels, because the apStar spectra are RV-corrected and resampled onto a common wavelength grid. Although the uncertainties are propagated into the apVisit and apStar spectra, this propagation ignores the correlation of uncertainties that results from having processed pixels that are derived from multiple raw pixels.
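The idealized case of an exact half-pixel dither can be sketched as a plain interleave, in which every output pixel comes from exactly one raw pixel and no correlation is introduced. The function name is hypothetical, and which exposure supplies the even-indexed pixels is an arbitrary choice here; real dithers are not exact, so the pipeline has to interpolate instead.

```python
def interleave_dithers(first, second):
    """Interleave two equal-length dithered exposures into one spectrum.

    Assumes the exposures are offset by exactly 0.5 pixel, with `first`
    supplying the even-indexed output pixels.
    """
    if len(first) != len(second):
        raise ValueError("dithered exposures must have the same length")
    combined = [0.0] * (2 * len(first))
    combined[0::2] = first
    combined[1::2] = second
    return combined


combined = interleave_dithers([1.0, 3.0, 5.0], [2.0, 4.0, 6.0])
```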

Multiple observations of selected targets have been used to estimate empirical uncertainties, and these demonstrate that, for most targets, the calculated uncertainties are reasonable, i.e., the scatter from observation to observation is comparable to the estimated uncertainty in each observation. However, for very bright targets the calculated uncertainties are almost certainly an underestimate, because the accuracy of these data is most likely limited by systematic errors in the data processing and calibration data products. We have not yet fully quantified these, but we think it is likely that there is an uncertainty “floor” around the 0.5% level, i.e., a maximum S/N of ~200. We have not set such a floor in the spectrum uncertainty arrays, so users need to beware that there is a likely maximum S/N~200.
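Since the floor is not applied in the released uncertainty arrays, users who want it can impose it themselves. A minimal sketch, assuming the 0.5% figure quoted above and hypothetical names and sample values:

```python
def apply_error_floor(flux, err, floor_fraction=0.005):
    """Return uncertainties no smaller than floor_fraction x |flux| per pixel.

    floor_fraction=0.005 corresponds to the ~0.5% floor (S/N ~ 200)
    suggested in the text; it is an estimate, not a measured value.
    """
    return [max(e, floor_fraction * abs(f)) for f, e in zip(flux, err)]


# Made-up values: the first pixel claims S/N = 500, the second S/N = 100.
flux = [1000.0, 1000.0]
err = [2.0, 10.0]
floored = apply_error_floor(flux, err)
```

Only the first pixel is changed here, because its quoted uncertainty implied a S/N above the floor.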

Bad Pixels/Missing Regions

The IR detectors are not cosmetically perfect, leading to small regions of each chip that are bad, as well as a significant number of bad or “hot” pixels. These are flagged during the data processing, and can lead to bad or missing regions in any given spectrum. Because visit spectra are combined from multiple individual dithered spectra, a single bad pixel can propagate into multiple pixels in the visit-combined spectra. These can have the effect, along with poorly subtracted sky lines, of making individual visit spectra look rather “ugly”. The mask arrays can be used to identify the cause of most bad pixels.

Because any given star will typically not use the same fiber in different visits, and especially if the observed radial velocity (including differences in barycentric RV) of a target differs significantly from visit-to-visit, combined spectra generally look somewhat cleaner. However, even if the combined spectra do not have regions with data missing, there may be regions where the noise level is elevated if that portion of the spectrum landed on a bad region of one of the arrays in one or more of the visits.


Littrow Ghost

The use of VPH gratings results in the production of some “ghosts” on the 2-D images. The most prominent of these is the “Littrow ghost”, which for APOGEE falls somewhere in the wavelength region 1.624 to 1.626 microns, depending on the fiber.

The amplitude of the ghost depends on the brightness of other stars in the field, so it does not always contribute a significant amount of flux. Pixels possibly affected by the Littrow ghost are flagged with the LITTROW_GHOST bit in the APOGEE_PIXMASK bit mask.

Fiber Cross Talk

To pack the spectra of as many stars as possible across the APOGEE detectors, the spacing between adjacent spectra is relatively small, amounting to ~ 6.5 pixels between adjacent PSF peaks. Therefore, the wings of the PSF overlap slightly with adjacent spectra, in particular if a faint object is located adjacent to a much brighter object. To mitigate this, the targets on each plate are sorted into three brightness categories — bright (B), medium (M), and faint (F) — and these categories are placed along the pseudo-slit (and hence, on the detectors) in the order FMBBMF FMBBMF …, so, in principle, a faint object (or sky) should never land next to a much brighter object. However, the magnitude ranges of these brightness categories can be broad, and make it possible for objects of significantly different brightness to have spectra adjacent to one another.

The extraction portion of the data reduction pipeline accounts for contributions of light from the two adjacent spectra for each object. However, the quality of this extraction depends on a high-quality knowledge of the amplitude of the wings of the light distribution. In cases where adjacent targets are significantly brighter than an object, small inaccuracies in the PSF model may lead to significant errors in the extraction of adjacent spectra.

For each visit, a bit is set in the APOGEE_STARFLAG bitmask if an adjacent object is more than 100 times brighter than the star (VERY_BRIGHT_NEIGHBOR) or more than 10 times brighter (BRIGHT_NEIGHBOR). In the former case, which is rare, the spectrum is automatically considered bad, i.e., it is not included in the combined spectrum.
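The two thresholds can be expressed as a simple classification on the brightness ratio. This sketch is illustrative only: the function name is hypothetical, how the pipeline measures the brightness of each object is not described here, and the sketch assumes that only the more severe flag is reported when both thresholds are exceeded.

```python
def neighbor_flag(star_flux, neighbor_flux):
    """Return the neighbor flag name implied by the flux ratio, or None.

    Fluxes must be positive and in the same (linear) units.
    """
    ratio = neighbor_flux / star_flux
    if ratio > 100.0:
        return "VERY_BRIGHT_NEIGHBOR"
    if ratio > 10.0:
        return "BRIGHT_NEIGHBOR"
    return None


severe = neighbor_flag(1.0, 250.0)
moderate = neighbor_flag(1.0, 30.0)
clean = neighbor_flag(1.0, 3.0)
```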

Incomplete Data Acquisition

DR12 contains the complete collection of all APOGEE data that were collected before August 2014 (including the complete APOGEE data set from SDSS-III). The full complement of visits, leading to a net S/N > 100, was successfully acquired for the majority of APOGEE targets. However, for a small number of APOGEE fields, the full set of visits was not completed. Therefore, the S/N for the combined spectra of some stars in these fields (generally the faintest ones) may not achieve the survey goal of S/N > 100.

Persistence in the "Blue" Detector

Some areas of the detectors used in the APOGEE instrument suffer from a problem that we refer to as “superpersistence”. In these locations on the detector, previous exposure to light causes a glow in subsequent images that can be very significant and last for a significant amount of time. The problem is most severe on about 1/3 of the “blue” chip, i.e., the chip that records wavelengths between 1.514 and 1.581 microns. The orientation of the chip is such that this region affects essentially all of this wavelength region for the 1/3 of the fibers that put light into this area. There are also regions in the “green” chip that are affected by a lower level of superpersistence, but these are not so cleanly delineated by fiber or wavelength.

The effect of superpersistence depends on the prior exposure history for each fiber, and likely on the brightness of the target being recorded. Some level of mitigation is provided by the fiber management system described above, because the grouping of fibers according to target brightness makes it relatively uncommon for a faint target to be observed through a fiber that was previously placed on a bright target. However, because the magnitude ranges that define these categories are broad, there can still be cases where faint targets follow relatively brighter ones. In addition, calibration flat field exposures are taken between every plate to map the distribution of light between fibers and to measure the fiber-to-fiber throughput variations, and these roughly evenly-illuminated frames give rise to some superpersistence.

Superpersistence is a complex phenomenon, and in DR12 we have not implemented any correction for it. Subsequent data releases may attempt to incorporate some kind of correction.

The effect of superpersistence can be very significant and easily noticed: the flux levels in the region of the spectrum affected can be enhanced by tens of percent or more. This enhancement is likely to have some wavelength dependence, so spectral features might be distorted. However, depending on the brightness of the target and preceding ones, it is not guaranteed that the spectra are significantly adversely affected, so we do not immediately call all data that falls in the persistence region bad.

In the data reduction pipeline, we flag all pixels in the APOGEE_PIXMASK bitmask where significant superpersistence is known to occur, with three different flags corresponding to the level of the effect: PERSIST_HIGH, PERSIST_MED, and PERSIST_LOW. In addition, we have a visit level flag, APOGEE_STARFLAG, for each object, with bits that get set when a significant fraction (>20%) of the pixels in the spectrum are affected, again split into categories PERSIST_HIGH, PERSIST_MED, and PERSIST_LOW. We also look for evidence in the spectra of a “jump” in flux between the “green” and the “blue” chips, and if this is present at an easily recognized level, we set a flag PERSIST_JUMP_HIGH or PERSIST_JUMP_LOW if the “blue” portion of the spectrum seems abnormally high or abnormally low (the latter could occur, e.g., if a sky fiber from a region affected by superpersistence is used for sky subtraction, although the pipeline takes some steps to try to avoid this occurrence).
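The relation between the pixel-level and visit-level persistence flags can be sketched as a fraction test. This is a hedged illustration rather than the pipeline's code: the function name is hypothetical, the bit positions in the example are placeholders that must be checked against the APOGEE_PIXMASK documentation, and the assumption that the 20% test is applied separately to each level is ours.

```python
def persistence_star_flags(pixmask, persist_bits, min_fraction=0.20):
    """Return the persistence levels affecting more than min_fraction of pixels.

    pixmask: list of integer mask values, one per pixel.
    persist_bits: dict mapping flag name -> bit position (supplied by caller).
    """
    n_pix = len(pixmask)
    flagged = []
    for name, bit in persist_bits.items():
        count = sum(1 for value in pixmask if (value >> bit) & 1)
        if n_pix and count / n_pix > min_fraction:
            flagged.append(name)
    return flagged


# Placeholder bit positions for illustration only; verify before use.
PERSIST_BITS = {"PERSIST_HIGH": 9, "PERSIST_MED": 10, "PERSIST_LOW": 11}

# Made-up 10-pixel mask: 3 pixels at the high level, 1 at the low level.
pixmask = [1 << 9] * 3 + [1 << 11] * 1 + [0] * 6
star_flags = persistence_star_flags(pixmask, PERSIST_BITS)
```

In this example 30% of the pixels carry the high-level bit, so only that level passes the 20% threshold.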

In the combined spectra, we provide star level flags that are bitwise AND and bitwise OR combinations of the visit APOGEE_STARFLAG flags, so you can tell whether a given object was marked as having a significant number of pixels in the superpersistence region in all or any of the visit spectra that went into the combination.

As noted in Holtzman et al. 2015, the effects of persistence can be seen in some of the individual element abundances.