Using APOGEE Spectra

Note that the APOGEE data in DR15 are identical to those in DR14, but use the DR15 documentation pages for reference.

There are several important aspects relevant to all APOGEE spectra that you should be aware of as you examine and use the spectra. See the Table of Contents for important APOGEE parameters and aspects discussed on this page.

Types of APOGEE Spectra

Several different types of APOGEE spectra are available:

Individual Visit Spectra: spectra of each visit to each star are available in apVisit files.
Combined Spectra: spectra combined from all visits to a star are available in apStar files.
Pseudo-Continuum Normalized Spectra: spectra that are used in the derivation of stellar parameters are available in aspcapStar files.

The construction of these files is described in other locations as linked above.

The combined spectra in the apStar files may be the most useful. These combined spectra are generated by resampling individual visits onto a common, logarithmically spaced wavelength scale after removing each visit's derived radial velocity (log λ(i+1) – log λ(i) = 6E-6, with a common starting wavelength of 15100.802 Angstroms). The resulting spectra are in rest, vacuum wavelengths. Data from the entire APOGEE wavelength range (which includes some gaps) are included in a single array. The wavelength scale is recorded in the header in standard FITS cards; thus, standard software should be able to plot flux vs. wavelength and perform other tasks in a straightforward way.
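As a minimal sketch of the wavelength scale described above, the grid can be rebuilt from the two quoted numbers. This assumes that "log" means log base 10 and that pixel 0 sits at the starting wavelength; the function name and the pixel count used here are illustrative only, and the real array length should be taken from the file itself.

```python
import math

# Values quoted in the text above.
START_WAVELENGTH = 15100.802   # Angstroms, wavelength of the first pixel
LOG_STEP = 6e-6                # step in log10(wavelength) per pixel (assumed base 10)

def apstar_wavelengths(n_pixels, start=START_WAVELENGTH, log_step=LOG_STEP):
    """Return rest-frame vacuum wavelengths (Angstroms) for pixels 0..n_pixels-1."""
    log_start = math.log10(start)
    return [10.0 ** (log_start + i * log_step) for i in range(n_pixels)]

# n_pixels=5 is only for illustration; use the real spectrum length in practice.
wave = apstar_wavelengths(5)
```

Because the spacing is constant in log wavelength, each pixel corresponds to the same velocity width, which is convenient when working with radial velocity shifts.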

The apStar files also include the individual visit spectra that have been resampled and shifted to rest wavelength.

The apVisit files contain individual visit spectra before resampling and removal of radial velocities. Note that while these have wavelength calibration information, the native wavelength scale is not an evenly spaced linear or logarithmic scale. The wavelength information is included as a separate wavelength array. The wavelength information is also available in a table that provides the parameters of the function used to fit the pixel-wavelength relation and this information is required if you wish to plot flux against wavelength. For the apVisit files, spectra from each of the three chips are stored in different rows in the image extensions.

The spectra can be downloaded from the Science Archive Server (SAS) as described on the data access page.

Data Quality Flags

Information about the data quality of APOGEE spectra is encoded in several different bitmasks that are included with the spectra.

  • At the individual pixel level, the visit and combined spectra include a mask array in HDU3 in the apVisit and apStar files, with bits set according to the APOGEE_PIXMASK bitmask. This bitmask flags both bad pixels and "warning" pixels; data in bad pixels are definitely unreliable, while data in "warning" pixels may be unreliable.
  • At the individual visit level, information is included in an APOGEE_STARFLAG bitmask that is recorded in the FITS header as card STARFLAG.
  • At the combined spectrum level, there is a bitwise OR and a bitwise AND of the APOGEE_PIXMASK bitmasks from the individual visits that is output in HDU3 in the apStar files (rows 1 and 2), as well as a bitwise OR and a bitwise AND of the APOGEE_STARFLAG bitmasks from the individual visits that are recorded in the apStar headers in the STARFLAG and ANDFLAG cards.
  • If you inspect the bitmasks you will see that data in some locations may not be reliable; some of the specific reasons for this are discussed below. In many cases, unreliable pixels may come in "blocks" of contiguous pixels; this can arise even if only a single pixel within the block has bad data, because the combination of dithered spectra and spectra from multiple visits requires that values in the final spectra have contributions from multiple pixels from the raw input spectra. If a bad pixel is expected to have a significant contribution to any pixel in the final spectrum, then that final pixel will be flagged.
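The bitwise OR and AND combinations described above can be illustrated with plain integers. This is a sketch rather than pipeline code: the bit position and the mask values are made up, and the real bit assignments must be looked up in the APOGEE_PIXMASK and APOGEE_STARFLAG bitmask documentation.

```python
from functools import reduce

# Hypothetical bit position, for illustration only.
EXAMPLE_BIT = 0

def bit_is_set(mask_value, bit):
    """True if the given bit is set in an integer mask value."""
    return bool((mask_value >> bit) & 1)

def combine_or(values):
    """Bitwise OR: a bit is set if it is set in ANY visit."""
    return reduce(lambda a, b: a | b, values, 0)

def combine_and(values):
    """Bitwise AND: a bit is set only if it is set in EVERY visit."""
    return reduce(lambda a, b: a & b, values)

# Mask values of the same pixel in three made-up visits.
visit_masks = [0b0001, 0b0101, 0b0100]
or_mask = combine_or(visit_masks)    # bits 0 and 2 set
and_mask = combine_and(visit_masks)  # no bit is common to all three visits
```

The OR mask is the conservative choice when you want to reject a pixel that was flagged in any visit; the AND mask identifies pixels that were flagged every time.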

Vacuum Wavelengths

The wavelength calibration of the APOGEE data is done using vacuum wavelengths. However, the wavelengths of atomic transitions in the optical and infrared are usually quoted at standard temperature and pressure (S.T.P.); this is how the CRC Handbook of Chemistry and Physics lists them for transitions redward of 2000 Ångstroms. Thus, spectral lines associated with specific atomic transitions may require converting the SDSS data to the equivalent values at S.T.P.  For APOGEE data, the conversion from Ciddor (Applied Optics, Vol 35, p 1566, 1996) has been employed to convert between vacuum and air wavelengths. For a vacuum wavelength (VAC) in Ångstroms, convert to air wavelength (AIR) using the equation:

AIR = VAC / (1.0 + 5.792105E-2/(238.0185E0 - (1.E4/VAC)^2) + 1.67917E-3/(57.362E0 - (1.E4/VAC)^2))
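The conversion can be written as a short function. The sketch below simply transcribes the equation given above; the function name is illustrative, and the input is assumed to be a vacuum wavelength in Ångstroms.

```python
def vacuum_to_air(vac):
    """Convert a vacuum wavelength (Angstroms) to an air wavelength (Angstroms)."""
    # 1.E4/VAC is the wavenumber in inverse microns when VAC is in Angstroms.
    sigma2 = (1.0e4 / vac) ** 2
    # The two terms in the denominator of the equation give (n - 1),
    # where n is the refractive index of air.
    n_minus_1 = (5.792105e-2 / (238.0185 - sigma2)
                 + 1.67917e-3 / (57.362 - sigma2))
    return vac / (1.0 + n_minus_1)

# Example: a line at 16000 Angstroms (vacuum) in the APOGEE range.
air = vacuum_to_air(16000.0)
```

In the H band the shift is a few Ångstroms, which is many pixels at APOGEE dispersion, so mixing air and vacuum line lists will lead to misidentified features.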

Wavelength Coverage and Detector Gaps

The APOGEE spectra are recorded onto three different detectors ("chips"). While the overall coverage ranges from 1.514 to 1.696 microns, there are small gaps between the detectors, which result in gaps in the wavelength coverage. Although all of the APOGEE spectra lie in the infrared H band, the chips are sometimes referred to as the "blue", "green", and "red" chips, going from shorter to longer wavelengths. Data products refer to the separate chips as chips "a", "b", and "c", in the order in which they are read out. The "red" chip is the first one to be read out, so this nomenclature is in reverse wavelength order. The following table explains the terminology.

chip  name     start wavelength  end wavelength  central dispersion
a     "red"    1.647 μm          1.696 μm        -0.236 Å/pix
b     "green"  1.585 μm          1.644 μm        -0.283 Å/pix
c     "blue"   1.514 μm          1.581 μm        -0.326 Å/pix

Note that the starting and ending wavelengths vary slightly from fiber to fiber because of variations of their placement along the instrument pseudo-slit. The dispersion varies with wavelength, and, to a lesser extent, by fiber.
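As a rough sketch, the table can be turned into a lookup that reports which chip records a given wavelength. It uses only the nominal limits listed above and ignores the fiber-to-fiber variations just mentioned, so wavelengths close to a chip edge should be treated with caution. The names used are illustrative.

```python
# Nominal chip limits in microns, copied from the table above.
CHIP_RANGES = {
    "a": (1.647, 1.696),  # "red"
    "b": (1.585, 1.644),  # "green"
    "c": (1.514, 1.581),  # "blue"
}

def chip_for_wavelength(wavelength_um):
    """Return 'a', 'b' or 'c', or None for a gap or a wavelength outside the coverage."""
    for chip, (start, end) in CHIP_RANGES.items():
        if start <= wavelength_um <= end:
            return chip
    return None

print(chip_for_wavelength(1.600))  # lies on the "green" chip
print(chip_for_wavelength(1.583))  # lies in the gap between "blue" and "green"
```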

Imperfect Subtraction of Night Sky Lines

The night sky lines (i.e., "airglow"), primarily from OH emission in the Earth's atmosphere, can be extremely bright. In the current version of the pipeline, a combination of the emission from sky fibers at sky positions near those of each target is subtracted from the spectra of the targets. However, this subtraction is almost always imperfect because (1) the sky spectra need to be shifted in wavelength to match the object spectra (due to variations in the locations of the fibers along the pseudo-slit) and (2) the line spread function (LSF) of different fibers varies because of changes in image quality across the field-of-view of the spectroscopic camera. Because the night sky lines are so bright, even small fractional variations due to these issues can cause the subtraction to be very noticeably imperfect; thus most sky lines are either under- or over-subtracted.

Note that, even if the airglow subtraction were perfect, the area of the spectrum "under" the sky lines would be of significantly lower signal-to-noise, because of the large Poisson contribution from the bright lines. For this reason, significant effort has not yet been put into improving the sky subtraction. Additional work along these lines may be attempted for subsequent data releases.

The imperfect night sky line subtraction does have the unfortunate result of making the APOGEE spectra appear a bit "ugly" to a quick, casual inspection. The APOGEE data products (e.g., apVisit and apStar files) include a record of the sky spectrum that was subtracted, and it is possible to use this as a guide for recognizing pixels that are likely to be affected by imperfect sky subtraction.
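One simple way to use the recorded sky spectrum as such a guide is to mark pixels where the subtracted sky was much brighter than its typical level. The sketch below is only an illustration: the threshold factor is an arbitrary assumption, not a pipeline value, and it presumes a positive median sky level.

```python
import statistics

def likely_sky_affected(sky, factor=5.0):
    """Flag pixels whose subtracted sky flux exceeds factor times the median sky.

    `factor` is a hypothetical threshold chosen for illustration.
    """
    median_sky = statistics.median(sky)
    return [value > factor * median_sky for value in sky]

# Made-up sky spectrum with one bright line in the fourth pixel.
sky = [1.0, 1.2, 0.9, 40.0, 1.1]
flags = likely_sky_affected(sky)
```

Pixels flagged this way can then be down-weighted or excluded when measuring features in the object spectrum.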

Uncertainty Arrays

All APOGEE spectra include an array of uncertainties ("errors") for each pixel; these are given as the standard deviation of the flux values. These uncertainties are initially calculated from the raw pixel data based on the inherent properties of the detectors (gain and readout noise). These raw errors are propagated into the uncertainties for subsequent data products.

However, in downstream spectral products, data for any given pixel may have been derived from some combination of pixels in the raw data and data from any individual raw pixel may contribute to more than one pixel in the combined spectra. As a result, there may be correlated errors between pixels. This can occur in visit spectra because these are combined from two separate dithered observations. If dithers are exactly spaced by 0.5 pixels, then the spectral combination software just interleaves the two dithered exposures, but if the dithers are slightly imperfect (as they generally are), any pixel in the combined well-sampled spectrum will have contributions from multiple raw pixels. For the visit-combined apStar spectra, the pixels have contributions from multiple raw pixels, because the apStar spectra are RV-corrected and resampled onto a common wavelength grid. Although the uncertainties are propagated into the apVisit and apStar spectra, this propagation ignores the correlation of uncertainties that result from having processed pixels that are derived from multiple raw pixels.

Multiple observations of selected targets have been used to estimate empirical uncertainties, and these demonstrate that, for most targets, the calculated uncertainties are reasonable, i.e., the scatter from observation to observation is comparable to the estimated uncertainty in each observation. However, for very bright targets the calculated uncertainties are almost certainly underestimated because the accuracy of these data is most likely limited by systematic uncertainties from the data processing and in the calibration data products. These have not yet been fully quantified, but it is likely that there is an uncertainty "floor" around the 0.5% level, i.e., a maximum S/N of ~200. Such a floor has not been set in the spectrum uncertainty arrays, so users should be aware that there is a likely maximum S/N of ~200.
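Users who want to account for this floor can apply it themselves. The following is a minimal sketch, assuming the floor is taken as 0.5% of the absolute flux in each pixel; the function name and the example values are illustrative.

```python
def apply_uncertainty_floor(flux, err, floor_fraction=0.005):
    """Return uncertainties that are no smaller than floor_fraction * |flux|."""
    return [max(e, floor_fraction * abs(f)) for f, e in zip(flux, err)]

# Made-up pixels: the first has a nominal S/N of 500, the second an S/N of 100.
flux = [1000.0, 1000.0]
err = [2.0, 10.0]
floored = apply_uncertainty_floor(flux, err)
# The first uncertainty is raised to about 5.0 (S/N of 200);
# the second is already above the floor and is left unchanged.
```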

Bad Pixels/Missing Regions

The IR detectors used for APOGEE are not cosmetically perfect. Small regions of each chip are bad, and there are a significant number of individual bad or "hot" pixels. These are flagged during the data processing and can lead to bad or missing regions in any given spectrum. Because visit spectra are combined from multiple individual dithered spectra, a single bad pixel can propagate into multiple pixels in the visit-combined spectra. In combination with the poorly subtracted skylines, these bad pixels can have the effect of making individual visit spectra look rather "ugly". The mask arrays can be used to identify the cause of most bad pixels.

Because any given star will typically not use the same fiber for different visits, combined spectra generally look somewhat cleaner, especially if the observed radial velocity (including differences in barycentric RV) of a target differs significantly from visit-to-visit. However, even if the combined spectra do not have regions with missing data, there may be regions where the noise level is elevated if that portion of the spectrum landed on a bad region of one of the arrays in one or more of its visits.

Ghosts

The use of VPH gratings results in the production of some "ghosts" on the 2-D images. The most prominent of these is the "Littrow ghost", which for the APOGEE data falls somewhere in the wavelength region 1.624 to 1.626 microns, depending on the fiber.

The amplitude of the ghost depends on the brightness of other stars in the field, so it does not always contribute a significant amount of flux. Pixels possibly affected by the "Littrow ghost" are flagged with the LITTROW_GHOST bit in the APOGEE_PIXMASK bit mask.

Fiber Cross Talk

To pack the spectra of as many stars as possible across the APOGEE detectors, the spacing of adjacent spectra is ~6.5 pixels (measured between adjacent PSF peaks). Therefore, the wings of the PSF overlap slightly with adjacent spectra, and the effect is particularly apparent if an object is located adjacent to a much brighter object. To mitigate this, the targets on each plate are sorted into three brightness categories -- bright (B), medium (M), and faint (F) -- and these categories are placed along the pseudo-slit (and hence, on the detectors) in the order FMBBMF FMBBMF ... In principle, an object in the faint class or a sky fiber should never land next to a much brighter object. In practice, however, the magnitude ranges of these brightness categories can be broad, so objects of significantly different brightness can still have adjacent spectra.

The extraction portion of the data reduction pipeline accounts for contributions of light from the two adjacent spectra for each object. However, the quality of this extraction depends on high-quality knowledge of the amplitude of the wings of the light distribution. In cases where adjacent targets are significantly brighter than a given object, small inaccuracies in the PSF model may lead to significant errors in the extraction of the spectrum.

For each visit, a bit is set in the APOGEE_STARFLAG bitmask if an adjacent object is more than 100 times brighter than the star itself (VERY_BRIGHT_NEIGHBOR) or more than 10 times brighter (BRIGHT_NEIGHBOR). The former case, which is rare, is automatically considered as a bad spectrum and will not be included in the combined spectrum.
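The thresholds above can be sketched as a small function. How the pipeline measures brightness is not specified here, so the sketch assumes a simple flux ratio between the neighbor and the star and returns only the more severe flag name; both choices are assumptions made for illustration.

```python
def neighbor_flag(star_flux, neighbor_flux):
    """Return the name of the neighbor flag implied by a flux ratio, or None.

    Assumes star_flux > 0 and that "brighter" means a larger flux ratio.
    """
    ratio = neighbor_flux / star_flux
    if ratio > 100:
        return "VERY_BRIGHT_NEIGHBOR"
    if ratio > 10:
        return "BRIGHT_NEIGHBOR"
    return None

print(neighbor_flag(1.0, 250.0))  # VERY_BRIGHT_NEIGHBOR
print(neighbor_flag(1.0, 30.0))   # BRIGHT_NEIGHBOR
print(neighbor_flag(1.0, 5.0))    # None
```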

Incomplete Data Acquisition

DR15 contains all APOGEE-1 data as well as APOGEE-2 data that were collected before mid-July 2016. The full complement of visits, leading to a net S/N > 100, was successfully acquired for the majority of APOGEE targets. However, for a small number of APOGEE fields, the full set of visits was not completed. Therefore, the S/N for the combined spectra of some stars in these fields (generally the faintest ones) may not achieve the survey goal of S/N > 100. Note that DR15 contains spectra acquired at APO only (the APOGEE-2N site).

Persistence in the "Blue" Detector

Some areas of the detectors used in the APOGEE instrument suffer from a problem that is referred to as "superpersistence". In these locations on the detector, previous exposure to light causes a glow in subsequent images that can be substantial and last for a significant amount of time. The problem is most severe on about 1/3 of the "blue" chip, i.e. the chip that records wavelengths between 1.514 and 1.581 microns. The orientation of the chip is such that this region affects essentially all of this wavelength region for the fibers (1/3 of the total) that put light into this area. There are also regions in the "green" chip that are affected by a lower level of superpersistence, but these are not so clearly delineated by either fiber or wavelength coverage.

After the completion of the APOGEE-1 survey in summer 2014, the instrument was opened and the blue detector, which had the worst impact from superpersistence, was replaced with a detector with better performance. While this will mitigate the effects of superpersistence on data taken after summer 2014, DR15 includes data taken from before this date as well.

The effect of superpersistence depends on the prior exposure history for each fiber, and likely on the brightness of the target being recorded. Some level of mitigation is provided by the fiber management system described in the Fiber Cross Talk section, because the grouping of fibers according to target brightness makes it relatively uncommon for a faint target to be observed through a fiber that was previously placed on a bright target. However, because the magnitude ranges that define these categories are broad, there can still be cases where faint targets follow relatively brighter ones. In addition, calibration flat field exposures are taken between every plate to map the distribution of light between fibers and to measure the fiber-to-fiber throughput variations, and these roughly evenly-illuminated frames are sufficient to give rise to some superpersistence.

Superpersistence is a complex phenomenon. In DR13/DR14, we attempted to implement a first order superpersistence correction, based on some calibration data that were obtained, and scaling the results to try to match persistence observed in dark frames taken before most of the science exposures. We also implemented a scheme in which pixels affected by persistence are assigned a lower weight when different visits are combined. Additional description of how this problem is addressed will be available in Holtzman et al. (2018, submitted).

The effect of superpersistence can be significant and is easily noticed: the flux levels in the affected region of the spectrum can be enhanced by tens of percent or more. This enhancement is likely to have some wavelength dependence, meaning that spectral features may be distorted. However, depending on the brightness of the target and the preceding ones, it is not guaranteed that the spectra are adversely affected at a significant level, so we do not flag all data that fall within the superpersistence region as bad by default.

In the data reduction pipeline, we flag all pixels in the regions where significant superpersistence is known to occur in the APOGEE_PIXMASK bitmask using three different flags corresponding to the level of the effect: PERSIST_HIGH, PERSIST_MED, and PERSIST_LOW. In addition, we have a visit level flag, APOGEE_STARFLAG, for each object, with bits that get set when a significant number of pixels (>20%) in the spectrum are affected, which are again split into categories of PERSIST_HIGH, PERSIST_MED, and PERSIST_LOW. We also look for evidence in the spectra of a "jump" in flux between the "green" and the "blue" chips, and if this is present at an easily recognized level, we set a flag PERSIST_JUMP_HIGH or PERSIST_JUMP_LOW if the "blue" portion of the spectrum seems abnormally high or abnormally low (the latter could occur, e.g., if a sky fiber from a region affected by superpersistence is used for sky subtraction, although the pipeline takes some steps to try to avoid this occurrence).
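The >20% criterion can be illustrated by counting flagged pixels in a mask array. This is a sketch only: the bit position used below is hypothetical, and the real positions of the PERSIST_HIGH, PERSIST_MED, and PERSIST_LOW bits must be taken from the APOGEE_PIXMASK documentation.

```python
def flagged_fraction(pixel_masks, bit):
    """Fraction of pixels in which the given mask bit is set."""
    flagged = sum(1 for mask in pixel_masks if (mask >> bit) & 1)
    return flagged / len(pixel_masks)

def significantly_affected(pixel_masks, bit, threshold=0.20):
    """True if more than `threshold` of the pixels have the bit set."""
    return flagged_fraction(pixel_masks, bit) > threshold

HYPOTHETICAL_PERSIST_BIT = 1  # illustration only, not the real bit number

# Made-up mask array: 3 of 10 pixels carry the persistence bit.
masks = [0b10, 0b10, 0b10, 0, 0, 0, 0, 0, 0, 0]
affected = significantly_affected(masks, HYPOTHETICAL_PERSIST_BIT)
```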

In the combined spectra, star level flags are provided that are bitwise AND and bitwise OR combinations of the visit APOGEE_STARFLAG flags to indicate whether a given object was marked as having a significant number of pixels in the superpersistence region in all or any of the visit spectra comprising the combination. Starting in DR13, a scheme was implemented in which pixels affected by superpersistence are given inflated uncertainties to reduce their impact on the combined spectra. For stars in which some visits were affected by persistence but others were not, this has the effect of increasing the random uncertainties because the visits affected by the superpersistence are essentially ignored. For stars in which all visits are impacted by persistence, this has the effect of giving the persistence-affected wavelengths less weight in the ASPCAP fitting than pixels at other wavelengths.
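The effect of inflating uncertainties can be seen with a simple inverse-variance weighted mean. This is an illustrative sketch that assumes inverse-variance weighting; the actual combination scheme and inflation factor used by the pipeline are not specified here, and all numbers below are made up.

```python
def weighted_mean(fluxes, errors):
    """Inverse-variance weighted mean and its formal uncertainty."""
    weights = [1.0 / e ** 2 for e in errors]
    total = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / total
    return mean, (1.0 / total) ** 0.5

# One pixel in two visits; the second visit is enhanced by persistence.
fluxes = [1.00, 1.30]

# Without inflation both visits count equally and the mean is biased high.
plain_mean, plain_err = weighted_mean(fluxes, [0.01, 0.01])

# With the second uncertainty inflated tenfold (hypothetical factor),
# the affected visit is nearly ignored and the uncertainty grows.
down_mean, down_err = weighted_mean(fluxes, [0.01, 0.10])
```

In this toy case the down-weighted mean stays close to the unaffected visit, while its formal uncertainty is larger than it would be for two clean visits, which mirrors the trade-off described above.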

As noted in Holtzman et al. (2015), the effects of persistence can be seen in some of the individual element abundances for the DR12 data. While this may be true for some objects in the current data release, comparison of results for stars unaffected by persistence to those affected by persistence in all visits suggests that the down-weighting scheme has helped significantly to mitigate persistence effects. The stars most likely to still be impacted are those that are the faintest stars or those with weak absorption features. Additional discussion on these issues relevant to the DR13/DR14 reductions can be found in Holtzman et al. (2018, submitted).