Moving from APOGEE DR17 to DR19

Many science cases, such as chemical cartography of the Galaxy, will want to use the DR19 dataset to expand upon DR17 results using a similar analysis. On this page, we highlight some key differences between the versions of ASPCAP in DR17 and DR19, and between the structure of the files that hold the ASPCAP data. We also discuss how APOGEE spectra from the DR17 pipeline were processed by Astra for DR19.

In all the Astra summary files, such as astraAllStarASPCAP, the APOGEE results are in HDU2.

This table takes common data products from Data Release 17 and lists the closest analogue produced by Astra:

| DR17 Product(s) | Similar DR19 Product(s) | Comments |
| --- | --- | --- |
| apStar | apStar, mwmVisit, mwmStar | All combined spectra (BOSS and APOGEE) are stored in `mwmStar` files, and rest-frame visit spectra are stored in `mwmVisit` files. |
| specFull | specFull, mwmVisit | |
| allStar, allStarLite | astraASPCAP | |
| allVisit | mwmAllVisit | |
| aspcapStar | astraStarASPCAP | |

Difference between DR17 ASPCAP and DR19 ASPCAP

ASPCAP is integrated in Astra, with some critical caveats. ASPCAP itself can be described as a set of (hard-learned) choices on how best to execute FERRE with APOGEE spectra. The ASPCAP pipeline in Astra has no code in common with the ASPCAP pipeline used in SDSS-IV: it is a complete rewrite that closely mimics the decisions made by the earlier pipeline. Intended differences between the ASPCAP version in Astra and that in SDSS-IV are documented below.

  • The logic for the initial guess has changed. In SDSS-IV Data Release 17 the stellar parameters from the DOPPLER code were used to initialise FERRE in the coarse stage. The DOPPLER code uses The Cannon as a forward-model to measure the stellar radial velocity. In SDSS-V Data Release 19 we use the APOGEENet (Version 2) stellar parameters as the initial guess for FERRE in the coarse stage. The initial guess for microturbulence is set by the initial guess of surface gravity, using the same relation as used for SDSS-IV Data Release 17.
  • The continuum normalization treatment for the abundances stage is different. In SDSS-IV Data Release 17 the continuum was fit during the stellar parameter stage, and then the continuum was re-fit for each element abundance determination. This means the fitted continuum could differ between the stellar parameter stage and each element abundance determination. A further concern is that for some elements only a few pixels are unmasked, which means FERRE would be simultaneously fitting a flexible continuum and an element abundance from just a few pixels. For these reasons, and for consistency, in Data Release 19 we fit the continuum during the stellar parameter stage and then keep that continuum fixed when determining the elemental abundances.
  • The field names in the ASPCAP tables have changed. Some fields no longer exist (e.g., DDO51 photometry), many fields are new, and many have been renamed to be consistent with the glossary conventions across all data products.
  • The flag definitions have changed. Some new flags have been added, some old flags were never used, and the ordering of flags has changed. All ASPCAP flags are documented on this page. In Data Release 17 there was a `STAR_BAD` flag constructed from the `ASPCAPFLAG` values. In Data Release 19 information about the pipeline is stored in `result_flags` (which is the closest analogue of `ASPCAPFLAG`), and `flag_bad` is constructed from the `result_flags` values (where `flag_bad` is the closest analogue of `STAR_BAD` in Data Release 17).
  • The post-processing steps have changed. These steps are often described as calibration steps. In brief, the raw outputs from FERRE are consistently found to be biased relative to other data sources that are less model-dependent than spectroscopy (e.g., asteroseismology, interferometry, and photometry). For this reason, ASPCAP has always had a post-processing step in which the raw FERRE values are calibrated onto some reference scale. The ingredients of these steps have changed with each version of ASPCAP, and they have changed again in Data Release 19. For this reason, and many others listed in this section, users comparing ASPCAP results between Data Release 17 and Data Release 19 should expect to see differences. While the input model atmospheres, line lists, and spectral synthesis libraries are the same, the code is not, and the set of decisions taken to determine stellar parameters and abundances is very different from Data Release 17. All raw values from FERRE are available in the ASPCAP tables, in addition to the values computed from post-processing. A closer comparison between the two releases is therefore to compare the raw outputs of FERRE. In Data Release 17 these were encoded by the `PARAM` and `FPARAM` arrays, whereas in Data Release 19 they are stored in human-readable fields that have a `raw_` prefix (e.g., `raw_teff`). However, users should not expect these to be identical either, because the FERRE solution depends on the choice of initial guess.
  • We experienced unresolved issues with FERRE timeouts, which means that a random subset of good spectra will sometimes have no results. In the course of processing Data Release 19 spectra we encountered situations where FERRE would stall. This occurred more frequently with low signal-to-noise spectra, where the relative impact of imperfect sky subtraction was greater; however, we were unable to diagnose exactly which conditions cause FERRE to hang. For example, if we took one spectrum that caused FERRE to stall indefinitely and slightly inflated the flux uncertainties in 4 pixels, FERRE would no longer stall.
  • We developed tools to identify when FERRE stalls, kill the process, and re-start FERRE with the remaining spectra (randomly sorted). If the set of spectra to be analyzed by FERRE was small enough, this solution was sufficient. However, we often ran FERRE with very large sets of spectra. When this behaviour happened repeatedly (e.g., more than 10 times for one FERRE execution), the remaining spectra in that set were considered problematic and will not have results. This could mean that some spectra were labelled problematic when they were not: they were simply part of a set of spectra that was considered problematic.
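The stall handling described in the last two items can be sketched roughly as follows. This is an illustration only, not Astra code: `run_with_restarts`, `fake_ferre`, and the spectrum names are hypothetical, and a real FERRE execution is replaced by a stand-in function that "stalls" when it reaches a particular spectrum. The only detail taken from the text is the restart limit of 10.

```python
import random

MAX_RESTARTS = 10  # from the text: "more than 10 times for one FERRE execution"


def run_with_restarts(spectra, run_until_stall, max_restarts=MAX_RESTARTS, seed=0):
    """Process spectra, restarting with a shuffled remainder after each stall.

    `run_until_stall(batch)` is a hypothetical stand-in for one monitored
    FERRE execution. It returns (finished, stalled), where `stalled` is True
    if the process hung and had to be killed.
    Returns (completed, dropped); dropped spectra get no results.
    """
    rng = random.Random(seed)
    remaining = list(spectra)
    completed = []
    restarts = 0
    while remaining:
        finished, stalled = run_until_stall(remaining)
        completed.extend(finished)
        done = set(finished)
        remaining = [s for s in remaining if s not in done]
        if not stalled:
            break
        restarts += 1
        if restarts > max_restarts:
            break  # give up: everything still remaining is dropped
        rng.shuffle(remaining)  # re-start with the remainder, randomly sorted
    return completed, remaining


def fake_ferre(batch, bad=frozenset({"s3"})):
    """Toy stand-in: works through the batch in order, stalls on a 'bad' spectrum."""
    finished = []
    for name in batch:
        if name in bad:
            return finished, True
        finished.append(name)
    return finished, False


spectra = ["s0", "s1", "s2", "s3", "s4", "s5"]
completed, dropped = run_with_restarts(spectra, fake_ferre, max_restarts=0)
print(completed, dropped)
```

With the restart limit set to zero, the toy run drops `s4` and `s5` along with `s3`, even though only `s3` triggers the stall. This is the same effect described above, where good spectra can end up without results because they shared a set with a problematic one.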

DR17 APOGEE spectra in Astra

Astra is designed to handle many different types of spectra. As a stop-gap measure to maximize the scientific impact of Data Release 19, Astra ingests the APOGEE reduced data products from SDSS-IV and treats these spectra, as far as possible, as if they were part of SDSS-V. There are some caveats and edge cases in how the SDSS-IV APOGEE DR17 spectra are handled, and these are all explicitly described here. The key things that an astronomer will want to know are listed below, roughly ordered by importance (most important first):

  • In every summary table, from every pipeline, the APOGEE DR17 spectra are identifiable by selecting on the `apred` column to be equal to “dr17” (e.g., instead of 1.4, which is the tag used by the SDSS-V APOGEE Data Reduction Pipeline for SDSS-V Data Release 19).
  • Astra creates mwmVisit and mwmStar files from APOGEE DR17 spectra. If a star was observed in SDSS-IV with APOGEE, then there will be a mwmVisit file and often, but not always, a mwmStar file, as if it were observed by SDSS-V.
  • When Astra creates a mwmStar or mwmVisit file from SDSS-IV APOGEE DR17 spectra, the data arrays (flux, inverse variance) are copied exactly from the SDSS-IV apStar file. No re-sampling or interpolation is done on the data.
  • If a star was observed with APOGEE in SDSS-IV and with APOGEE in SDSS-V, then this is a special case. In these situations astronomers will find that the mwmVisit file includes spectra from both SDSS-IV and SDSS-V, and those spectra can be differentiated by the `apred` column. However, in the mwmStar file we only use the SDSS-V visits to produce a co-added spectrum: the SDSS-IV visits are not used in the co-add. This can be checked with the `in_stack` field in the mwmVisit file: if the star was observed with APOGEE in SDSS-IV and SDSS-V then the `in_stack` column will be `False` for all SDSS-IV observations. We do not currently co-add spectra between SDSS-IV and SDSS-V because they were reduced differently, and we have not fully quantified potential differences in the reduced spectra. This is in line with the decision not to co-add APOGEE spectra from different telescopes. Expert users should know that these decisions will be reconsidered for Data Release 20: analyses should not hard-code this assumption.
  • Keywords from the SDSS-IV APOGEE DR17 files are homogenised to the SDSS-V keywords and the Astra glossary. For example, what was called V_HELIO in DR17 will appear as V_RAD in the SDSS-V files. Only a very small number of these translations were made, in order to include the SDSS-IV APOGEE DR17 spectra in Milky Way Mapper as seamlessly as possible.
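As a minimal sketch of the selections described in this list, the snippet below separates visits by `apred` and reads the `in_stack` column instead of assuming which visits enter the co-add. The column names `apred`, `in_stack`, and `v_rad` follow the text above. The helper functions are hypothetical, the rows and their values are invented for illustration, and in practice the rows would come from a mwmVisit or summary table rather than a list of dictionaries.

```python
def split_by_reduction(visits):
    """Split visit rows into SDSS-IV (apred == 'dr17') and all other rows."""
    sdss4 = [v for v in visits if str(v["apred"]).strip() == "dr17"]
    sdss5 = [v for v in visits if str(v["apred"]).strip() != "dr17"]
    return sdss4, sdss5


def stacked_visits(visits):
    """Return the visits that went into the co-add, as recorded by `in_stack`."""
    return [v for v in visits if v["in_stack"]]


# Toy rows for one star observed in both SDSS-IV and SDSS-V (values invented).
visits = [
    {"apred": "dr17", "in_stack": False, "v_rad": 12.1},
    {"apred": "dr17", "in_stack": False, "v_rad": 12.4},
    {"apred": "1.4", "in_stack": True, "v_rad": 12.3},
]

sdss4, sdss5 = split_by_reduction(visits)
print(len(sdss4), "SDSS-IV visits;", len(sdss5), "SDSS-V visits")
print(len(stacked_visits(visits)), "visit(s) used in the co-add")
```

Filtering on `in_stack` rather than on `apred` keeps an analysis valid if the co-add policy changes in a later release.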