Accessing Astra Files

Pipeline data products whose names start with “astra” contain the outputs of the Astra pipelines (e.g. ASPCAP, SnowWhite). There are several ways to access them.

Pipeline summary data products

Pipeline summary data products contain the parameters derived by the pipelines, such as Teff, log g, [M/H], and abundance ratios (if any). They also include targeting and radial velocity (RV) information.

To browse and download the summary FITS files, use the Science Archive Server (SAS).

SQL

All Astra summary data products are hosted in a SQL database, along with all other SDSS data, and can be queried through a web interface called CasJobs. Each Astra summary data product is provided as a table. SQL joins to other SDSS tables are possible using columns like sdss_id and gaia_dr3_source_id. The schema browser lists all available tables and describes the columns in each. Below is an example query that selects [Fe/H] and [Mg/H] for all targets in a carton, where the carton is selected by joining to the targeting tables.

SELECT lite.sdss_id, lite.fe_h, lite.mg_h
FROM lite_all_star lite
JOIN mos_sdss_id_stacked stacked ON stacked.sdss_id = lite.sdss_id
JOIN mos_target targ ON targ.catalogid = stacked.catalogid25
JOIN mos_carton_to_target c2t ON c2t.target_pk = targ.target_pk
JOIN mos_carton cart ON cart.carton_pk = c2t.carton_pk
WHERE cart.carton = 'mwm_snc_100pc_apogee';

Further examples of SQL queries using CasJobs are available.

Pipeline spectrum data products

The amount of data in the Astra pipeline spectrum data products, including co-added spectra and best-fit model spectra, is considerably larger than in the summary files. These data are not loaded into CasJobs. We anticipate that most users will want to look at spectrum data products for only the stars they are most interested in. Below we describe how to access the pipeline spectrum data products on the SAS.

Spectrum access tutorials

Tutorials showing how to access BOSS and APOGEE spectra with the Python package `sdss_access` are available.

Path conventions

Astra data products are stored in the folder defined by the $MWM_ASTRA environment variable. All Astra data products are defined in the `sdss/tree` product and can be accessed using the `sdss_access` Python product.

The fields in every Astra output file can be directly used to get the path of the spectrum analyzed.

Storing thousands of files in a single folder can degrade disk performance. For this reason, Astra uses the last four (zero-padded) digits of `sdss_id` to limit the number of files stored in a single folder. For example, if a star has an `sdss_id` of 1234567890, the last four digits, 7890, set the folders `78/90`. For Data Release 19, where the Astra version is 0.6.0, the `mwmStar` file will be found at:

`$MWM_ASTRA/0.6.0/spectra/star/78/90/mwmStar-0.6.0-1234567890.fits`
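The convention above can be sketched in a few lines of Python. This is a minimal illustration rather than the `sdss_access` implementation: the function names `sdss_id_folders` and `mwm_star_path` are hypothetical, and the path template is simply copied from the example above.

```python
import os


def sdss_id_folders(sdss_id: int) -> tuple:
    """Return the (XX, YY) folder names from the last four zero-padded digits."""
    last_four = f"{sdss_id % 10000:04d}"  # e.g. 1234567890 -> "7890", 42 -> "0042"
    return last_four[:2], last_four[2:]


def mwm_star_path(sdss_id: int, v_astra: str = "0.6.0", root: str = "$MWM_ASTRA") -> str:
    """Build the mwmStar path (hypothetical helper; template taken from the text)."""
    xx, yy = sdss_id_folders(sdss_id)
    return f"{root}/{v_astra}/spectra/star/{xx}/{yy}/mwmStar-{v_astra}-{sdss_id}.fits"


path = mwm_star_path(1234567890)
print(path)
# $MWM_ASTRA/0.6.0/spectra/star/78/90/mwmStar-0.6.0-1234567890.fits

# Expands $MWM_ASTRA if the environment variable is set; otherwise leaves it as-is.
print(os.path.expandvars(path))
```

An `sdss_id` with fewer than four digits is zero-padded, so in this sketch an `sdss_id` of 42 maps to the folders `00/42`.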

The earnest reader might rightly wonder why we didn’t use the first four digits of `sdss_id` instead of the last four digits. Due to technical reasons of how `sdss_id` values are assigned, and the optimization process that decides which stars are observed, the distribution of stars observed by the first four digits is non-uniform. This would lead to some `XX/YY` folders having many tens of thousands of files, and many `XX/YY` folders having none. In short, in practice it would not have effectively limited the maximum number of files per folder.
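A toy example makes the difference concrete. The identifiers below are a made-up contiguous block of ten-digit numbers, not real `sdss_id` values, and real assignments are more complicated; the sketch only shows how leading digits can pile up in one folder while trailing digits spread evenly.

```python
from collections import Counter

# Hypothetical contiguous block of 50,000 identifiers (NOT real sdss_id values).
ids = range(1_234_500_000, 1_234_550_000)

# Bucket by the first four digits versus the last four (zero-padded) digits.
first_four = Counter(str(i)[:4] for i in ids)
last_four = Counter(f"{i % 10000:04d}" for i in ids)

print(len(first_four), max(first_four.values()))  # 1 50000
print(len(last_four), max(last_four.values()))    # 10000 5
```

In this toy case every identifier shares the same first four digits, so all 50,000 files would land in a single `XX/YY` folder, whereas the last four digits spread them over 10,000 folders with five files each.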

To find the corresponding `mwmVisit` file, or the ASPCAP pipeline results for this star, we use the same convention. The `mwmVisit` file will be in `spectra/visit`:

`$MWM_ASTRA/0.6.0/spectra/visit/78/90/mwmVisit-0.6.0-1234567890.fits`

The pipeline outputs for analyses of the co-added spectra are stored in `results/star/`, and the pipeline outputs for analyses of the visit spectra are stored in `results/visit/`, where the sub-folders follow the same `sdss_id` convention:

`$MWM_ASTRA/0.6.0/results/star/78/90/astraStarASPCAP-0.6.0-1234567890.fits`

`$MWM_ASTRA/0.6.0/results/visit/78/90/astraVisitASPCAP-0.6.0-1234567890.fits`
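The spectrum and result paths shown so far all follow one pattern, which the sketch below generalizes. The helper name `astra_product_path` and its parameters are hypothetical, and the templates are inferred only from the example paths on this page; for real work the `sdss_access` package remains the supported way to resolve paths.

```python
def astra_product_path(sdss_id, level="star", pipeline=None,
                       v_astra="0.6.0", root="$MWM_ASTRA"):
    """Build a spectrum or pipeline-result path (hypothetical helper).

    level:    "star" (co-added spectra) or "visit" (visit spectra)
    pipeline: None for mwmStar/mwmVisit files, or a pipeline name such as "ASPCAP"
    """
    if level not in ("star", "visit"):
        raise ValueError("level must be 'star' or 'visit'")

    # Folders come from the last four zero-padded digits of sdss_id.
    digits = f"{sdss_id % 10000:04d}"
    xx, yy = digits[:2], digits[2:]

    if pipeline is None:
        top, prefix = "spectra", f"mwm{level.capitalize()}"
    else:
        top, prefix = "results", f"astra{level.capitalize()}{pipeline}"

    return f"{root}/{v_astra}/{top}/{level}/{xx}/{yy}/{prefix}-{v_astra}-{sdss_id}.fits"


print(astra_product_path(1234567890, level="visit"))
# $MWM_ASTRA/0.6.0/spectra/visit/78/90/mwmVisit-0.6.0-1234567890.fits
print(astra_product_path(1234567890, level="star", pipeline="ASPCAP"))
# $MWM_ASTRA/0.6.0/results/star/78/90/astraStarASPCAP-0.6.0-1234567890.fits
```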

The disk structure might look something like this:

0.6.0/                  Astra version for Data Release 19
    aux/                Auxiliary data products
    pipelines/          Intermediate pipeline outputs (for experts)
    spectra/            Astra-created spectrum data products
        star/           `mwmStar` files stored here
            XX/
                YY/
        visit/          `mwmVisit` files stored here
            XX/
                YY/
    results/            Spectrum-level files from pipelines
        star/           Pipeline spectrum-level files for stacked spectra (e.g., `astraStarASPCAP`)
            XX/
                YY/
        visit/          Pipeline spectrum-level files for visit spectra (e.g., `astraVisitCORV`)
            XX/
                YY/
    summary/            Summary output files (e.g., `mwmTargets`, `mwmAllStar`, `astraAllStarASPCAP`)

In the layout above, `XX` always refers to the fourth- and third-to-last digits of `sdss_id`, and `YY` always refers to the last two digits of `sdss_id`.