Bulk Data Downloads

All data can be downloaded directly from data.sdss.org using the rsync or wget commands. Access is also available via Globus Online. The Data Model page has a description of the directory structure and file formats. Note that the total SDSS data volume is > 125 TB; see the data volume table. If you need a substantial fraction of that data (>1 TB), please contact the helpdesk to arrange a custom data transfer. This will be faster for you and easier on our servers.

To learn how to download MaNGA data cubes, see the MaNGA data access page.

NOTE: all rsync commands on this page have --dry-run added to them, and all wget commands have --spider added to them. You have to remove those command line arguments for these commands to actually download data. wget commands use the same URL as you would in a web browser, e.g.,

wget --spider https://data.sdss.org/sas/dr14/eboss/spectro/redux/platelist.fits

or, for rsync, replace https:// with rsync:// and drop the “sas” from the URL, e.g.,

rsync --dry-run -lv rsync://data.sdss.org/dr14/eboss/spectro/redux/platelist.fits .
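As a rough illustration of the note above, the following Python sketch removes the --dry-run and --spider flags from a command copied from this page, so that the resulting command would actually transfer data. The helper name enable_download is hypothetical, and the sketch assumes the command has been joined onto a single line (no backslash continuations).

```python
import shlex

# Flags that this page adds to every command so that nothing is downloaded.
SAFETY_FLAGS = {"--dry-run", "--spider"}


def enable_download(command):
    """Return the command with the safety flags removed (hypothetical helper).

    Assumes `command` is a single-line shell command.
    """
    args = shlex.split(command)
    kept = [arg for arg in args if arg not in SAFETY_FLAGS]
    return shlex.join(kept)


print(enable_download(
    "wget --spider https://data.sdss.org/sas/dr14/eboss/spectro/redux/platelist.fits"
))
print(enable_download(
    "rsync --dry-run -lv rsync://data.sdss.org/dr14/eboss/spectro/redux/platelist.fits ."
))
```

Running a command with the flags still in place first is a cheap way to check that the URL and filters are right before committing to a large transfer.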

If you are having any difficulty with rsync URLs, check the notes below. The number of rsync connections is throttled but the number of wget connections is not. Thus it is recommended to use wget to initially fetch the data, and use rsync only to confirm that the data you have is correct and complete. The SAS website data.sdss.org/sas/dr14 (US Mountain) is completely mirrored at mirror.sdss.org/sas/dr14 (US Pacific). If you have difficulty connecting to data.sdss.org, try mirror.sdss.org instead by using an analogous command, e.g.,

wget --spider https://mirror.sdss.org/sas/dr14/eboss/spectro/redux/platelist.fits

or

rsync --dry-run -lv rsync://mirror.sdss.org/dr14/eboss/spectro/redux/platelist.fits .
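If you script your downloads, the switch to the mirror can be automated. The sketch below only rewrites the host name; the function name to_mirror is hypothetical, and it assumes the mirror uses exactly the same paths as data.sdss.org, as described above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_mirror(url):
    """Swap data.sdss.org for mirror.sdss.org, leaving the rest of the URL alone.

    Works for both https:// and rsync:// URLs. Any URL whose host is not
    data.sdss.org is returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc != "data.sdss.org":
        return url
    return urlunsplit(parts._replace(netloc="mirror.sdss.org"))


print(to_mirror("https://data.sdss.org/sas/dr14/eboss/spectro/redux/platelist.fits"))
print(to_mirror("rsync://data.sdss.org/dr14/eboss/spectro/redux/platelist.fits"))
```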

Also check the status page for outage announcements.

Globus Online

SDSS data are also available via Globus Online using the endpoint sdss#public (US Mountain). For large transfers, Globus is significantly faster and more robust than using wget or rsync. Globus Online requires a separate account, but once that is set up Globus offers a “fire-and-forget” transfer that automatically optimizes transfer settings, retries any failures, and emails you when your transfer is done. The Globus Connect tool allows you to use Globus to download data to your laptop or other computers which are not permanent Globus endpoint servers.

APOGEE Catalog Data

Catalogs of parameters derived from the APOGEE infrared spectra and matched are documented on the spectra data page. These can be directly downloaded from the links on that page, or via wget commands. For example, to download the stellar parameters for all APOGEE star spectra:

wget --spider https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/allStar-l31c.2.fits

To download the catalog information for each APOGEE visit spectrum:

wget --spider https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/allVisit-l31c.2.fits

APOGEE Spectra Per-Star Files

The combined spectra for each star can be found in the apStar files. In the path to this file, APRED_VERS refers to the reduction version used to extract the spectrum for each visit and APSTAR_VERS refers to the version of the combination of the spectra into a single spectrum.

There is a large directory of location IDs, each of which corresponds to a particular line of sight in the survey. Within each of those directories, the spectra are organized by their APOGEE ID. For example, one of these files may be downloaded as follows:

wget --spider https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/stars/apo25m/4289/apStar-r8-2M05370702+6137006.fits

In this case, APRED_VERS is r8 and APSTAR_VERS is stars. Observations taken during commissioning were not combined with observations taken after commissioning, so stellar spectra from commissioning data are stored in files with the name apStarC. For example:

wget --spider https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/stars/apo25m/4105/apStarC-r8-2M16380721+3638388.fits

Note that there can be both apStar and apStarC results for the same star, if it was observed both before and after commissioning.

To download these spectra in bulk, you can generate a list of spectra you wish to download in a text file where each line looks like “[LOCATIONID]/[FILENAME]”, for example:

4289/apStar-r8-2M05370702+6137006.fits

Then use wget:

wget --spider -nv -r -nH --cut-dirs=8 \
    -i speclist.txt \
    -B https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/stars/apo25m/
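The list file can be generated rather than typed by hand. The following is a minimal sketch that builds “[LOCATIONID]/[FILENAME]” lines following the apStar and apStarC file names shown above; the function name apstar_relpath and the way targets are stored are assumptions made for illustration only.

```python
def apstar_relpath(location_id, apogee_id, apred_vers="r8", commissioning=False):
    """Build one "[LOCATIONID]/[FILENAME]" line for the wget list file.

    The naming pattern is taken from the example files on this page;
    commissioning spectra use the apStarC prefix instead of apStar.
    """
    prefix = "apStarC" if commissioning else "apStar"
    return f"{location_id}/{prefix}-{apred_vers}-{apogee_id}.fits"


# (location ID, APOGEE ID, commissioning?) for the two example stars above.
targets = [
    (4289, "2M05370702+6137006", False),
    (4105, "2M16380721+3638388", True),
]

lines = [apstar_relpath(loc, star, commissioning=comm) for loc, star, comm in targets]
print("\n".join(lines))
```

Writing the joined lines to speclist.txt gives a file that can be passed to wget with -i.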

To download all of the apStar files (about 505 GB total), it is best to use rsync:

rsync --dry-run -aLvz --include "[0-9][0-9][0-9][0-9]/" \
    --include "apStar-*[0-9][0-9][0-9][0-9][0-9][0-9][0-9].fits" --exclude "*" \
    --prune-empty-dirs --progress \
    rsync://data.sdss.org/dr14/apogee/spectro/redux/r8/stars/apo25m/ stars/

The majority of the stars have stellar parameters determined, with corresponding best-fit, pseudo-continuum-normalized spectra. The combined spectra for each star, along with the ASPCAP fits, can be found in the aspcapStar files. As with the apStar files, there is a large directory of location IDs containing the resulting files. For example, one of these files may be downloaded as follows:

wget --spider https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/stars/l31c/l31c.2/4289/aspcapStar-r8-l31c.2-2M05370702+6137006.fits

To download these spectra in bulk, you can generate a list of spectra you wish to download in a text file where each line looks like “[LOCATIONID]/[FILENAME]”, for example:

4289/aspcapStar-r8-l31c.2-2M05370702+6137006.fits

Then use wget:

wget --spider -nv -r -nH --cut-dirs=9 \
    -i speclist.txt \
    -B https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/stars/l31c/l31c.2/
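The --cut-dirs value in the command above follows from the base URL. With -nH dropping the host name, cutting as many directory components as the base URL path contains leaves only [LOCATIONID]/[FILENAME] on local disk. A small sketch of that count, using the hypothetical helper name cut_dirs_for and assuming the base URL points at a directory:

```python
from urllib.parse import urlsplit


def cut_dirs_for(base_url):
    """Count the directory components in the path of a base URL.

    Empty components (from leading or trailing slashes) are ignored, so the
    result is the same with or without a trailing slash.
    """
    path = urlsplit(base_url).path
    return len([part for part in path.split("/") if part])


base = "https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/stars/l31c/l31c.2/"
print(cut_dirs_for(base))  # 9
```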

To download all of the aspcapStar files (about 37 GB total), it is best to use rsync:

rsync --dry-run -aLvz --include "[0-9][0-9][0-9][0-9]/" \
    --include "aspcapStar*.fits" --exclude "*"\
    --prune-empty-dirs --progress \
    rsync://data.sdss.org/dr14/apogee/spectro/redux/r8/stars/l31c/l31c.2/ l31c.2/

New for Data Release 14, red giant stars have stellar abundances determined by The Cannon; these results can be found in the cannonStar files. For example, one of these files may be downloaded as follows:

wget --spider https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/stars/l31c/l31c.2/cannon/4289/cannonStar-r8-l31c.2-2M05370702+6137006-xh-censor.fits

To download these spectra in bulk, you can generate a list of spectra you wish to download in a text file where each line looks like “[LOCATIONID]/[FILENAME]”, for example:

4289/cannonStar-r8-l31c.2-2M05370702+6137006-xh-censor.fits

Then use wget:

wget --spider -nv -r -nH --cut-dirs=9 \
    -i speclist.txt \
    -B https://data.sdss.org/sas/dr14/apogee/spectro/redux/r8/stars/l31c/l31c.2/cannon/

To download all of the cannonStar files (about 25 GB total), it is best to use rsync:

rsync --dry-run -aLvz --include "[0-9][0-9][0-9][0-9]/" \
    --include "cannonStar*.fits" --exclude "*"\
    --prune-empty-dirs --progress \
    rsync://data.sdss.org/dr14/apogee/spectro/redux/r8/stars/l31c/l31c.2/cannon/ cannon/

Optical Spectra Versions

The SDSS optical spectra are split into several versions:

Optical Spectra Catalog Data

Catalogs of parameters derived from the SDSS/BOSS/SEQUELS/eBOSS optical spectra and matched to photometric data are documented on the optical spectra data page. These can be directly downloaded from the links on that page, or via wget commands. For example, to download the redshifts and classifications of all SDSS spectra (4.5 GB):

wget --spider https://data.sdss.org/sas/dr14/sdss/spectro/redux/specObj-dr14.fits

Or to get the associated position-based photometric matches (12 GB):

wget --spider https://data.sdss.org/sas/dr14/sdss/spectro/redux/photoPosPlate-dr14.fits

The stellar parameter (SSPP) results can be downloaded similarly (1.8 GB):

  wget --spider https://data.sdss.org/sas/dr14/sdss/sspp/ssppOut-dr12.fits

Note: This is unchanged since DR12, thus ssppOut-dr12.fits appears in both the dr14 and dr12 directories.

Optical Spectra Per-Object Files

If you want a subset of the spectra, the most convenient form may be the spec files with one file per PLATE-MJD-FIBER containing the coadded spectrum, the redshift and classification fits, spectral line fits, and optionally the individual exposures which contributed to the coadd. These are located at:

Beneath each of those directories, the spectra are organized by plate in the form

PLATE/spec-PLATE-MJD-FIBER.fits

e.g.,

  3586/spec-3586-55181-0016.fits
  3609/spec-3609-55201-0646.fits
  3661/spec-3661-55614-0020.fits
  ...

To download these spectra in bulk, generate a list of spectra you wish to download in a text file of that format and then use wget:

  wget --spider -nv -r -nH --cut-dirs=7 \
      -i speclist.txt \
      -B https://data.sdss.org/sas/dr14/eboss/spectro/redux/v5_10_0/spectra/
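If you already have PLATE, MJD, and FIBER values in hand (for example from a catalog query), the list entries can be built programmatically. This is a sketch only: the name spec_relpath is hypothetical, and zero-padding the plate number to four digits is an assumption, since every example above already has a four-digit plate. The four-digit fiber padding follows the examples.

```python
def spec_relpath(plate, mjd, fiber):
    """Build one "PLATE/spec-PLATE-MJD-FIBER.fits" line for the wget list file.

    Assumption: plates are zero-padded to four digits. Fibers are padded to
    four digits as in the examples on this page.
    """
    return f"{plate:04d}/spec-{plate:04d}-{mjd}-{fiber:04d}.fits"


# (plate, MJD, fiber) triples for the example files listed above.
spectra = [(3586, 55181, 16), (3609, 55201, 646), (3661, 55614, 20)]
lines = [spec_relpath(plate, mjd, fiber) for plate, mjd, fiber in spectra]
print("\n".join(lines))
```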

Optical Spectra Per-Object Lite Files

A “lite” version of the above files is also available in the “spectra/lite/PLATE/” subdirectories. These contain the same coadd and catalog information as the full spec files, but don’t include the individual exposures which contributed to the coadd. For example, to download the “lite” version of the above QSO files (~42 GB instead of ~250 GB):

  wget --spider -nv -r -nH --cut-dirs=8 \
    -i speclist.txt \
    -B https://data.sdss.org/sas/dr14/eboss/spectro/redux/v5_10_0/spectra/lite/

Spectra per-Plate Files

The spectra are also available grouped by plate, with all 640 (SDSS) or 1000 (BOSS) spectra in a single file. These are the original outputs of the spectroscopic pipeline and are itemized on the spectro pipeline page, including where they are in the SAS directory structure. The primary files are:

File      Description
spPlate   Coadded spectra
spCFrame  Individual exposure spectra
spZbest   Redshifts and classifications
spZall    Redshifts and classifications including second, third, etc. best fits
spZline   Spectral line fits

To download all the spPlate files (about 344 GB total) for eBOSS, BOSS and SEQUELS:

  rsync --dry-run -aLvz --include "????/" --include "spPlate*.fits" \
    --exclude "*" --exclude "spectra/*" \
    --prune-empty-dirs --progress \
    rsync://data.sdss.org/dr14/eboss/spectro/redux/v5_10_0/ v5_10_0/

Or for spPlate, spZall, spZbest, spZline:

  rsync --dry-run -aLvz --include "????/" \
    --include "spPlate*.fits" --include "spZ*.fits" \
    --exclude "*" --exclude "spectra/*" \
    --prune-empty-dirs --progress \
    rsync://data.sdss.org/dr14/eboss/spectro/redux/v5_10_0/ v5_10_0/

A version of the above command specific to SEGUE-2:

  rsync --dry-run -aLvz --include "????/" --include "spPlate*.fits" --exclude "*" \
    --prune-empty-dirs --progress \
    rsync://data.sdss.org/dr14/sdss/spectro/redux/104/segue2/ segue2/

This command downloads the spectroscopic parameters by plate. If you need stellar parameter data, use the following instead:

  rsync --dry-run -aLvz --include "????/" --include "output/" \
    --include "param/" --include "ssppOut*.fit" \
    --include "ssppOut.lineindex*.fit" --exclude "*" \
    --prune-empty-dirs --progress \
    rsync://data.sdss.org/dr14/sdss/sspp/122/ .

Imaging Data

Images and derived catalog data are described on the imaging data page. You can use a SkyServer search or the window_flist.fits file to identify which RERUN-RUN-CAMCOL-FIELD overlaps your region of interest. Then download the matching calibObj files (catalog data) or frame files (calibrated imaging data), e.g., for RERUN 301, RUN 2505, CAMCOL 3, FIELD 38, the r-band image is:

wget --spider https://data.sdss.org/sas/dr14/eboss/photoObj/frames/301/2505/3/frame-r-002505-3-0038.fits.bz2

and the associated catalog of identified galaxies for that patch of sky is:

wget --spider https://data.sdss.org/sas/dr14/eboss/sweeps/dr13_final/301/calibObj-002505-3-gal.fits.gz
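The two URLs above follow a regular pattern, so they can be assembled from the RERUN, RUN, CAMCOL, and FIELD values. The sketch below infers the zero-padding (six digits for the run, four for the field) from these two examples only; the function names frame_url and calibobj_url are hypothetical, and object types other than "gal" are not covered by this page.

```python
FRAMES_BASE = "https://data.sdss.org/sas/dr14/eboss/photoObj/frames"
SWEEPS_BASE = "https://data.sdss.org/sas/dr14/eboss/sweeps/dr13_final"


def frame_url(rerun, run, camcol, field, band):
    """URL of a calibrated frame file, following the example on this page."""
    filename = f"frame-{band}-{run:06d}-{camcol}-{field:04d}.fits.bz2"
    return f"{FRAMES_BASE}/{rerun}/{run}/{camcol}/{filename}"


def calibobj_url(rerun, run, camcol, obj_type="gal"):
    """URL of a calibObj catalog file; only obj_type "gal" appears on this page."""
    filename = f"calibObj-{run:06d}-{camcol}-{obj_type}.fits.gz"
    return f"{SWEEPS_BASE}/{rerun}/{filename}"


print(frame_url(301, 2505, 3, 38, "r"))
print(calibobj_url(301, 2505, 3))
```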

Interferometry Data

The MARVELS data comprise less than 1 TB, and can simply be downloaded recursively from these directories:

For additional assistance with MARVELS data contact the help desk.

Notes on using rsync

Remember, to convert an https (or http) URL to an rsync URL you must:

  1. Replace https:// (or http://) with rsync://.
  2. Remove the /sas path component, so that /sas/dr14/ becomes /dr14/.

Here’s a Python function that accomplishes these steps:

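import re

def http2rsync(url):
    """Convert a valid SDSS http(s) URL to the rsync equivalent.

    A URL that does not match the expected pattern is returned unchanged.
    """
    # Capture the host (data or mirror), the data release number, and the
    # remaining path, then rebuild the URL without the /sas/ component.
    return re.sub(r'https?://(data|mirror)\.sdss\.org/sas/dr([0-9]+)/(.*)$',
                  r'rsync://\1.sdss.org/dr\2/\3', url)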

And here’s the equivalent in IDL:

FUNCTION http2rsync, url
    parts = STREGEX(url,'https?://(data|mirror)\.sdss\.org/sas/dr([0-9]+)/(.*)$',/EXTRACT,/SUBEXPR)
    RETURN, 'rsync://'+parts[1]+'.sdss.org/dr'+parts[2]+'/'+parts[3]
END
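Going the other way can be handy when a file found via rsync needs to be opened in a web browser or fetched with wget. The following Python sketch reverses the two steps listed above; the name rsync2http is hypothetical and not part of any SDSS tool, and the sketch assumes the https scheme for the result.

```python
import re


def rsync2http(url):
    """Convert an SDSS rsync URL to the equivalent https URL.

    Reverses the two conversion steps: restore the /sas/ path component and
    replace rsync:// with https://. A URL that does not match the expected
    pattern is returned unchanged.
    """
    return re.sub(r'^rsync://(data|mirror)\.sdss\.org/dr([0-9]+)/(.*)$',
                  r'https://\1.sdss.org/sas/dr\2/\3', url)


print(rsync2http("rsync://data.sdss.org/dr14/eboss/spectro/redux/platelist.fits"))
```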