FITS Files

Most numerical SDSS data are stored as FITS files. These files can contain both images and binary data tables in a well-defined format. FITS files can be read and written in many programming languages; the two most commonly used within SDSS are IDL and Python.
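To make "well-defined format" concrete, here is a minimal sketch that parses a FITS primary header using only the Python standard library. It relies on two facts from the FITS standard: files are organized in 2880-byte blocks, and each header block holds 80-character ASCII cards terminated by an END card. The helper names read_primary_header and make_card are invented for this illustration, and the parser handles only simple cards; real work should use one of the libraries described below.

```python
import io

CARD = 80      # bytes per header card
BLOCK = 2880   # FITS files are organized in 2880-byte blocks


def read_primary_header(stream):
    """Return {keyword: raw value string} for the first header of a FITS stream."""
    header = {}
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            raise ValueError("truncated FITS header")
        for i in range(0, BLOCK, CARD):
            card = block[i:i + CARD].decode("ascii")
            keyword = card[:8].strip()
            if keyword == "END":
                return header
            if card[8:10] == "= ":
                # Drop any trailing "/ comment"; adequate for simple cards only.
                header[keyword] = card[10:].split("/")[0].strip()


def make_card(keyword, value):
    """Build one 80-character card for the demo header below (hypothetical helper)."""
    return f"{keyword:<8}= {value:>20}".ljust(CARD)


# A minimal, hand-built primary header with no data, padded to one block.
cards = [
    make_card("SIMPLE", "T"),
    make_card("BITPIX", "8"),
    make_card("NAXIS", "0"),
    "END".ljust(CARD),
]
raw = "".join(cards).ljust(BLOCK).encode("ascii")

header = read_primary_header(io.BytesIO(raw))
```

The same function can be pointed at a real file opened in binary mode, although quoted string values that contain a slash would need more careful parsing than this sketch provides.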

IDL

The Goddard utilities contain tools for reading and writing FITS files. The most commonly used functions are MRDFITS and MWRFITS. The Goddard utilities are included in the idlutils package, which also contains additional programs for manipulating FITS files.

Python

The astropy.io.fits package handles the reading and writing of FITS files in Python. Because of the general usefulness of the astropy package, this is the recommended Python reader for most FITS files.

Another package is fitsio, developed by Erin Sheldon, which is a Python wrapper around the CFITSIO library. It allows direct access to individual columns of a FITS binary table, which can be useful for reading large FITS files, as detailed below. However, fitsio requires that input files adhere rather strictly to the FITS standard. The package is available from the Python Package Index under the name fitsio.

Large FITS Files

FITS files larger than about 2 GB can be more challenging to read. One such file is the spAll file. The simplest way to read large FITS files is to use the fitsio Python module described above, which can read just the columns you need rather than the whole table:

import fitsio

# Load only the listed columns instead of the whole table
columns = ['PLATE', 'MJD', 'FIBERID', 'Z', 'ZWARNING', 'Z_ERR']
d = fitsio.read('spAll-v5_10_0.fits', columns=columns)

The astropy.io.fits module has more stringent hardware requirements, because it must read the whole file before the data can be used. On a 64-bit machine with more than 4 GB of memory, it is possible to use the memmap option:

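from astropy.io import fits

# memmap=True maps the file into memory rather than loading it all at once
fx = fits.open('spAll-v5_10_0.fits', memmap=True)
d = fx[1].data  # HDU 1 is the first extension, which holds the table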

In IDL, the routine HOGG_MRDFITS() is available as part of the idlutils package. Like fitsio, it lets one specify a subset of columns to read. It avoids exhausting memory by reading a subset of the rows of a FITS file at a time, extracting the requested columns, and then moving on to the next subset of rows.
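The same read-a-block-of-rows strategy can be sketched in Python. This is not a port of HOGG_MRDFITS(); it is only an illustration built on astropy.io.fits, and the helper name read_columns_in_chunks, the chunk size, and the small demo table are all assumptions made for the example.

```python
import os
import tempfile

import numpy as np
from astropy.io import fits


def read_columns_in_chunks(path, columns, chunk_rows=100_000, ext=1):
    """Hypothetical helper: read selected columns one block of rows at a time.

    Assumes the table in extension `ext` has at least one row.
    """
    pieces = {name: [] for name in columns}
    with fits.open(path, memmap=True) as hdul:
        data = hdul[ext].data
        nrows = len(data)
        for start in range(0, nrows, chunk_rows):
            block = data[start:start + chunk_rows]
            for name in columns:
                # np.array(...) copies the values, so only the requested
                # columns are kept in memory once the block is discarded.
                pieces[name].append(np.array(block[name]))
    return {name: np.concatenate(parts) for name, parts in pieces.items()}


# Demo on a small, made-up table so the sketch runs end to end.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_table.fits")
cols = [
    fits.Column(name="PLATE", format="J", array=np.arange(10, dtype=np.int32)),
    fits.Column(name="Z", format="E",
                array=np.linspace(0.0, 0.9, 10, dtype=np.float32)),
    fits.Column(name="ZWARNING", format="J", array=np.zeros(10, dtype=np.int32)),
]
fits.BinTableHDU.from_columns(cols).writeto(demo_path, overwrite=True)

result = read_columns_in_chunks(demo_path, ["PLATE", "Z"], chunk_rows=4)
```

For a file such as spAll, one would pass the real file name and column list; the appropriate chunk size depends on the row width and the memory available.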