# Data Model

The primary MaNGA data products are composed of 3-D calibrated data cubes produced by the DRP and 2-D maps of derived quantities produced by the DAP from those cubes. The 3-D data cubes are constructed from a few tens to a few thousands of individual spectra that have been combined onto a regular grid. The 2-D maps of derived quantities are constructed by analyzing individual or binned groups of spaxels and constructing maps of the quantities at the relevant on-sky location. This page explains the structure of the raw data (spectra dispersed onto individual CCDs), intermediate data processed through the 2-D (per-exposure) stage of the DRP, final data products processed through the 3-D (per plate) stage of the DRP, intermediate data products constructed during each stage of the DAP analysis, and the final maps and model cubes consolidated from these DAP analysis steps. Additionally, we describe the pre-imaging data drawn from the NSA reprocessing of the original SDSS imaging survey; this pre-imaging data is used for the astrometric alignment of the MaNGA spectral imaging.

The default storage format for all MaNGA data (images and cubes) is multi-extension FITS (gzipped to save space). The zeroth extension of such files is blank except for the global header. All extensions of such files are labeled with the EXTNAME header keyword so that they can be read by extension name instead of extension number.

## Raw Data

The raw data (identical between MaNGA and eBOSS) is obtained for each of the four cameras individually (b1,b2,r1,r2). Aside from some new keywords, the format is the same as for BOSS raw data. These keywords include:

• Dither position (MGDPOS, values of N,S,E,C). C means no dither; N,S,E are the 3 allowed dither values for the MaNGA dither triangle. (For more details see the MaNGA Observing Strategy paper, Law et al. 2015).
• Dither location (MGDRA, MGDDEC), i.e. offset of the pointing from the nominal center in arcsec of RA and DEC. These are the values actually used by the DRP in determining the astrometric solution.
• PLATETYP='MANGA' or 'APOGEE-2&MANGA', as appropriate (i.e., whether the plate contains only MaNGA holes, or both MaNGA and APOGEE holes for co-observing)
• SRVYMODE='MaNGA stare', 'MaNGA dither', or 'APOGEE-lead' as appropriate. Most galaxy plates are 'MaNGA dither', all-sky plates are 'MaNGA stare', and bright-time stellar library plates are 'APOGEE-lead'.
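
As a rough illustration of how the keywords above might be sanity-checked, the sketch below validates a header represented as a plain Python dictionary. The function names, the dictionary layout, and the sample values are assumptions made for this example; they are not part of the DRP.

```python
# Allowed values as listed above. Representing the header as a plain dict
# is an assumption for illustration; a real header would come from a FITS reader.
ALLOWED = {
    "MGDPOS": {"N", "S", "E", "C"},
    "PLATETYP": {"MANGA", "APOGEE-2&MANGA"},
    "SRVYMODE": {"MaNGA stare", "MaNGA dither", "APOGEE-lead"},
}


def check_raw_header(header):
    """Return a list of problems found in the MaNGA-specific keywords."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key not in header:
            problems.append(f"{key} missing")
        elif header[key] not in allowed:
            problems.append(f"{key}={header[key]!r} not in {sorted(allowed)}")
    # MGDRA and MGDDEC are pointing offsets in arcsec, so they should be numeric.
    for key in ("MGDRA", "MGDDEC"):
        if not isinstance(header.get(key), (int, float)):
            problems.append(f"{key} missing or not numeric")
    return problems


def is_dithered(header):
    """True when the exposure sits at one of the three dither-triangle positions."""
    return header["MGDPOS"] in {"N", "S", "E"}


# Hypothetical header values, invented for this example.
example_header = {
    "MGDPOS": "N",
    "MGDRA": 0.0,
    "MGDDEC": 1.2,
    "PLATETYP": "MANGA",
    "SRVYMODE": "MaNGA dither",
}
example_problems = check_raw_header(example_header)
```

An empty list from `check_raw_header` means every keyword holds one of the documented values.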

## 2-D Reduction Pipeline Output

Each exposure through the MaNGA instrument is processed separately through the 2-D Data Reduction Pipeline (DRP) up to and including flux calibration and combination of spectra from individual cameras across the dichroic break.

The structure of these files is loosely based on the BOSS spFrame type files.
That is, they are row-stacked spectra (RSS), two dimensional arrays in which each row corresponds to an individual one-dimensional spectrum.
The following files can (and should) be read using the routine ml_mgframeread.pro.

Note that NFIBER=709 for spectrograph 1 (cameras b1 & r1), and 714 for spectrograph 2 (cameras b2 and r2). CCDROW refers to the number of rows on a given CCD after processing through sdssproc.pro to remove the overscan regions from the raw data; this differs between cameras.
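
The camera-to-spectrograph relationship and the fiber counts quoted above can be captured in a small lookup. This is only a sketch; the helper names are hypothetical and do not correspond to any DRP routine.

```python
# Fiber counts per spectrograph, as quoted in the text above.
NFIBER_BY_SPECTROGRAPH = {1: 709, 2: 714}


def spectrograph_of(camera):
    """Map a camera name (b1, b2, r1, r2) to its spectrograph number."""
    if len(camera) != 2 or camera[0] not in "br" or camera[1] not in "12":
        raise ValueError(f"unknown camera name: {camera!r}")
    return int(camera[1])


def nfiber_for_camera(camera):
    """Number of rows (fibers) expected in a per-camera row-stacked file."""
    return NFIBER_BY_SPECTROGRAPH[spectrograph_of(camera)]
```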

List of files:

mgArc-[CAMERA]-[EXPNUM].fits.gz This is the extracted arc frame. It is almost exactly the same format as BOSS spArc files, with the exception of a blank extension 0 and extension names vs numbers.

mgFlat-[CAMERA]-[EXPNUM].fits.gz This is the extracted flatfield frame. It is almost exactly the same format as BOSS spFlat files, with the exception of a blank extension 0 and extension names vs numbers.

mgFrame-[CAMERA]-[EXPNUM].fits.gz The mgFrame files are the extracted fiber spectra for each camera for the science exposures.

mgSFrame-[CAMERA]-[EXPNUM].fits.gz The mgSFrame files are the per-camera science frames after extraction and sky subtraction have been applied. Note that the 'S' in the name means Sky-Subtracted.

cogimg-[MJD]-[EXPNUM].fits The cogimg files store stacked guider images corresponding to the time of observation of each exposure. One is written by coaddgimg.pro for each exposure.

fluxcal-[SPEC]-[EXPNUM].fits.gz The fluxcal files store intermediate products derived during the flux calibration procedure. One is written by mdrp_fluxcal.pro for each spectrograph and each exposure.

mgFFrame-[CAMERA]-[EXPNUM].fits.gz The mgFFrame files are the per-camera science frames after extraction, sky subtraction, and flux calibration. Note that the 'F' in the name means Flux-calibrated.

mgCFrame-[EXPNUM]-LOG.fits.gz The mgCFrame files are per-exposure combinations of the mgFFrame files, made by stitching the blue and red channels together across the dichroic break and adding rows for fibers from spectrograph 2 atop those from spectrograph 1 (i.e., in order of fiberid). All spectra in this file have been resampled to a common wavelength grid for ALL MaNGA observations using a cubic b-spline. The MaNGA common wavelength solution is set by ml_setwcalib.pro. There are two versions of this file; this one uses a logarithmic wavelength sampling from log10(lambda/Angstroms)=3.5589 to 4.0151 (NWAVE=4563 spectral elements). Note that the 'C' in the name means Calibrated and Camera-Combined on a Common wavelength grid.

mgCFrame-[EXPNUM]-LIN.fits.gz As for the LOG format file, but on a linear wavelength solution running from 3622.0 to 10353.0 Angstroms (NWAVE=6732 spectral elements).
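
The two common wavelength grids can be reproduced from the endpoints and element counts quoted above. The sketch below assumes uniform spacing with both endpoints included, which gives a step of 10^-4 in log10(lambda) for the LOG grid and 1 Angstrom for the LIN grid. The function names are illustrative and are not taken from ml_setwcalib.pro.

```python
def grid_step(start, end, nwave):
    """Spacing of a uniform grid of nwave points that includes both endpoints."""
    return (end - start) / (nwave - 1)


def log_wavelengths(log_start=3.5589, log_end=4.0151, nwave=4563):
    """Wavelengths in Angstroms, uniformly sampled in log10(lambda)."""
    step = grid_step(log_start, log_end, nwave)
    return [10.0 ** (log_start + i * step) for i in range(nwave)]


def lin_wavelengths(start=3622.0, end=10353.0, nwave=6732):
    """Wavelengths in Angstroms, uniformly sampled in lambda."""
    step = grid_step(start, end, nwave)
    return [start + i * step for i in range(nwave)]


log_grid = log_wavelengths()
lin_grid = lin_wavelengths()
```

For real analysis, the wavelength vector stored in each file remains the authoritative source; the reconstruction here is only meant to make the sampling explicit.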

## 3-D Reduction Pipeline Output

Once a given plate is complete (i.e., all of the required sets of dithers are collected), the 3D stage of the DRP extracts the relevant rows for each IFU from the mgCFrame files, computes the astrometric solution of each, and combines the exposures into row-stacked spectra and data cubes.

These output files follow the naming convention [plate]-[ifudesign], which uniquely identifies a given galaxy observation. Note, however, that if a galaxy were to be reobserved on a different plate it would have a different 'plateifu' identifier. 'plateifu' thus uniquely identifies a set of observations, while 'mangaid' uniquely identifies an astronomical target.
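
To make the distinction between 'plateifu' and 'mangaid' concrete, the following sketch splits a plateifu string into its two parts and groups observations by target. The identifiers in the example are invented and the helper names are hypothetical.

```python
from collections import defaultdict


def split_plateifu(plateifu):
    """Split '[plate]-[ifudesign]' into integer plate and ifudesign."""
    plate, ifudesign = plateifu.split("-")
    return int(plate), int(ifudesign)


def group_by_target(observations):
    """Group plateifu identifiers by mangaid.

    `observations` is an iterable of (plateifu, mangaid) pairs. A target
    that was reobserved on another plate appears with several plateifu values.
    """
    groups = defaultdict(list)
    for plateifu, mangaid in observations:
        groups[mangaid].append(plateifu)
    return dict(groups)


# Invented identifiers: target "A" was observed twice, target "B" once.
example_groups = group_by_target(
    [("1000-101", "A"), ("2000-202", "A"), ("1000-303", "B")]
)
```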

In this section NFIBER is taken to mean the number of fibers in a given IFU (e.g., 19, 37, 61, 91 or 127), and NEXP is the number of exposures.

There are two kinds of output files: row-stacked spectra (RSS) that contain each individual spectrum stacked atop each other into a 2d format, and a data cube that combines individual spectra together into a rectified 3d data cube. RSS files and cubes are provided for all MaNGA dark time targets except the spectrophotometric standard stars on 7-fiber minibundles.

### HDUCLASS

MaNGA adopts the HDUCLASS FITS header extension keyword structure (see https://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/ofwg_recomm/r8.html) to indicate the type of information contained in the science, error, and mask (i.e., data quality, or DQ) extensions (and the relationship between those extensions). We define HDUCLASS SDSS.

The science extension has HDUCLAS1=IMAGE (for RSS files) or CUBE (for data cubes) and HDUCLAS2=DATA. The ERRDATA and QUALDATA keywords in this extension header point to the error and DQ extensions.

The error extension has HDUCLAS1=IMAGE or CUBE and HDUCLAS2=ERROR. Valid HDUCLAS3 entries are MSE, RMSE, INVMSE, INVRMSE. MaNGA uses INVMSE (i.e., we provide inverse variance). The SCIDATA and QUALDATA keywords in this header point to the science and DQ extensions.

The DQ extension has HDUCLAS1=IMAGE or CUBE and HDUCLAS2=QUALITY. MaNGA uses HDUCLAS3=FLAG64BIT, indicating that the DQ extension should be interpreted as a bitmask with up to 64 independent bits available. The SCIDATA and ERRDATA keywords in this header point to the science and error extensions.

In all cases, the dimensionality of the ERROR and QUALITY extensions matches that of the DATA extension.
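
Because the error extension stores inverse variance and the DQ extension stores a bitmask with up to 64 bits, two small conversions come up often. The sketch below shows them on scalar values. Treating an inverse variance of zero as "no information" is an assumption of this example, and the bit numbers are arbitrary; the meaning of each bit has to be taken from the relevant maskbit definitions.

```python
import math


def sigma_from_ivar(ivar):
    """Convert an inverse variance to a 1-sigma uncertainty.

    Assumption: ivar <= 0 marks a pixel with no usable information,
    so an infinite uncertainty is returned for it.
    """
    return 1.0 / math.sqrt(ivar) if ivar > 0 else math.inf


def bit_is_set(mask_value, bit):
    """True if the given bit (0 to 63) is set in a DQ mask value."""
    if not 0 <= bit < 64:
        raise ValueError("bit must be between 0 and 63")
    return bool((mask_value >> bit) & 1)


def set_bits(mask_value):
    """List every bit number that is set in a DQ mask value."""
    return [bit for bit in range(64) if (mask_value >> bit) & 1]
```

Since the ERROR and QUALITY extensions share the dimensionality of the DATA extension, the same functions can be applied element by element across a whole array.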

### RSS Files

These are the row-stacked, flux-calibrated fiber spectra for a given galaxy across all exposures. The "LOGRSS" file has logarithmic wavelength sampling from log10(lambda/Angstroms)=3.5589 to 4.0151 (NWAVE=4563 spectral elements). The "LINRSS" file has linear wavelength sampling from 3622.0 to 10353.0 Angstroms (NWAVE=6732 spectral elements). Both files contain one row for each fiber, for a total of NFIBER*NEXP rows.

In brief, these files contain extensions for the flux (in units of 10⁻¹⁷ erg/s/cm²/Angstrom/fiber), the inverse variance, the pixel mask, the pre- and post-pixellized spectral line spread function for each fiber, the wavelength vector, the median spectral resolution (pre- and post-pixellization) as a function of wavelength for the fibers in this IFU, the standard deviation of spectral resolution (pre- and post-pixellization) as a function of wavelength for the fibers in this IFU, a binary table describing the individual exposures that make up the file, and arrays of the effective X and Y positions in arcsec of each fiber (as a function of wavelength) relative to the IFU center.
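
As a small worked example of the bookkeeping described above, the sketch below computes the expected number of RSS rows (NFIBER*NEXP) and converts a stored flux value to physical units. The function names are hypothetical, and the IFU sizes are the example values listed earlier in this section.

```python
# IFU sizes given as examples in this section.
IFU_SIZES = (19, 37, 61, 91, 127)

# Stored flux values are in units of 1e-17 erg/s/cm^2/Angstrom/fiber.
FLUX_UNIT = 1e-17


def rss_rows(nfiber, nexp):
    """Expected number of rows in an RSS file: one per fiber per exposure."""
    if nfiber not in IFU_SIZES:
        raise ValueError(f"unexpected IFU size: {nfiber}")
    if nexp < 1:
        raise ValueError("at least one exposure is required")
    return nfiber * nexp


def flux_to_cgs(stored_value):
    """Convert a stored flux value to erg/s/cm^2/Angstrom/fiber."""
    return stored_value * FLUX_UNIT


# A 127-fiber IFU observed in 9 exposures (the exposure count is illustrative).
example_rows = rss_rows(127, 9)
```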

### CUBE Files

These are the final 3d data cubes for a given galaxy that combine all fiber spectra across all exposures as described here.

The LOGCUBE data cube has logarithmic wavelength sampling from log10(lambda/Angstroms)=3.5589 to 4.0151 (NWAVE=4563 spectral elements), and 0.5 arcsec spatial pixels (spaxels) for a total size of NX x NY x NWAVE pixels; LINCUBE is the same, except it has linear wavelength sampling from 3622.0 to 10353.0 Angstroms (NWAVE=6732 spectral elements).

Similar to the RSS files, they contain extensions for the flux (now in units of 10⁻¹⁷ erg/s/cm²/Angstrom/spaxel), inverse variance, pixel mask, per-spaxel line spread function (pre- and post-pixellized versions), wavelength vector, median spectral resolution and standard deviation thereof (pre- and post-pixellized versions), and a binary table describing the individual exposures that make up the file. They also include reconstructed broadband 'griz' images created from the spectral data cube, estimates of the reconstructed point source profiles in each of the 'griz' bands, and estimates of the 2d spatial correlation matrices at 'griz' bands.

## Data Analysis Pipeline Output

The DAP output primarily consists of two output files, the MAPS and model LOGCUBE files, provided for each combination of PLATE-IFU and DAPTYPE.

### DAPTYPE

Both the MAPS and model LOGCUBE files provided by the DAP include a DAPTYPE keyword in the file name. The DAPTYPE is a short-hand signifying the method used to analyze the DRP LOGCUBE file. This is a combination of the keywords used to select the spaxel binning approach, the templates used by the stellar-continuum (stellar kinematics) fit, and the templates used during the continuum+emission-line modeling. For DR17, DAPTYPE can have four values:

• SPX-MILESHC-MASTARSSP: Analysis of each individual spaxel with S/N>1; spaxels must have a valid continuum fit for an emission-line model to be fit.
• VOR10-MILESHC-MASTARSSP: Spaxels are binned to S/N~10 using the Voronoi binning algorithm (Cappellari & Copin 2003); all binned spectra are treated independently.
• HYB10-MILESHC-MASTARSSP: Stellar-continuum analysis of spectra binned to S/N~10 for the stellar kinematics (same as VOR10 approach); however, the emission-line measurements are performed on the individual spaxels. Hierarchically clustered MILES templates are used for stellar continuum analysis and simple stellar population models derived from MaStar are used to fit the continuum in the emission-line module.
• HYB10-MILESHC-MASTARHC2: Same as the above except hierarchically clustered templates from MaStar are used to fit the stellar continuum in the emission-line module.

The advantage of the VOR10-MILESHC-MASTARSSP output is that all measurements are performed on exactly the same spectra. The HYB10-MILESHC-MASTARSSP output is meant to allow for greater spatial resolution for the emission-line analysis and avoid limitations of that analysis from tying it to the S/N of the broadband continuum.
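
Since a DAPTYPE is a hyphen-joined combination of three keywords, it can be split into named parts. The sketch below is an illustration only; the class and field names are assumptions and do not mirror the DAP code.

```python
from typing import NamedTuple

# The four DR17 values listed above.
DR17_DAPTYPES = (
    "SPX-MILESHC-MASTARSSP",
    "VOR10-MILESHC-MASTARSSP",
    "HYB10-MILESHC-MASTARSSP",
    "HYB10-MILESHC-MASTARHC2",
)


class DapType(NamedTuple):
    binning: str            # spaxel binning approach (SPX, VOR10, HYB10)
    stellar_templates: str  # templates for the stellar-kinematics fit
    emline_templates: str   # continuum templates in the emission-line module


def parse_daptype(daptype):
    """Split a DAPTYPE string into its three keywords."""
    parts = daptype.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected three hyphen-separated keywords: {daptype!r}")
    return DapType(*parts)


def emission_lines_per_spaxel(daptype):
    """True when emission lines are measured on individual spaxels (SPX, HYB)."""
    return parse_daptype(daptype).binning.startswith(("SPX", "HYB"))
```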

### MAPS Files

WARNING: Some MAPS file extensions must be corrected to obtain the astrophysically relevant quantities. See Working with MaNGA Data!

The MAPS files are the primary output file from the DAP and provide 2D "maps" (i.e., images) of DAP measured properties. The shape and WCS of these images identically match that of a single wavelength channel in the corresponding DRP LOGCUBE file.

Most properties are provided in groups of three FITS extensions:

• [property]: the measurement value,
• [property]_IVAR: the measurement uncertainty stored as the inverse variance, and
• [property]_MASK: a corresponding bit mask for each pixel.
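
The three-extension naming pattern above lends itself to a small helper. The sketch below builds the extension names for a property and filters a flat list of values by its mask. Treating any nonzero mask value as "flagged" is a simplifying assumption of this example; in practice the individual MANGA_DAPPIXMASK bits should be consulted. The function names are hypothetical.

```python
def maps_extensions(prop):
    """Return the (value, inverse variance, mask) extension names for a property."""
    return prop, f"{prop}_IVAR", f"{prop}_MASK"


def unflagged(values, masks):
    """Keep the values whose mask is zero.

    Assumption: any nonzero mask value marks the pixel as unusable.
    """
    if len(values) != len(masks):
        raise ValueError("values and masks must have the same length")
    return [value for value, mask in zip(values, masks) if mask == 0]


# EMLINE_GFLUX is the extension named in this section.
value_ext, ivar_ext, mask_ext = maps_extensions("EMLINE_GFLUX")
```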

Extensions can either be a single 2D image or they can have a series of images that are organized along the third dimension. For the latter, each image is said to be in a specific "channel." For example, each Gaussian-fitted emission-line flux is provided in a single channel in the EMLINE_GFLUX extension. The headers of all the main data extensions include:

• the WCS information,
• the HDUCLASS keyword block,
• the channel (and unit) descriptions for the multichannel extensions,
• the name of the relevant maskbit group --- for the MAPS file, this is always MANGA_DAPPIXMASK,
• the units for any single image or datacube extensions (BUNIT), and
• the DATASUM and CHECKSUM values.

Internally, the DAP performs all spectral fitting on the binned spectra (termed as such even if a bin only contains a single spaxel) after they have been corrected for Galactic extinction (see here). Therefore, e.g., the output emission-line fluxes have been corrected for Galactic extinction. However, the models and binned spectra in the output model cube files (see below) are reverted to their reddened values for direct comparison with the DRP LOGCUBE file.

### Model LOGCUBE Files

The DAP model LOGCUBE files provide the binned spectra and the best-fitting model spectrum for each spectrum that was successfully fit. These files are useful for detailed assessments of the model parameters because they allow you to return to the spectra and compare the model against the data. The DAP fits the spectra in two stages, one to get the stellar kinematics and the second to determine the emission-line properties. The emission-line module (used for all binning schemes) fits both the stellar continuum and the emission lines at the same time, where the stellar kinematics are fixed by the first fit. The exact details of the stellar continuum fit determined by the stellar kinematics and emission-line modules are, therefore, different.

It's important to note that the DAP performs all spectral fitting on the binned spectra (termed as such even if a bin only contains a single spaxel) after they have been corrected for Galactic extinction. Therefore, the output emission-line fluxes have been corrected for Galactic extinction. However, the models and binned spectra in the output LOGCUBE file are reverted to their reddened values for direct comparison with the DRP LOGCUBE file.

Also, for the HYB binning case, the binned spectra provided in the LOGCUBE files are from the Voronoi binning step; however, the emission-line models are fit to the individual spaxels. Therefore, when using the LOGCUBE files for this binning scheme:

• The stellar-continuum fits (e.g., the data in the STELLAR extension) should be compared to the Voronoi binned spectra in the file, but
• the best-fitting model spectra (stellar continuum + gas emission) in the MODEL extension should be compared to the individual spectra from the DRP LOGCUBE file!

An example of how to plot the model LOGCUBE data is provided by the MaNGA Tutorials; the example also demonstrates how to effectively use the provided masks.

With a few exceptions, the model LOGCUBE file extensions typically provide 3D datacubes. The headers of all these main data extensions include:

• the WCS information,
• the HDUCLASS keyword block,
• the name of the relevant maskbit group --- for the model LOGCUBE file, this is always MANGA_DAPSPECMASK,
• the units for the measurements (BUNIT), and
• the DATASUM and CHECKSUM values.

### Reference Files

The DAP analyzes the data via a series of modules, one for each of its primary analysis steps. Each of these modules writes a "reference" file; when re-executed, the DAP will skip the associated analysis step and use the relevant data from the file to continue and complete the remaining steps. The two main DAP output files consolidate and reformat the results from the suite of reference files; however, not all of the properties from the reference files are included. These reference files are provided for users that want access to the data not included in the MAPS or model LOGCUBE files.

The names of the reference files follow a concatenation of the analysis keywords used in each analysis step. The relevant keywords are:

• [rdxqakey]: the reduction assessments made of the DRP cube file,
• [binkey]: the binning method performed on the DRP spectra,
• [sckey]: the stellar-continuum fitting method,
• [elmkey]: the emission-line passband database used to calculate the emission-line moments,
• [elfkey]: the emission-line fitting method, and
• [sikey]: the spectral-index database used to calculate the spectral indices.

The combined [binkey]-[sckey] is currently the same as the DAPTYPE.

The reference files produced by the DAP are as follows:

| Analysis Step | Identifier | Data Model | Python Class |
|---|---|---|---|
| 1 | manga-[plate]-[ifudesign]-[rdxqakey].fits.gz | manga-RDXQAKEY | ReductionAssessments |
| 2 | manga-[plate]-[ifudesign]-[rdxqakey]-[binkey].fits.gz | manga-RDXQAKEY-BINKEY | SpatiallyBinnedSpectra |
| 3 | manga-[plate]-[ifudesign]-[rdxqakey]-[binkey]-[sckey].fits.gz | manga-RDXQAKEY-BINKEY-SCKEY | StellarContinuumModel |
| 4 | manga-[plate]-[ifudesign]-[rdxqakey]-[binkey]-[sckey]-[elmkey].fits.gz | manga-RDXQAKEY-BINKEY-SCKEY-ELMKEY | EmissionLineMoments |
| 5 | manga-[plate]-[ifudesign]-[rdxqakey]-[binkey]-[sckey]-[elfkey].fits.gz | manga-RDXQAKEY-BINKEY-SCKEY-ELFKEY | EmissionLineModel |
| 6 | | | |
| 7 | | | |
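
The naming pattern for analysis steps 1 to 5 listed above can be expressed as a short function. This is a sketch under stated assumptions: the function name is hypothetical, and the keyword values used in the example are placeholders rather than real DAP keywords. Steps 6 and 7 are left out because their identifiers are not given here.

```python
# Keywords that appear in the file name for each analysis step (steps 1 to 5).
STEP_KEYS = {
    1: ("rdxqakey",),
    2: ("rdxqakey", "binkey"),
    3: ("rdxqakey", "binkey", "sckey"),
    4: ("rdxqakey", "binkey", "sckey", "elmkey"),
    5: ("rdxqakey", "binkey", "sckey", "elfkey"),
}


def reference_filename(step, plate, ifudesign, **keys):
    """Build the reference file name for a given analysis step."""
    if step not in STEP_KEYS:
        raise ValueError(f"no naming pattern known for step {step}")
    missing = [name for name in STEP_KEYS[step] if name not in keys]
    if missing:
        raise ValueError(f"missing keywords for step {step}: {missing}")
    parts = ["manga", str(plate), str(ifudesign)]
    parts += [keys[name] for name in STEP_KEYS[step]]
    return "-".join(parts) + ".fits.gz"


# Placeholder plate, ifudesign, and keyword values, invented for this example.
example_name = reference_filename(
    3, 1000, 101, rdxqakey="RDX", binkey="BIN", sckey="SC"
)
```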

## SDSS Legacy Imaging
