In 1988, a team of astrophysicists gathered together for the task of designing a next generation redshift survey - one which would target both galaxies and quasars. In order to achieve the highest level of homogeneity in these two redshift samples, it was concluded that a dedicated imaging survey would be needed from which target galaxies and quasars would be selected, and that imaging and spectroscopy could be done with the same telescope switching between the two observing modes. Substantial improvement beyond existing surveys dictated an increase by a factor of 100 in terms of the number of targets available at the time - in other words, a survey of one million galaxy redshifts. This survey, the Sloan Digital Sky Survey (SDSS) [York et al., 2000], is now underway, having begun standard operations in April 2000, and is planned to last five years. It will eventually cover pi steradians in the North Galactic Cap, plus three smaller regions in the South Galactic Cap. Now, at the end of the SDSS' first year of standard operations, we are pleased to present this early data release (EDR), consisting of 462 square degrees of imaging data and 54,008 spectra of objects selected from within this area.
This is the first substantive public release of data from the SDSS. Release of the future survey data is scheduled to follow this first release in approximately annual installments. The EDR is served over the World Wide Web from the Space Telescope Science Institute1, Fermilab2, the National Astronomical Observatory of Japan (NAOJ)3, and the Max Planck Institute for Astrophysics4. The institutions involved in the survey and the survey funding sources may be found at the end of this paper. An historical account of the various institutional involvements and acknowledgment of the major project contractors may be found in York et al. [2001].
A brief description of the hardware and associated software may be found in York et al. [2000], which is a technical summary of the project. York et al. [2000] serves as an introduction to the SDSS Project Book, which is a full technical description of the survey hardware and software, available on the web5. The as-built instrument parameters are given in Table , and basic characteristics of the data are given in Table . In brief, the survey uses a dedicated 2.5m telescope, located at Apache Point Observatory (APO) in New Mexico, with a 3° field of view. The telescope has two instruments: a CCD imaging camera that takes data in drift-scanning mode, nearly simultaneously in five photometric bands u, g, r, i and z, and a pair of double spectrographs that use fiber optics to simultaneously take spectra of 640 objects selected from the imaging data. The imaging data are taken on nights of pristine conditions (photometric, good seeing, no moon), while spectroscopy is done on those nights that are less than perfect. The data are photometrically calibrated with the aid of an auxiliary 20″ telescope, the Photometric Telescope (PT), at the site. The data are processed through a series of interlocking pipelines which find the objects in the imaging data, measure their properties, apply astrometric and photometric calibrations, select objects for spectroscopic followup, extract and calibrate the spectra, and derive redshifts and spectral types from the spectra.
The data included in the EDR were taken as we commissioned the hardware and software of the survey, and do not all meet our scientific requirements, in particular in image quality (``seeing''), photometric calibration, and target selection. Nevertheless, the data are of excellent quality, and have supported a number of investigations: the discovery of high-redshift quasars ( and references therein); the large-scale distribution of galaxies (); the gravitational lensing masses of galaxies [Fischer et al., 2000] and clusters [Castander et al., 2001, Sheldon et al., 2001]; the luminosities and colors of galaxies [Blanton et al., 2001a, Shimasaku et al., 2001, Strateva et al., 2001]; the structure of the Milky Way [Ivezić et al., 2000, Yanny et al., 2000, Chen et al., 2001]; the discovery of brown dwarfs ( and references therein); the structure of the asteroid belt [Ivezić et al., 2001]; and many other results as well. An up-to-date, complete list of SDSS science publications may be found at http://www.sdss.org/science/pubs.html. Our aim in this paper is to describe the data in enough detail to allow the community to reproduce the results of these papers, and carry out further investigations with them.
The outline of this paper is as follows: § 2 describes the scope of the EDR, the basic data formats, and the way in which the data will be distributed to the astronomical community. § 3 describes the hardware of the project, emphasizing those characteristics that are necessary to understand the strengths and flaws of the data. § 4 describes the pipelines used to reduce the data, with emphasis on the nature of the scientifically useful outputs. We conclude in § 5.
Finally, a comment on notation. As we describe in § 4.5, our photometric calibration remains uncertain, due in part to differing filter curves on the 2.5m and the PT. The original filter system, and the AB system based on it as defined by Fukugita et al. [1996], is referred to here as u′g′r′i′z′; this is close to that realized on the PT. The 2.5m filters themselves are referred to as ugriz, while the still-preliminary 2.5m-based photometry will be called u*g*r*i*z*.
The EDR contains 462 square degrees of imaging data in five bands, and 54,008 spectra in that same area. The data were acquired in three regions: along the celestial equator in the Southern Galactic sky; along the celestial equator in the Northern Galactic sky; and in a region overlapping the SIRTF First Look Survey.
Table summarizes the imaging data included in the EDR. The ``run number'' is a designation we use for one continuous scan of the SDSS imaging camera on the sky, and a ``stripe'' is the great circle covered by a run, 2.5° wide. We cover each ``stripe'' in two ``strips'', separated in the north/south direction so that the interleaved scans of the six columns of the imaging camera completely cover the ``stripe.'' We define the great circles for the imaging survey in section 3.2.2. The location of each run and the effective area covered are indicated in the table. Runs 94/125 and 752/756 are long stripes on the equator, in the Southern and Northern Galactic caps, respectively, while runs 1336/1339 and 1356/1359 are shorter scans, off the equator, designed to overlap with the SIRTF First Look Survey. The sky coverage of the resulting imaging and spectroscopic data is illustrated in Figures 2.1 and 2.1.
Table summarizes the spectroscopic data. As discussed in section 4.8, we select objects detected in the imaging data for spectroscopic observations. The nominal exposure time for each plate is 45 minutes, which typically yields a signal-to-noise ratio of 4.5 per pixel for objects with a g* magnitude of 20.2. The measured signal-to-noise ratio at g* = 20.2 for each plate is included in Table . For completeness, we list several plates which were designed but not observed; they have no entry in the (S/N)^2 column.
The overall quality of the EDR is summarized briefly in Table .
Figure summarizes the data products for imaging and spectroscopy, the three database servers we use to present these products to the astronomical community, and the user interfaces we have developed to help astronomers work with the data effectively. The data products described below are:
Figure 3: Overview of data products and distribution for SDSS EDR. The left column contains all of the data products available. The second column contains the servers that hold data. Note that not all data products are contained in each of the servers. The third column contains the interfaces we provide to these servers. Choose which interface to use based on the results you wish to obtain, listed in the fourth column.
We organize our measured parameters, for imaging and spectroscopic data, by grouping related parameters into classes. Table lists the classes. Tables , , , , , , , , and list the parameters in each class with a brief description.
In each table, we list the parameter name (in many cases, parameters have several synonyms, as listed), the datatype, a brief description, and, for Table , an indication of whether this is a tagged entry (a parameter for which searches are particularly fast; see § 2.4.1). For binary flags, the meanings of the bits are given explicitly. Parameters for which a placeholder exists in the database, but which are not yet calculated, are indicated as such with the phrase ``(placeholder)'' in the table.
Many quantities are defined for each band, and thus have five entries. In the Catalog Archive Server (§ 2.3.1) these are indexed with 0, 1, 2, 3, 4 inside of square brackets, [], for u,g,r,i,z, respectively. In the skyServer (§ 2.3.3) these are indexed with _u, _g, _r, _i, _z for the filters.
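The two per-band naming conventions can be illustrated with a short Python sketch. The helper names below are hypothetical, and `psfMag` is used only as an illustrative parameter name; the parameter tables remain the reference for the actual names.

```python
# Band order used for per-band quantities: u, g, r, i, z.
BANDS = ("u", "g", "r", "i", "z")

def cas_name(param, band):
    """Catalog Archive Server style: integer index 0-4 in square brackets."""
    if band not in BANDS:
        raise ValueError(f"unknown band: {band!r}")
    return f"{param}[{BANDS.index(band)}]"

def skyserver_name(param, band):
    """skyServer style: underscore followed by the filter letter."""
    if band not in BANDS:
        raise ValueError(f"unknown band: {band!r}")
    return f"{param}_{band}"

print(cas_name("psfMag", "r"))        # psfMag[2]
print(skyserver_name("psfMag", "r"))  # psfMag_r
```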
Some parameters are associations to objects in other classes. For example, an object in the class SpecObj (Table ) has a parameter plate, which is an object in the Plate class (Table ). This contains information common to the 640 spectra taken in one set of observations for the plate. The association is indicated in the table in the type column as OneAssoc(Plate). Other associations point to many (rather than one) objects. In SpecObj, for example, the parameter emissionredshift has the type ManyAssoc(EmissionRedshift). It points to a list of redshifts measured. The ``best'' value of the emission and absorption redshift is stored in z of SpecObj, but for complex spectra, you may access the full list of redshifts measured during processing.
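As a rough mental model, a OneAssoc behaves like a reference to a single object and a ManyAssoc like a list of objects. The Python sketch below mirrors the SpecObj example; the field names and all numerical values are invented for illustration and are not taken from the archive.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Plate:
    plate_id: int  # hypothetical field name

@dataclass
class EmissionRedshift:
    z: float  # one redshift measured during processing

@dataclass
class SpecObj:
    z: float                                  # the "best" redshift
    plate: Plate                              # plays the role of OneAssoc(Plate)
    emission_redshift: List[EmissionRedshift] = field(default_factory=list)  # ManyAssoc

# Made-up values, for illustration only.
spec = SpecObj(z=2.31, plate=Plate(plate_id=384),
               emission_redshift=[EmissionRedshift(2.31), EmissionRedshift(2.29)])

print(spec.plate.plate_id)                     # follow the one-to-one link
print([e.z for e in spec.emission_redshift])   # walk the one-to-many link
```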
Image Parameters
The results of the imaging pipelines (described below in § 4.4) are summarized in Table , and the apertures used for radial profile measurements are in Table . For each object, we measure the position, several varieties of the flux, morphological parameters, and a provisional classification. We also include informational flags on each pixel in the images (Table ) and each object (Table ) and summary flags and statistics for each field (Table ). The choice of flags and which measure of flux and shape to use are dictated by the science goals of the SDSS. The catalog contains over 120 parameters and flags measured for 13,804,448 objects. 10,947,783 of these are unique, well-measured astrophysical objects.
The SDSS object catalogs are matched against the FIRST [Becker, White, & Helfand, 1995], ROSAT [Voges et al., 1999], and USNO-A2.0 [Monet, 1998] databases; the resulting parameters are described in Table .
Images
Each corrected frame is a FITS image for one filter, 2048 columns × 1489 rows, with row number increasing in the scan direction. These are the imaging frames with flat field, bias, cosmic ray, and pixel defect corrections applied. A raw image contains 1361 rows, and a corrected frame has the first 128 rows of the following corrected frame appended to it. The pixels subtend 0.396″ square on the sky.
Header information using the World Coordinate System (WCS) [Calabretta & Greisen, 2001] allows standard astronomical FITS tools to convert pixel position to (α, δ) (§ 4.2.2).
Binned Images
Each file is a FITS image for one filter, 512 × 372 pixels, with WCS information. These are the corrected frames with detected objects removed and binned 4 × 4 pixels.
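The frame dimensions quoted above are mutually consistent, as the following arithmetic sketch shows. It uses only the numbers given in the text; the use of integer division for the binned size (dropping the incomplete last bin of rows) is an assumption made here to reproduce the quoted 512 × 372 shape.

```python
RAW_ROWS = 1361        # rows in a raw image
OVERLAP_ROWS = 128     # rows appended from the following frame
COLUMNS = 2048         # columns in a frame
PIXEL_ARCSEC = 0.396   # pixel scale on the sky
BIN = 4                # binning factor of the binned images

corrected_rows = RAW_ROWS + OVERLAP_ROWS
binned_shape = (COLUMNS // BIN, corrected_rows // BIN)  # assumed: partial bin dropped
width_arcmin = COLUMNS * PIXEL_ARCSEC / 60.0
height_arcmin = corrected_rows * PIXEL_ARCSEC / 60.0

print(corrected_rows)             # 1489
print(binned_shape)               # (512, 372)
print(f"{width_arcmin:.2f}")      # 13.52
print(f"{height_arcmin:.2f}")     # 9.83
```

A corrected frame therefore covers roughly 13.5 × 9.8 arcminutes, of which the last 128 rows overlap the next field.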
Mask Frame
Each file is a binary FITS table for one filter. Each row of the table describes a set of pixels in the corrected frame, using mask values described in Table .
Atlas Image
For each detected object, the atlas image comprises the pixels that were detected as part of the object in any filter. These are provided through a database, as either a JPEG color image, or as a FITS file for each selected filter.
Color Image
We combine the corrected frames from the g, r and i filters and produce a color image with the filters corresponding to blue, green, and red, respectively. The intensity mapping of each color is adjusted to enhance the appearance of these images. We use the same mapping for all of the color images.
Most of these files are simple FITS binary tables or images; the exceptions are the Atlas Images (fpAtlas), Mask Frames (fpM), and PSF description files (fpField). We provide stand-alone code on our web site to enable you to interpret these files and read them into your own code.
Calibrated Spectra
These are the (λ, flux) pairs that comprise the spectrum. The spectra are given in vacuum wavelengths in the heliocentric frame, with flux density given in units of 10⁻¹⁷ erg s⁻¹ cm⁻² Å⁻¹. An estimated error is also output, as well as a mask associated with each pixel, as described in Table .
Spectroscopic Parameters
Catalogs produced by the spectroscopic pipelines (described below in § 4.10) are summarized in Tables and . For each spectrum, we measure the redshift using several techniques, locate and characterize lines, and assign an identification. The catalog contains 54,008 spectra, with 46 parameters measured for each spectrum, and 34 parameters measured for each emission line identified in each spectrum.
Images of the Spectra
For convenience, we also provide a plot of each spectrum. These are GIF images of the spectrum, with significant features, our classification, and measured redshift indicated.
Table summarizes parameters that record the number of objects loaded in the database. Table summarizes data that describe the imaging data. A run, as already described, is all the data from a single contiguous scan of the imaging camera. These are combined into chunks, contiguous areas of the sky from which spectroscopic targets will be selected and spectroscopic tiling (the process by which targets are assigned to spectroscopic plates) will be done. A segment is a single piece of a run for a single camera column. The contiguous stream of data is divided into a series of fields (§ 3.5), whose detailed properties are also given in Table , including astrometric calibrations (§ 4.2), and details of the point-spread-function fitting (§ 4.3). Finally, the details for each spectroscopic plate are also given in Table .
Table lists constants we use to define the survey.
We have constructed three database servers to hold the imaging and spectroscopic data for the EDR, as shown in Figure . The Catalog Archive Server contains the measured parameters from all objects in the imaging survey and the spectroscopic survey. These are loaded in a database server we built using Objectivity, a commercial, object-oriented database server [Objectivity Inc., 2001]. The skyServer contains identical information, but it is loaded in a database server we built using a commercial, relational database server, SQLServer [Delaney, 2001]. The Data Archive Server contains the rest of the data products for the EDR, such as the corrected imaging frames, which are available for direct download, and the calibrated spectra, which are loaded in an Objectivity database.
The Catalog Archive Server is indexed to allow efficient queries on quantities commonly used in astronomical research (position, and magnitudes in our five filters). It also allows algebraic combinations of these quantities to be used in the queries. We have written a unique and powerful database server which allows users to make sophisticated queries efficiently. The general strategy of compiling a large astronomical database is described in Szalay et al. [2001], and our method of dividing the celestial sphere into a hierarchical triangular mesh, which enables efficient access based on position, is described in Kunszt et al. [2000]. The database server software is described in Thakar et al. [2000].
Given a set of object coordinates, either from the catalog archive server or some other source, the data archive server makes the detailed data (corrected frames, binned images, mask images, atlas images, color images, spectra, and spectral plots) available. It is not practical to access all of the imaging data in this way, but it does give convenient access to any selected field or spectrum contained in the EDR.
The skyServer is a relational database server. Its language is not as rich as that provided by the catalog archive server, but it works well for many queries. This server was originally developed with outreach and education in mind, but it is also very useful for astronomical research.
There are three separate user interfaces to the SDSS EDR, as shown in Figure .
We expect that typical users interested in extracting subsets of our imaging catalog based on position, flux, and colors, will find the MAST interface to the skyServer most useful. Sophisticated queries on flags and algebraic combinations of parameters are best done with the SDSS Query Tool (sdssQT). Users who wish to download atlas images, corrected frames, or spectra, should do so with the MAST interface to the data archive server.
The sdssQT is a stand-alone application to manage and perform catalog archive server queries, and is available for download from our web sites, along with a detailed user's manual. The discussion that follows is a brief introduction to its use, to illustrate its capabilities. There are options to choose the output format (ASCII or binary), facilities to save queries, a simple text editor to create and modify queries, and communication with the Catalog Archive Server to follow the progress (and predict the time to completion) of active queries. An additional tool converts our binary output format to a FITS binary table.
The query language we developed is very similar to the Structured Query Language (SQL). The sdssQT includes several example queries, and the online user's guide provides additional explanations of the language and how to use it efficiently. The use of associations (§ 2.2.1) provides a powerful way to extract object data from many different classes simultaneously. Similarly, the inheritance properties of classes and their subclasses makes queries written for a given class run on all of its subclasses or sibling classes.
The grammar of this language is to select a set of parameters from a class that satisfy specified conditions. The SDSS Query Tool allows full access to all of the classes and parameters in the Catalog Archive Server. For example, the query:
SELECT ra,dec FROM SpecObj WHERE (z > 2)
returns the parameters ra and dec (right ascension and declination) for all spectroscopic objects (i.e., from the class SpecObj) with redshift greater than 2.
One class must appear in the FROM clause. One or more of the parameters from that class may be listed in the SELECT clause and used in the WHERE clause, as long as they are included in the class mentioned in the FROM clause.
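The shape of such a query (a parameter list, a single class after FROM, and an optional condition) can be sketched with a small Python helper that assembles the query string. `build_query` is a hypothetical convenience function written for this illustration; it is not part of the sdssQT and performs no validation of parameter names against the class.

```python
def build_query(select, from_class, where=None):
    """Assemble a 'SELECT ... FROM ... [WHERE (...)]' string."""
    if not select:
        raise ValueError("at least one parameter must be selected")
    if not from_class:
        raise ValueError("a class must be named after FROM")
    query = f"SELECT {','.join(select)} FROM {from_class}"
    if where:
        query += f" WHERE ({where})"
    return query

print(build_query(["ra", "dec"], "SpecObj", "z > 2"))
# SELECT ra,dec FROM SpecObj WHERE (z > 2)
```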
The PhotoObj class, containing the detected objects in the images, has the most entries, over 13 million. To facilitate selecting from these entries, we have designated a subset of the most commonly used parameters in the PhotoObj class to be part of a special class called Tag. The final column of Table indicates which parameters are in the Tag class. The database is structured in such a way that searches that select on these Tag parameters run significantly faster.
Additionally, many of the classes, like Tag, have numerous subclasses (Primary, Secondary, Galaxy, Star), described in the above mentioned tables. These subclasses all inherit the properties of the umbrella class. The same parameters are available in the subclasses as in the umbrella class, but they will be faster to query as each contains only a portion of the total objects. For example, users wanting data only on galaxies can execute queries on the Galaxy class; this avoids having to specify that objType=Galaxy if it were run on the entire Tag class, and will run faster as fewer objects are searched.
To access associated, or linked parameters, we use the ``.'' modifier. For example, spectroscopic objects from the SpecObj class all have a link back to the photometric object in the Tag class. This allows retrieval of parameters from both classes simultaneously. Use the syntax tag.r, for example, to obtain a spectroscopic target's r* magnitude. In Table , the parameter phototag has type OneAssoc(PhotoTag), so the query
SELECT ra,dec,phototag.r FROM SpecObj
returns the position and r* magnitude for all objects with spectra.
Some associations tie many objects in a class to one object of another class. For example, there can be many lines measured in one spectrum. The parameter measured in the SpecObj class has the type ManyAssoc(SpecLine). The following Association Query returns parameters for all of the lines of the selected spectrum:
SELECT name.name,wave,ew FROM (SELECT measured FROM SpecObj WHERE plate.plateID == 384 && fiberID == 284)
The online user's guide to the SDSS Query tool contains a detailed description of the language, examples to help construct advanced queries, and details about macros to perform logical and arithmetic operations within queries.
This provides a familiar interface to the object catalog, as it is based on the MAST interface6 which also serves catalogs from NASA missions and other surveys: GSC, DSS, and VLA-FIRST. This is the interface most will prefer for straightforward queries based on (α, δ). Objects that satisfy the query can be used in one of several ways. The list may be written to a comma-delimited text file, to be read by external programs, such as Microsoft Excel. The list may be written as a FITS binary table. Finally, the list may be browsed as an HTML table, where specific objects and data products (e.g. the corrected frame for the object, or the spectra for those objects with spectra) are then selected for use by the Data Archive Server.
The SDSS Image Products interface accepts a list of objects specified in one of the following formats: run/rerun/camcol/fieldID/objectID (see § 4.4.1), long object ID (from the skyServer or Catalog Archive Server), ra/dec, NED, or SIMBAD target names. Files generated by the sdssQT may be uploaded directly. All imaging products (atlas images, corrected frames, reconstructed frames, binned images, field summaries, and mask files) are available. All of the files for a request are bundled together in a .tar, .tar.gz, or .zip file.
An object list from the MAST skyServer Interface can be sent to the data product interfaces via a ``shopping cart.'' To demonstrate this access, we begin in the MAST SDSS skyServer Interface. Select galaxies within 10 arcmin of (α, δ) = (0, 0) and 16 < r* < 17 by entering the coordinates in the RA,DEC boxes, 10.0 in the Radius(arcmin) box, highlighting GALAXY from the Object Type list, and by entering 16..17 in the Magnitude box for r. Three objects are returned. Choosing the ``Browse Results as HTML'' button brings up a list of these objects. Select the three objects by using the button in the Mark column, select Add marked records to shopping cart, and then select Retrieve data products for shopping cart. The data products may then be downloaded.
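The selection performed by this form (a cone around a position combined with an object type and a magnitude range) can be reproduced on any downloaded object list. The Python sketch below is one way to do it, using the haversine formula for the angular separation; the catalog rows are invented for illustration and do not correspond to the three objects returned by the interface.

```python
import math

def angular_separation_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula; inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 60.0

def select_objects(objects, ra0, dec0, radius_arcmin, r_min, r_max, obj_type):
    """Keep objects of the given type inside the cone and magnitude range."""
    return [o for o in objects
            if o["type"] == obj_type
            and r_min < o["r"] < r_max
            and angular_separation_arcmin(o["ra"], o["dec"], ra0, dec0) <= radius_arcmin]

# Made-up catalog rows, for illustration only.
catalog = [
    {"id": 1, "ra": 0.05,   "dec": 0.02,  "r": 16.4, "type": "GALAXY"},
    {"id": 2, "ra": 359.95, "dec": -0.10, "r": 16.9, "type": "GALAXY"},
    {"id": 3, "ra": 0.10,   "dec": 0.05,  "r": 18.2, "type": "GALAXY"},  # too faint
    {"id": 4, "ra": 0.30,   "dec": 0.00,  "r": 16.5, "type": "GALAXY"},  # 18 arcmin away
    {"id": 5, "ra": 0.01,   "dec": 0.01,  "r": 16.2, "type": "STAR"},    # wrong type
]

selected = select_objects(catalog, 0.0, 0.0, 10.0, 16.0, 17.0, "GALAXY")
print([o["id"] for o in selected])  # [1, 2]
```

The haversine form is used because it handles the wrap in right ascension at 0°/360° without special cases, which matters for a search centered on α = 0.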
Data products are also available directly from the data archive server. Sophisticated users may want to download some of these files directly, but we consider these files to be intermediate data products, and focus our support and documentation efforts on the database servers and interfaces we described above. The data model, which describes the file naming conventions and FITS headers, is available on our web site.
Image data are available from the main Data Archive Server page. Data for individual runs are in subdirectories specified by $run/$rerun, where valid $run/$rerun combinations for the EDR are 94/7, 125/7, 752/8, 756/8, 1336/2, 1339/2, 1356/2, and 1359/3. Corrected frames are accessed under this root URL via
corr/$camCol/fpC-$run-$rerun-$filter$camCol-$field.fit.gz
where $camCol is the camera column (1 through 6), $run is the run number (6 digits with leading zeroes), $filter is the filter name (u, g, r, i, z), and $field is the field number (4 digits with leading zeroes). Binned images are accessed in a similar fashion, via
objcs/$camCol/fpBIN-$run-$filter$camCol-$field.fit.gz
as are the Mask images, via
objcs/$camCol/fpM-$run-$filter$camCol-$field.fit
The fpAtlas files in that same directory contain all of the atlas images for each field, but it is generally more convenient to access them with the Data Archive Server.
The catalogs of detected objects for each field are accessed via
objcs/$camCol/fpObjc-$run-$camCol-$field.fit
but these files do not yet include photometric or astrometric calibration.
Summary information for each field processed is accessed via
objcs/$camCol/fpFieldStat-$run-$camCol-$field.fit
and results of the psp pipeline, which characterizes the point spread function for a field (§ 4.3), are accessed via
objcs/$camCol/fpField-$run-$camCol-$field.fit
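The path templates above translate directly into string formatting. The Python sketch below fills in the corrected-frame and binned-image templates as they are written in this section; the function names are hypothetical, the rerun is assumed to be written without zero padding (the text does not specify its format), and the field number in the example is arbitrary. The paths are relative to the $run/$rerun subdirectory on the Data Archive Server.

```python
FILTERS = ("u", "g", "r", "i", "z")

def _check(cam_col, filt):
    if cam_col not in range(1, 7):
        raise ValueError("camCol must be 1 through 6")
    if filt not in FILTERS:
        raise ValueError(f"unknown filter: {filt!r}")

def corrected_frame_path(run, rerun, cam_col, filt, field):
    """corr/$camCol/fpC-$run-$rerun-$filter$camCol-$field.fit.gz"""
    _check(cam_col, filt)
    # run: 6 digits, field: 4 digits, both zero padded; rerun unpadded (assumption).
    return f"corr/{cam_col}/fpC-{run:06d}-{rerun}-{filt}{cam_col}-{field:04d}.fit.gz"

def binned_image_path(run, cam_col, filt, field):
    """objcs/$camCol/fpBIN-$run-$filter$camCol-$field.fit.gz"""
    _check(cam_col, filt)
    return f"objcs/{cam_col}/fpBIN-{run:06d}-{filt}{cam_col}-{field:04d}.fit.gz"

print(corrected_frame_path(756, 8, 3, "r", 100))
# corr/3/fpC-000756-8-r3-0100.fit.gz
print(binned_image_path(756, 3, "r", 100))
# objcs/3/fpBIN-000756-r3-0100.fit.gz
```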
Spectroscopy files are available from the main Data Archive Server page. Data for individual plates are in subdirectories specified by $plate/$mjd, where valid $plate/$mjd combinations are in Table .
Calibrated spectra which have been extracted from the spectroscopic frames with the spectro2d pipeline (§ 4.10.1) are available under the root URL via
2d_10/$plate/2dmerge/spMerge2d-$mjd-$plate-$fiber.fit
where $plate is the plate number (four digits, zero padded), $mjd is the date observations were completed (Table ), and $fiber is the fiber number (three digits, zero padded).
Calibrated spectra, with information from the spectro1d pipeline (§ 4.10.2), such as redshift and identification, are available under this root URL via
1d_10/$plate/1d/spSpec-$mjd-$plate-$fiber.fit.
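The spectroscopic file names follow the same pattern and can be generated in the same way. In the Python sketch below the function names are hypothetical, and the MJD value used in the example is a placeholder; the valid $plate/$mjd combinations are those listed in the table of plates.

```python
def spectrum_2d_path(plate, mjd, fiber):
    """2d_10/$plate/2dmerge/spMerge2d-$mjd-$plate-$fiber.fit"""
    # plate: 4 digits, fiber: 3 digits, both zero padded.
    return f"2d_10/{plate:04d}/2dmerge/spMerge2d-{mjd}-{plate:04d}-{fiber:03d}.fit"

def spectrum_1d_path(plate, mjd, fiber):
    """1d_10/$plate/1d/spSpec-$mjd-$plate-$fiber.fit"""
    return f"1d_10/{plate:04d}/1d/spSpec-{mjd}-{plate:04d}-{fiber:03d}.fit"

# Plate 384, fiber 284; the MJD here is a placeholder value.
print(spectrum_1d_path(384, 51821, 284))
# 1d_10/0384/1d/spSpec-51821-0384-284.fit
print(spectrum_2d_path(384, 51821, 7))
# 2d_10/0384/2dmerge/spMerge2d-51821-0384-007.fit
```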
We provide help for users at several levels. This paper is a general introduction to the data products. Our web sites have additional documentation. We maintain a set of frequently asked questions with answers to help people become more familiar with how to use the data set. There are also two mailing lists which we encourage users to join. The first is a SDSS Early Data Release Users' Group which we will use to communicate periodically with users. The second is the SDSS Early Data Release Users' Forum, where we encourage all users to post questions and suggestions. We will monitor this mailing list and post answers, or pointers to online documentation, as appropriate. We also support a help desk, which is accessible via email and telephone, as detailed on our web sites.
Access to the SDSS EDR is open. These data may be used for any astronomical research. Data from the SDSS public archive may not be used for any commercial publication or other commercial purpose except with explicit approval by the Astrophysical Research Consortium.
Non-commercial and technical publications based on the SDSS EDR should contain the following acknowledgement:
``Funding for the creation and distribution of the SDSS Archive has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Aeronautics and Space Administration, the National Science Foundation, the U.S. Department of Energy, the Japanese Monbukagakusho, and the Max Planck Society. The SDSS Web site is http://www.sdss.org/.
The Participating Institutions are The University of Chicago, Fermilab, the Institute for Advanced Study, the Japan Participation Group, The Johns Hopkins University, the Max-Planck-Institute for Astronomy (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, Princeton University, the United States Naval Observatory, and the University of Washington.''
The telescope and its somewhat unusual optical design [Waddell et al., 1998] are described in detail in York et al. [2000] and the SDSS Project Book, available on the web7. The camera and the pair of double spectrographs are discussed there as well; see also Gunn et al. [1998]. We review here only details which are of direct relevance to working with the data.
The parameters of direct relevance to the imaging system of the SDSS are presented in Table . Since York et al. [2000] was published, we have learned a great deal about our filter and detector system, and a preliminary discussion of these matters is given below, as are some details concerning the geometry of the great-circle Time Delay Integration (TDI, or Drift Scan) stripes which cover the survey area.
The situation with the SDSS response functions for the five filter passbands, and the resulting photometric system, is rather complex. There is a set of primary standard stars which have been measured at the U. S. Naval Observatory 40″ telescope and with the SDSS Photometric Telescope (§ 3.4) which together define a photometric system which we believe to be self-consistent to approximately 1%; this system is roughly as described in Fukugita et al. [1996]. These primary standards are described further in § 4.5 below. Unfortunately, the filters used on the USNO and PT telescopes differ systematically from those on the 2.5m camera, and we still do not have a complete understanding of the transformations between these two systems8. Thus the photometric system defined by the USNO and PT telescopes is not directly applicable to the 2.5m data, as described in detail in § 4.5. Figure 3.2.1 gives the average measured quantum efficiencies of the 2.5m camera detectors multiplied by the reflectivity of the primary and secondary (the two transmissive surfaces have negligible effect on the throughput); curves are given both assuming no atmosphere, and including the transmission of the atmosphere above Apache Point on a night of average humidity at airmass 1.3. Tables containing the system response in each filter are available on our web sites. The thinned CCDs also suffer from internal scattering that scatters light longward of roughly 6000Å into an extended halo around an object; this decreases the effective quantum efficiency for a point source. For extended sources (size > 30″), this effect is negligible, and the dashed curves indicate the quantum efficiency in this case in the r and i filters. The z chip is thick, and does not suffer from this problem.
The camera responses were measured by an instrument with a roughly triangular wavelength response with FWHM about 100Å; this resolution has not been corrected for in these data but does not appreciably alter the shapes. Better and more detailed response data will be obtained and published later, but the results here are adequate for most purposes.
Table gives corresponding properties of the filters, updating those tabulated in Fukugita et al. [1996] and Fan et al. [2001a]: the effective wavelength of each filter, λ_eff; the photon-weighted mean of the quantity ln²(λ/λ_eff) (a measure of the effective width of the filter); the Full Width at Half Maximum of the filter; and Q, the integral of the system efficiency over d(ln λ), an effective quantum efficiency (all assuming 1.3 airmass, and observing a point source). This last quantity relates the measured apparent magnitude (on an AB system) to the number of detected electrons:
(1)
As pointed out in § 1, we refer to data on the standards system with the magnitude labels (u′g′r′i′z′) and the provisional 2.5-meter magnitudes with the labels (u* g* r* i* z*). The SDSS photometry itself is presented in the provisional 2.5-meter system. Finally, the 2.5-meter filters themselves are referred to in this paper simply as u, g, r, i, and z, a change from some of our earlier papers.
The survey coordinate system (λ, η) is a spherical coordinate system with poles at α_2000 = 95°, δ_2000 = 0° and α_2000 = 275°, δ_2000 = 0°. The survey equator is thus a great circle perpendicular to the J2000 celestial equator, intersecting it at α_2000 = 185° and α_2000 = 5°. Lines of constant η are great circles perpendicular to the survey equator and lines of constant λ are small circles parallel to the survey equator. The point λ = 0°, η = 0° is located at α_2000 = 185°, δ_2000 = 32.5°, with η increasing northward.
The survey area is divided into stripes, where each stripe is centered along a line of constant η, separated from the adjoining stripe(s) by 2.5°. Each drift scan tracks a survey stripe, offset by ±386 arcsec perpendicular to the stripe. Two scans (or ``strips''), one offset to the north and one to the south, are required to fill a stripe. The survey latitude tracked by stripe n is given by
$$ \eta = (n - 10) \times 2.5^\circ - 32.5^\circ \tag{2} $$
for the northern stripes, and by
$$ \eta = (n - 82) \times 2.5^\circ - 32.5^\circ \tag{3} $$
for the southern stripes.
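The stripe geometry can be sketched in a few lines. The snippet below assumes the usual SDSS numbering convention, in which stripe 10 (north) and stripe 82 (south) both lie along the celestial equator at η = -32.5°; the function names are hypothetical and the convention should be checked against Equations (2) and (3).

```python
# Sketch of the stripe geometry, under the assumed numbering convention
# eta = (n - 10) * 2.5 - 32.5 (north) and (n - 82) * 2.5 - 32.5 (south).

STRIPE_SEPARATION_DEG = 2.5
STRIP_OFFSET_DEG = 386.0 / 3600.0   # +/-386 arcsec across the stripe


def stripe_eta(n, southern=False):
    """Survey latitude eta (degrees) of the center of stripe n."""
    reference = 82 if southern else 10
    return (n - reference) * STRIPE_SEPARATION_DEG - 32.5


def strip_offsets(n, southern=False):
    """Cross-stripe positions (degrees) of the south and north strips."""
    eta = stripe_eta(n, southern)
    return eta - STRIP_OFFSET_DEG, eta + STRIP_OFFSET_DEG


if __name__ == "__main__":
    print(stripe_eta(10), strip_offsets(10))
```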
The natural coordinate system to use for processing a given drift scan is the great circle coordinate system for that stripe, (μ, ν), in which the equator of the coordinate system is the great circle tracked by the scan. This great circle is inclined by i = η + 32.5° to the J2000 celestial equator, with an ascending node of 95°. μ = α at the ascending node. μ increases in the scan direction (east) and ν increases to the north. Each stripe has its own great circle coordinate system.
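For reference, the equations to transform among the different coordinate systems are:
$$ \begin{aligned}
\cos(\alpha - 95^\circ)\cos\delta &= \cos(\mu - 95^\circ)\cos\nu \\
\sin(\alpha - 95^\circ)\cos\delta &= \sin(\mu - 95^\circ)\cos\nu\cos i - \sin\nu\sin i \\
\sin\delta &= \sin(\mu - 95^\circ)\cos\nu\sin i + \sin\nu\cos i
\end{aligned} \tag{4} $$
$$ \begin{aligned}
\cos(\alpha - 95^\circ)\cos\delta &= -\sin\lambda \\
\sin(\alpha - 95^\circ)\cos\delta &= \cos\lambda\cos(\eta + 32.5^\circ) \\
\sin\delta &= \cos\lambda\sin(\eta + 32.5^\circ)
\end{aligned} \tag{5} $$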
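The coordinate definitions above can be turned into a small set of conversion routines. This is a minimal sketch derived from the stated geometry (node at 95°, origin of the survey coordinates at declination 32.5°, inclination i = η + 32.5°); the function names are hypothetical, and the sign convention chosen for the survey coordinate λ is an assumption. The routines return λ in [-90°, 90°] and η in [-180°, 180°); other branch conventions are possible.

```python
import math

NODE_DEG = 95.0       # ascending node shared by all stripe great circles
ETA_ZERO_DEG = 32.5   # declination of the survey origin (lambda = eta = 0)


def _to_radec(x, y, z):
    """Convert a unit vector (x along the node) to (RA, Dec) in degrees."""
    ra = (math.degrees(math.atan2(y, x)) + NODE_DEG) % 360.0
    dec = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    return ra, dec


def survey_to_eq(lam_deg, eta_deg):
    """Survey (lambda, eta) -> J2000 (RA, Dec), all in degrees."""
    lam = math.radians(lam_deg)
    ang = math.radians(eta_deg + ETA_ZERO_DEG)
    return _to_radec(-math.sin(lam),
                     math.cos(lam) * math.cos(ang),
                     math.cos(lam) * math.sin(ang))


def eq_to_survey(ra_deg, dec_deg):
    """J2000 (RA, Dec) -> survey (lambda, eta), all in degrees."""
    a = math.radians(ra_deg - NODE_DEG)
    d = math.radians(dec_deg)
    x = math.cos(a) * math.cos(d)
    y = math.sin(a) * math.cos(d)
    z = math.sin(d)
    lam = -math.degrees(math.asin(max(-1.0, min(1.0, x))))
    eta = math.degrees(math.atan2(z, y)) - ETA_ZERO_DEG
    eta = (eta + 180.0) % 360.0 - 180.0   # wrap into [-180, 180)
    return lam, eta


def great_circle_to_eq(mu_deg, nu_deg, incl_deg):
    """Great circle (mu, nu) with inclination i -> J2000 (RA, Dec)."""
    m = math.radians(mu_deg - NODE_DEG)
    n = math.radians(nu_deg)
    i = math.radians(incl_deg)
    x = math.cos(m) * math.cos(n)
    y = math.sin(m) * math.cos(n) * math.cos(i) - math.sin(n) * math.sin(i)
    z = math.sin(m) * math.cos(n) * math.sin(i) + math.sin(n) * math.cos(i)
    return _to_radec(x, y, z)


if __name__ == "__main__":
    print(survey_to_eq(0.0, 0.0))
    print(eq_to_survey(185.0, 32.5))
```

For a stripe with zero inclination the great circle coordinates reduce to right ascension and declination, which gives a quick sanity check on the rotation.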
We produce 640 individual spectra in a three degree diameter field at a resolution R ≡ λ/Δλ of about 1800 in the wavelength range of 3800 to 9200 Å. This wavelength range is divided between two cameras by a dichroic at about 6150 Å, and there are two spectrographs, each producing 320 spectra. There are thus 4 CCD detectors, each of the same kind as are present in the g, r, and i bands in the camera, 2048 pixels square with 24 micron pixels. The spectroscopic system is discussed in York et al. [2000]. Results from commissioning the system are discussed in Castander et al. [2001].
The fibers carrying the light from the drilled plug-plates to the spectrographs subtend about 3 arcseconds in the focal plane, and are imaged in turn in the spectrograph cameras with a footprint of about 3 pixels. The straight-through transmissive immersion grisms produce a dispersion which is roughly linear in log wavelength. The spectrographs are very efficient; quantum efficiencies on the sky as measured from standard stars as a function of wavelength for each of the four spectrographic CCDs are presented in Figure 3.3. They peak at over 25% in the red, and just under 20% in the blue.
The nominal exposure time for each plate is 45 minutes, split into at least three parts for cosmic ray rejection, with the exact number determined by observing conditions. This set of science exposures is preceded and followed by a series of shorter exposures for calibration: arcs, flat-fields, and a 4-minute smear exposure on the sky for spectrophotometric calibration, in which the telescope is moved so that the 3 arcsec fiber on each object effectively covers a 5″×8″ aperture, aligned with the parallactic angle. The smear exposures allow us to account for object light excluded from the 3 arcsec fibers due to seeing and atmospheric refraction; they provide an accurate (albeit low signal-to-noise ratio; S/N) measure of the true spectral shape of the objects and are used for spectrophotometric calibration9. The calibration and science exposures are immediately processed through a streamlined version of the 2d spectroscopic pipeline (§ 4.10) to inform the observers whether the calibrations were successful and to provide S/N diagnostics on the science exposures.
For each science exposure, the (S/N)² per pixel through the SDSS imaging passbands is measured and evaluated as a function of fiber magnitude for each spectrograph camera. We take repeated 15-minute exposures until the cumulative median (S/N)² > 15 at g*=20.2 and i*=19.9 in all 4 cameras. In clear, moonless conditions, the (S/N)² threshold is easily reached in 3 exposures, and we never take fewer than three exposures; in (partially) cloudy or moonlit conditions, more exposures may be required.
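The stopping rule described above can be written as a short loop. This is only a sketch: it assumes that (S/N)² adds linearly from one exposure to the next, and it takes the per-camera (S/N)² at the reference magnitudes as already measured. The function name and input layout are hypothetical.

```python
# Sketch of the exposure stopping rule: keep exposing until every one of
# the four cameras has cumulative (S/N)^2 above the threshold, and never
# stop before the minimum number of exposures.
# Assumption: (S/N)^2 accumulates additively across exposures.


def exposures_needed(sn2_per_exposure, threshold=15.0, min_exposures=3):
    """Return how many exposures satisfy the rule, or None if never met.

    sn2_per_exposure: sequence of 4-tuples, one (S/N)^2 value per camera
    for each 15-minute exposure, in the order the exposures were taken.
    """
    totals = [0.0, 0.0, 0.0, 0.0]
    for count, sn2 in enumerate(sn2_per_exposure, start=1):
        totals = [t + s for t, s in zip(totals, sn2)]
        if count >= min_exposures and min(totals) > threshold:
            return count
    return None


if __name__ == "__main__":
    good_night = [(6.0, 6.0, 6.0, 6.0)] * 5
    cloudy_night = [(4.0, 4.0, 4.0, 4.0)] * 5
    print(exposures_needed(good_night), exposures_needed(cloudy_night))
```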
We use a 20-inch Photometric Telescope, located next to the 2.5-meter telescope's enclosure, to measure nightly extinctions and to observe transfer fields (secondary patches) that in turn are used to calibrate the 2.5-meter imaging data. Details of the full photometric calibration process can be found in § 4.5. This telescope is a commercial reflector built by DFM Engineering (Longmont, Colorado), modified to incorporate improved baffling and a coma corrector to increase its field of view. It is equipped with a thinned SITe 2048×2048 chip with 24 micron square pixels like the u chips in the 2.5m camera, and a set of filters nominally the same as those in the camera. Please refer to § 3.2.1.
The telescope operates automatically, observing primary standard stars and secondary patch transfer fields selected from an on-line database. Observing staff can monitor progress with real-time tools that display cloud cover, extinction coefficients, and observing progress [Hogg et al., 2001].
The data acquisition system [Petravick et al., 1994] records information from the imaging camera, spectrographs, and photometric telescope. Data are transferred via magnetic tape, with critical, low-volume samples sent over the internet. Each system uses report files to track the observations.
Data from the imaging camera are collected in the Time Delay Integration (TDI, or Drift Scan) mode. We treat the data from each imaging camera column of 5 photometric and 2 astrometric CCDs as a scan line. For convenience, data from each CCD are broken into frames containing 1361 lines. Before processing, the 128 rows from the next frame are added to the top of each frame, so that the pipelines work on 2048×1489 images. The resulting overlap between reduced frames (128 rows) is roughly the same as the number of columns that overlap with the other strip of a stripe. Some objects are detected in more than one frame, but when loading the databases we mark one of these detections as the ``primary'' detection (see the discussion in § 4.7 below). The frames which correspond to the same sky location in each of the five filters are grouped together for processing as a field. Frames from the astrometric CCDs are not saved, but rather stars from them are detected and measured in real time to provide feedback on telescope tracking and focus. These measurements are also written to magnetic tape. This same analysis is done for the photometric CCDs, and we save these results along with the actual frames. Each night, a special bias run is taken to monitor the bias levels on CCD amplifiers.
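The frame bookkeeping described above (1361 new rows per frame, plus 128 rows borrowed from the next frame) can be sketched as follows. Row and frame indices here are zero-based and hypothetical; the real pipelines use their own numbering.

```python
# Sketch of how a continuous drift-scan data stream maps onto
# overlapping frames of 1361 + 128 = 1489 rows.

FRAME_ROWS = 1361     # new rows contributed by each frame
OVERLAP_ROWS = 128    # rows borrowed from the start of the next frame


def frame_row_range(k):
    """Half-open range [start, stop) of stream rows in processed frame k."""
    start = k * FRAME_ROWS
    return start, start + FRAME_ROWS + OVERLAP_ROWS


def frames_containing(row):
    """Indices of the processed frames that include a given stream row."""
    last = row // FRAME_ROWS
    hits = [last]
    # A row in the first 128 rows of frame `last` also appears at the
    # end of the previous processed frame.
    if last > 0 and row - (last - 1) * FRAME_ROWS < FRAME_ROWS + OVERLAP_ROWS:
        hits.insert(0, last - 1)
    return hits


if __name__ == "__main__":
    print(frame_row_range(0), frame_row_range(1))
    print(frames_containing(1400))
```

An object whose center falls in the overlap region is therefore measured twice, which is why one detection has to be marked as primary when the databases are loaded.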
Data from the spectrographs are read from the four CCDs (one red channel and one blue channel in each of the two spectrographs) after each exposure. A complete set of exposures includes bias, flat, arc, smear, and science exposures taken through the fibers, as well as a uniformly illuminated flat to take out pixel-to-pixel variations.
Data from the photometric telescope include bias frames, dome and twilight flats for each filter, measurements of primary standards in each filter, and measurements of our secondary calibration patches in each filter.
All of these systems are supported by a common set of observers programs, with observer interfaces customized for each system to optimize our observing efficiency.
Data from APO are transferred to Fermilab for processing and calibration. Three flavors of data are produced at APO: data from the imaging camera, data from the Photometric Telescope, and spectra from the spectrographs. Imaging data are processed with the imaging pipelines: the astrometric pipeline (astrom; § 4.2) performs the astrometric calibration; the postage stamp pipeline (psp; § 4.3) characterizes the behavior of the point spread function as a function of time and location in the focal plane; the frames pipeline (frames; § 4.4) finds, deblends, and measures the properties of objects; and the final calibration pipeline (nfcalib; § 4.5.3) applies the photometric calibration to the objects. This calibration uses the results of the photometric telescope data processed with the monitor telescope pipeline (mtpipe; § 4.5.2). The combination of the psp and frames pipelines is sometimes referred to as photo.
Individual imaging runs which interleave are prepared for spectroscopy with the following steps: resolve (§ 4.7) selects a primary detection for objects which fall in an overlap area; the target selection pipeline (target; § 4.8) selects objects for spectroscopic observation; and the plate pipeline (plate; § 4.9) specifies the locations of the plates on the sky, and the location of holes to be drilled in each plate.
Spectroscopic data are first extracted and calibrated with the 2d pipeline (spectro2d; § 4.10.1), and then classified and measured with the 1d pipeline (spectro1d; § 4.10.2).
The EDR was prepared using the versions of pipelines indicated in Table . The data for the EDR were reduced with a consistent set of pipeline versions, with only minor version changes to address operational issues.
We continue to develop these pipelines, and in what follows, we describe known problems and future developments where relevant. We also continue to improve photometric calibration techniques (§ 4.5).
A separate great circle coordinate system is defined for each stripe (§ 3.2.2). In these systems, the stripe center is the equivalent of the equator in the equatorial (α, δ) system. Pixel coordinates are corrected for empirically derived optical distortion terms, and the resulting mapping from corrected CCD row and column pixel positions to great circle longitude and latitude is linear to very good approximation. Astrometric solutions are carried out in this coordinate system. One of two reduction strategies is employed depending upon the coverage of astrometric catalogs:
For each r frame, these mappings result in an affine transformation relating corrected pixel positions to celestial coordinates. A secondary catalog is produced from the detections on the r CCDs. This secondary catalog is then matched to centroid positions on the i, u, z, and g CCDs to derive affine transformations in those filters. The transformation also includes terms to correct for differential chromatic refraction and those terms are applied when the colors of objects are known (Table ). Positions of detected objects given in this EDR have had this correction applied.
The relative astrometry between the r and the i, u, z, and g CCDs is independent of the astrometric catalog used, and typically has rms errors of 20 to 30 mas per coordinate, and systematic errors of order 20 mas. The quality of the absolute astrometry (based on the r astrometric solutions and centroids) is dependent on the astrometric catalog used, and is dominated by systematic errors which vary on timescales of minutes. Within a given run, the distribution of systematic errors is well characterized by a Gaussian. Reductions against UCAC have rms systematic errors of order 50 mas per coordinate. Reductions against Tycho-2 have rms systematic errors of order 100 mas, and show additional systematic errors constant over entire scans of up to 50 mas. Centroiding errors contribute an additional random source of error, of order 20 mas, for objects brighter than r* = 20. Comparison with the astrometry of the Two-Micron All Sky Survey for stars in common shows systematic offsets under 50 mas, well within our quoted errors and the expected systematic astrometric calibration effects quoted by the 2MASS team.
Atmospheric conditions contribute significantly to image wander. These affect the Tycho-2 reductions more than the UCAC reductions due to the shorter integration times on the astrometric CCDs. We attempt to follow this wander by fitting the residuals with cubic splines. Figure 4.2.1 shows the astrometric residuals as a function of frame number for a typical run. The spline-fitted solution is superposed on the points showing the residuals. The top plot shows μ residuals (along the direction of the scan), and the bottom plot shows ν residuals (the cross-scan direction). Approximately 100 frames per hour are obtained, so the figure shows roughly half an hour of scanning. Note that the residuals wander several tenths of an arcsecond over time scales of minutes. The frequency and amplitude of these wanderings vary from night to night (and, occasionally, hour to hour).
Present WCS proposed standards [Calabretta & Greisen, 2001] do not fully support a rigorous transformation from great circle to equatorial coordinates. As a result, the WCS representation does not reflect the full accuracy of the astrometric solution, but the accuracy is better than one pixel (about 0.4″) within a frame.
The conversion from (row,col) measured in a field to (α, δ) (J2000 degrees) is
|
As mentioned above, the data stream from each CCD is divided into an overlapping series of 10′×13.5′ frames, for ease of processing; the frames pipeline (§ 4.4) will process these separately. However, in order to ensure continuity along the data stream, certain quantities need to be determined on timescales up to the length of the imaging run. The astrometric and photometric calibrations certainly fall into that category; in addition, the Postage Stamp Pipeline (psp) calculates a global sky for a field, flatfield vector, bias level, and the point-spread function (PSF).
Even in the absence of atmospheric inhomogeneities, the SDSS telescope delivers images whose FWHMs vary by up to 15% from one side of a CCD to the other; the worst effects are seen in the chips furthest from the optical axis. Moreover, since the atmospheric seeing varies with time, the delivered image quality is a complex two-dimensional function even on the scale of a single frame. An example of the instantaneous image quality across the imaging camera is shown in Figure 4.3, where each rectangle represents one chip10.
The description of the point spread function (PSF) is critical for accurate PSF photometry, for star/galaxy separation, and for unbiased measures of the shapes of non-stellar objects; we need to map the full variation of the PSF even on scales of a single frame. The SDSS imaging PSF is modeled heuristically in each band using a Karhunen-Loève (KL) transform [Lupton et al., 2001c]. In particular, using stars brighter than roughly 20th magnitude, we expand the PSF from a series of five frames into eigen-images, and keep the first three terms. We fit the variation of the coefficients multiplying these terms to second order in position across the chip, using data from the frame in question, plus the immediately preceding and following half-frames.
The success of this KL expansion is gauged by comparing PSF photometry based on the modeled KL PSFs to large aperture photometry for the same (bright) stars. The width of the distribution of these differences is typically 1% or less, which is thus an upper limit on the accuracy of the PSF photometry (not including calibration problems; see § 4.5). Without accounting for the spatial variation of the PSF across the image, the photometric errors would be as high as 15%. We have recently found a subtle dependence of the PSF width on stellar color in the g band; this affects PSF photometry at the < 2% level, and will be addressed in future data releases.
Parameters that characterize one frame of imaging data are stored in the class Field (Table ). The status parameter flag for each frame indicates the success of the KL decomposition; its possible values are given in Table . In particular, if the data do not support the fitting of a second-order term to the variation of the coefficients with position, a linear fit is carried out, and status is set to 1. If even this is not warranted by the data, the coefficients are set to be constants, and the status flag is set to 2. Finally, if no PSF stars are available at all, the PSF model is set to that of the previous frame, and status = 3. A more quantitative measure of the accuracy of the PSF fit on a given frame is given by the scatter in the difference between PSF magnitudes and aperture magnitudes, as reported in psfApCorrectionErr. Note that the actual KL values and the eigenshapes are not reported in the tables, so the shape of the PSF as a function of position within a CCD cannot be reconstructed based on the parameters included in this EDR.
The psp returns various measures of the PSF shape in addition to the KL decomposition, including parameters of the best-fit double-Gaussian, evaluated at the center of each frame. These are the psf2GSigma parameters in the Field class. The psfWidth is the effective width, also determined at the center of each frame. It is a good generic number to quote for the seeing on each frame. Figure 4.3 shows the psfWidth in r for each CCD column in each run of the EDR imaging data. Improvements in telescope collimation and thermal environment since these data were taken have given rise to substantially better seeing.
The psp calculates a PSF aperture correction for each field. We did not fully test this correction and did not properly apply it to the EDR. The quantity apCorrRunErr is the median value of the difference between psfMag and the aperture magnitude measured with a 7.43″ radius aperture over the bright stars in the frame, and is a measure of the limitations of our KL decomposition of the PSF. This quantity is Gaussian-distributed with σ ≈ 0.03 mag, but apCorrRunErr can be as large as 0.1 mag in regions where the PSF is changing rapidly (FWHM changing by > 10% on a single frame); adding it directly to the psfMag in the EDR improves the overall PSF photometry accordingly. This correction will be applied to the data correctly in future SDSS data releases. apCorrRunErr is shown for each run in Figure 4.3, and we provide a table of these corrections on our web site.
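Applying the per-field correction by hand is straightforward. The sketch below follows the prescription in the text (add apCorrRunErr to psfMag); the record layout, with dictionary keys `field` and `psfMag` and a lookup table keyed by field number, is a hypothetical stand-in for whatever catalog format is actually in use.

```python
# Sketch: apply the per-field aperture correction to EDR PSF magnitudes.
# The sign convention follows the text (the correction is added).
# The dictionary layout below is an illustrative assumption.


def apply_aperture_correction(objects, ap_corr_by_field):
    """Return copies of the objects with psfMag corrected per field."""
    corrected = []
    for obj in objects:
        # Fields missing from the table are left uncorrected.
        corr = ap_corr_by_field.get(obj["field"], 0.0)
        corrected.append({**obj, "psfMag": obj["psfMag"] + corr})
    return corrected


if __name__ == "__main__":
    stars = [{"field": 11, "psfMag": 18.00}, {"field": 12, "psfMag": 19.50}]
    table = {11: 0.03, 12: -0.01}   # hypothetical apCorrRunErr values
    for star in apply_aperture_correction(stars, table):
        print(star)
```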
The frames pipeline (frames) detects, deblends, and measures objects, carrying out this processing on a field-by-field basis. This section describes what one needs to know to interpret the quantities we calculate in frames, rather than the technical details, which will be covered in Lupton et al. [2001b]. These quantities are stored in the PhotoObj and Profile classes (Table ). Mask bits set in objFlags of PhotoObj are for the object detections for all bands combined. Mask bits set in flags of PhotoObj are for the detection in each band. The mask bit values are given in Table . We refer to these classes and masks throughout this section. Further products of the imaging pipelines are given in § 2.2.
Each object detected during the frames analysis of a particular set of data is given a unique identifier which consists of five integers:
Several different versions of the pipelines were run on the imaging data for the EDR (Table ). The rerun number for each run distinguishes a set of pipeline versions. Two rerun numbers are important for the EDR. The first rerun was used to select targets for spectroscopic observation. The second rerun used the most recent versions of the pipelines, and the results of this processing are distributed. Between reruns of frames, the run, camCol and fieldID of detected objects do not change, because frames acts on individual frames. However, the objID does change. Future data releases will most likely be made with yet another rerun which uses the most current version of our pipelines and calibrations.
In addition, data quality flags are set for each field, after we evaluate the processing from an entire run. These are discussed below in section 4.6.
The pipeline analyzes the data one field (§ 3.5) at a time. Because information about each object is contained in five separate frames, one for each filter, the five frames for each field are processed together.
Each frame has instrumental signatures (flat field, bias, cosmic rays, and bad columns) removed, and the global sky value from the psp subtracted. The CCDs are known to be nonlinear at the level of about 2% near saturation; this is not corrected for in the current version of the pipeline. Cosmic rays are found as objects with gradients between adjacent pixels substantially steeper than allowed by the PSF and are interpolated over. Note that our images are marginally well-sampled in 1″ seeing. Previously recognized bad columns are interpolated over using linear prediction (e.g., Rybicki & Press 1992), as are bleed trails from saturated stars.
Objects containing a (found and interpolated-over) cosmic ray are flagged by having the mask bit CR set in flags for that band; objects with any interpolated pixels in them at all (due to bad columns or bleed trails) are flagged by having INTERP set. INTERP_CENTER indicates that a pixel was interpolated over within 3 pixels of the center of the object, and PSF_FLUX_INTERP indicates that at least 20% of the PSF flux is interpolated; in rare cases, photometry of objects with these latter two flags set might be suspect.
Next, objects in the frame are detected and their properties measured in a four-step process in each band. First, an object finder is run to detect bright objects. In each band, the object-finder detects pixels which are more than 200σ (corresponding roughly to r* = 17.5) above the sky noise; only a single pixel need be over this threshold for an object to be detected at this stage. These objects are flagged as BRIGHT. The extended power-law wings of BRIGHT objects which are saturated are subtracted from the frame. Such stars are marked SUBTRACTED. Second, the sky level is estimated by median-smoothing the frame image on a scale of approximately 100 arcsec; the resulting ``local'' sky image is subtracted from the frame (remember that a global sky determined on an entire frame has already been subtracted). This sky level is stored for each object for each band in the parameter sky (and an associated uncertainty skyErr), in units of asinh magnitudes (§ 4.4.5) per square arcsec. For children of blends (§ 4.4.3), the sky parameter includes the contribution of its siblings.
Third, we find objects by smoothing the image with a Gaussian fit to the PSF, and look for 5σ peaks over the (smoothed) sky in each band. After objects are detected, they are ``grown'' more-or-less isotropically by an amount approximately equal to the radius of the seeing disk. We then define an object to be a connected set of pixels which are detected in at least one band. Note that all pixels in the object are subsequently used in the analysis in every band, whether or not they were originally detected in that band. The photometric pipeline never reports an upper limit for the detection of an object, but rather carries out a proper measurement, with its error, for each of the varieties of fluxes listed in § 4.4.5.
Objects detected in a given band at this stage are flagged by setting the mask bit BINNED1 (Table ) in flags of the PhotoObj class in that band. All pixel values in these BINNED1 objects are then replaced by the background level (with sky noise added in), the frame is rebinned into a 2×2 image, and the object-finder is run again. The resulting sample is flagged in a similar way with the BINNED2 mask, and pixel values in these objects are replaced with the background level. Finally, we rebin the original pixel data at 4×4, and objects found at this stage are flagged BINNED4. The set of detected objects then consists of all objects with pixels flagged BINNED1, BINNED2, or BINNED4.
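The idea behind the BINNED1, BINNED2 and BINNED4 passes is that rebinning trades resolution for sensitivity to faint extended light. The toy sketch below illustrates only that point: it sums pixels in blocks and assumes uncorrelated noise of known σ per pixel. It omits the PSF smoothing and the replacement of detected objects by background that the pipeline performs between passes, and all names are hypothetical.

```python
# Toy illustration of detection on rebinned images.
# Assumptions: pixels are summed when rebinned, and the noise is
# uncorrelated with the same sigma in every original pixel.


def rebin(image, factor):
    """Sum an image (list of rows) in factor x factor blocks."""
    rows = len(image) // factor
    cols = len(image[0]) // factor
    return [[sum(image[r * factor + dr][c * factor + dc]
                 for dr in range(factor) for dc in range(factor))
             for c in range(cols)]
            for r in range(rows)]


def detect_binned(image, factor, sigma, nsigma=5.0):
    """Return (row, col) of rebinned pixels above nsigma times the noise."""
    binned = rebin(image, factor)
    # Summing factor**2 independent pixels scales the noise by `factor`.
    noise = sigma * factor
    return [(r, c) for r, row in enumerate(binned)
            for c, value in enumerate(row) if value > nsigma * noise]


if __name__ == "__main__":
    # A uniform 3-sigma glow: invisible per pixel, detected when binned.
    faint = [[3.0] * 4 for _ in range(4)]
    print(detect_binned(faint, 1, sigma=1.0))
    print(detect_binned(faint, 2, sigma=1.0))
```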
Fourth, the pipeline measures the properties of each object, including the position, as well as several measures of flux and shape, described more fully below. It attempts to determine whether each object actually consists of more than one object projected on the sky, and if so, to deblend such a ``parent'' object into its constituent ``children'', self-consistently across the bands (thus all children have measurements in all bands). Then it again measures the properties of these individual children. Bright objects are measured twice: once with a global sky, and no deblending run; this detection is flagged BRIGHT, and a second time with a local sky. For most purposes, only the latter is useful, and thus one should reject all objects flagged BRIGHT in compiling a sample of objects for study.
Other flag bits listed in Table that are useful at this stage are:
Finally, the pipeline outputs the measured quantities for each object, including all of the BRIGHT objects, all the parents, and all the children. In the following sections, we discuss how to interpret these outputs.
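Since the output includes the BRIGHT duplicates, a science sample is usually built by testing flag bits. The sketch below shows the pattern; the bit positions are placeholders chosen for illustration (the real values are those listed in the flag table), and the `objFlags` dictionary key is a hypothetical record layout.

```python
# Sketch: reject BRIGHT detections using a bit mask.
# The bit values below are placeholders, NOT the values from the table.

BRIGHT = 1 << 1    # hypothetical bit position
BLENDED = 1 << 3   # hypothetical bit position
CHILD = 1 << 4     # hypothetical bit position


def is_set(flags, bit):
    """True if the given mask bit is set in an integer flag word."""
    return (flags & bit) != 0


def drop_bright(objects):
    """Keep only objects whose combined-band flags lack BRIGHT."""
    return [obj for obj in objects if not is_set(obj["objFlags"], BRIGHT)]


if __name__ == "__main__":
    catalog = [
        {"id": 1, "objFlags": BRIGHT},
        {"id": 2, "objFlags": BLENDED},
        {"id": 3, "objFlags": CHILD},
        {"id": 4, "objFlags": 0},
    ]
    print([obj["id"] for obj in drop_bright(catalog)])
```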
A typical frame at high Galactic latitudes contains of order 1000 objects, including of order 5 objects chosen deliberately in regions where no objects are detected; these are used to place sky fibers for spectroscopy, and are classified as Sky. Repeat observations of a given area of sky in roughly 1.5″ seeing show that our 95% completeness limit for stars is u*=22.0, g*=22.2, r*=22.2, i* = 21.3, and z* = 20.5; the completeness drops to zero over roughly half a magnitude. These numbers are somewhat worse than quoted in the SDSS Project Book [York et al., 2000], as that calculation assumed 1″ seeing and slightly broader filters.
Once objects are detected, they are deblended by identifying individual peaks within each object, merging the list of peaks across bands, and adaptively determining the profile of images associated with each peak, which sum to form the original image in each band. The originally detected object is referred to as the ``parent'' object and has the flag BLENDED set if multiple peaks are detected; the final set of sub-images of which the parent consists are referred to as the ``children'' and have the flag CHILD set. Note that all quantities in class PhotoObj (Table ) are measured for both parent and child. For each child, parent gives the id of the parent (for parents themselves or isolated objects, this is set to the id of the BRIGHT counterpart if that exists; otherwise it is set to -1); for each parent, nchild gives the number of children an object has. Children are assigned the id numbers immediately after the id number of the parent. Thus, if object with id 23 is set as BLENDED and has nchild equal to 2, objects 24 and 25 will be set as CHILD and have parent equal to 23.
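The id numbering rule in the example above (children follow their parent consecutively) can be expressed directly. The helper names are hypothetical, and the parent lookup assumes the catalog is available as a mapping from id to nchild.

```python
# Sketch of the parent/child id convention: the children of a BLENDED
# parent occupy the id numbers immediately after the parent's id.


def children_ids(parent_id, nchild):
    """Return the ids of the children of a parent with nchild children."""
    return list(range(parent_id + 1, parent_id + 1 + nchild))


def parent_of(child_id, nchild_by_id):
    """Find the parent whose child range contains child_id, else None."""
    for parent_id, nchild in nchild_by_id.items():
        if child_id in children_ids(parent_id, nchild):
            return parent_id
    return None


if __name__ == "__main__":
    print(children_ids(23, 2))
    print(parent_of(25, {23: 2, 30: 3}))
```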
The list of peaks in the parent is trimmed to combine peaks (from different bands) that are too close to each other (if this happens, the flag PEAKS_TOO_CLOSE is set in the parent). If there are more than 25 peaks, only the most significant are kept, and the flag DEBLEND_TOO_MANY_PEAKS is set in the parent.
In a number of situations, the deblender decides not to process a BLENDED object; in this case the object is flagged as NODEBLEND. Most objects with EDGE set are not deblended. The exceptions are when the object is big enough (larger than roughly an arcminute) that it will most likely not be completely included in the adjacent scanline either; in this case, DEBLENDED_AT_EDGE is set, and the deblender gives it its best shot. When an object is larger than half a frame, the deblender also gives up, and the object is flagged as TOO_LARGE. Other intricacies of the deblending results are recorded in flags listed in Table ; see Lupton et al. [2001b] for a complete description.
On average, about 15-20% of all detected objects are blended, and many of these are superpositions of galaxies which the deblender successfully treats by separating the images of the nearby objects. Thus, usually it is the childless (not BLENDED) objects which are of most interest for science applications. However, the versions of the pipelines used for the EDR occasionally deblend complex galaxies with large angular size, such as nearby face-on spiral galaxies, in a way which the human eye would tend not to. Thus, some care is required in the analysis of samples of bright and/or large objects in the survey. Later versions of the deblender handle these cases much more gracefully, and future data releases will incorporate these improvements.
Main-belt asteroids have a proper motion of several arcsec in the roughly 5 minutes it takes for an object to cross the imaging camera. This means that they will have a different centroid in the different photometric bands. If not taken into account, this could mean that they would be deblended into separate objects of unusual color, playing havoc with the target selection algorithms (§ 4.8). Thus the deblender checks every object for consistency with uniform proper motion between the filters. In the PhotoObj class (Table ) the quantities colv and rowv (and their associated errors), give the resulting proper motion (along the columns and rows of the CCDs respectively) in units of degrees per day. The PSF photometry in each band is done on the object center, taking the motion into account, and therefore is properly measured [Ivezić et al., 2001].
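To connect the quoted motion (several arcsec in about 5 minutes) with the catalog units (degrees per day), a short conversion helps. The numbers in the example call are illustrative, and the function names are hypothetical.

```python
# Sketch: convert an apparent shift across the camera into the
# degrees-per-day units used for colv and rowv, and back again.

SECONDS_PER_DAY = 86400.0
ARCSEC_PER_DEG = 3600.0


def motion_deg_per_day(shift_arcsec, elapsed_s):
    """Angular rate in degrees/day for a shift seen over elapsed_s."""
    return (shift_arcsec / ARCSEC_PER_DEG) / (elapsed_s / SECONDS_PER_DAY)


def shift_arcsec(rate_deg_per_day, elapsed_s):
    """Centroid shift in arcsec accumulated over elapsed_s seconds."""
    return rate_deg_per_day * ARCSEC_PER_DEG * elapsed_s / SECONDS_PER_DAY


if __name__ == "__main__":
    # Illustrative case: 3 arcsec over a 5-minute camera crossing.
    print(motion_deg_per_day(3.0, 300.0))
```

A shift of 3 arcsec in 5 minutes corresponds to 0.24 degrees per day, so main-belt motions appear in colv and rowv as values of a few tenths.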
Flag bit values listed in Table describe this processing. The MOVED flag indicates that the deblender considered whether to deblend the object as moving; it is not very useful. If the deblender actually deblended the object as moving, the flag DEBLENDED_AS_MOVING is set; otherwise the flag NODEBLEND_MOVING is set. Note that an object can have a statistically significant motion without being deblended as such if the motion is small enough that the photometry would be fine without taking it into account. An object whose motion is not statistically significant is flagged STATIONARY, while an object whose motion is inconsistent with a straight line is flagged BAD_MOVING_FIT.
We have discussed how frames detects, deblends, and designates objects. This section and the next discuss the measurements applied to each resulting object. Each of the quantities described here has an associated estimated error measured as well, unless otherwise mentioned. In this subsection, we discuss the various measurements made of the flux in each object.
We begin by describing the magnitude scale which the SDSS uses. Unless otherwise specified (the most important exceptions being petroMag and modelMag, to get self-consistent colors), the measures discussed here are applied independently in each band pass. Magnitudes within the SDSS are expressed as inverse hyperbolic sine (or ``asinh'') magnitudes, described in detail by Lupton, Gunn, & Szalay (1999). The transformation from linear flux measurements to asinh magnitudes is designed to be virtually identical to the standard astronomical magnitude [Pogson, 1856] at high signal-to-noise ratio, but to behave reasonably at low signal-to-noise ratio and even at negative values of flux, where the logarithm in the Pogson magnitude fails. This allows us to measure a flux even in the absence of a formal detection; we quote no upper limits in our photometry.
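The asinh magnitudes are characterized by a softening parameter b, the typical 1σ noise of the sky in a PSF aperture in 1″ seeing. The relation between detected flux f and asinh magnitude m is (see equation (3) of Lupton, Gunn, & Szalay 1999):
$$ m = -\frac{2.5}{\ln 10}\left[\mathrm{asinh}\!\left(\frac{f/f_0}{2b}\right) + \ln b\right], \tag{7} $$
where f_0 is the flux of an object with a conventional magnitude of zero, and b is expressed in units of f_0.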
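The behavior of the asinh magnitude at high, zero and negative flux is easy to check numerically. The sketch below implements the Lupton, Gunn, & Szalay (1999) form, with flux expressed as a ratio to the zero-magnitude flux; the softening value used in the example is an arbitrary illustration, not one of the survey's per-band values.

```python
import math

POGSON = 2.5 / math.log(10.0)   # converts natural log to magnitudes


def asinh_mag(flux_ratio, b):
    """Asinh magnitude for flux_ratio = f/f0 and softening b (in f0)."""
    return -POGSON * (math.asinh(flux_ratio / (2.0 * b)) + math.log(b))


def pogson_mag(flux_ratio):
    """Conventional magnitude; undefined for flux_ratio <= 0."""
    return -2.5 * math.log10(flux_ratio)


if __name__ == "__main__":
    b = 1.0e-10   # illustrative softening, not a survey value
    for ratio in (1.0e-6, 1.0e-8, 0.0, -1.0e-10):
        print(ratio, asinh_mag(ratio, b))
```

At zero flux the result is simply -2.5 log10(b), so the softening parameter sets the magnitude reported for an object with no measured signal.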
For isolated stars, which are well-described by the PSF, the optimal measure of the total flux is determined by fitting a PSF model to the object. In practice, we do this by sinc-shifting the image of a star so that it is exactly centered on a pixel, and then fitting a Gaussian model of the PSF to it. This fit is carried out on the local PSF KL model (§ 4.3) at each position as well; the difference between the two is then a local aperture correction, which gives a corrected PSF magnitude. Finally, we use bright stars to determine a further aperture correction to a radius of 7.4″ as a function of seeing, and apply this to each frame for its seeing. This involved procedure is necessary to take into account the full variation of the PSF (measured in the psp described above) across the field, including the low signal-to-noise ratio wings. Empirically, this reduces the seeing-dependence of the photometry to below 0.02 mag for seeing as poor as 2″. The resulting magnitude is stored in the quantity psfMag. As mentioned above, the flag PSF_FLUX_INTERP warns that the PSF photometry might be suspect. The flag BAD_COUNTS_ERROR warns that because of interpolated pixels, the error may be under-estimated.
The PSF errors include contributions from photon statistics and uncertainties in the PSF model and aperture correction, although they do not include uncertainties in photometric calibration (§ 4.5). Repeat observations show that these errors are probably underestimated by 10-20%.
The flux contained within the aperture of a spectroscopic fiber (3 arcsec in diameter) is calculated in each band and stored in fiberMag. Note that no correction for seeing is applied to this measure of the magnitude. For children of deblended galaxies, some of the pixels within a 1.5 arcsec radius may belong to other children. In this case, the fiber magnitudes can be rather misleading, as they will not reflect the amount of light which the spectrograph will see. For future data releases, we will calculate the true flux within a fiber diameter, including all light from the parent that falls in the aperture centered at the location of the child. We will also correct the detected flux to a fiducial value of the seeing.
For galaxy photometry, measuring flux is more difficult than for stars, because galaxies do not all have the same radial surface brightness profile, and have no sharp edges. In order to avoid biases, we wish to measure a constant fraction of the total light, independent of the position and distance of the object. Please refer to the discussion in Strauss et al. [2001]. To satisfy these requirements, the SDSS has adopted a modified form of the Petrosian [1976] system, measuring galaxy fluxes within a circular aperture whose radius is defined by the shape of the azimuthally averaged light profile.
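We define the ``Petrosian ratio'' $R_P$ at a radius $r$ from the center of an object to be the ratio of the local surface brightness in an annulus at $r$ to the mean surface brightness within $r$, as described by Blanton et al. [2001a], Yasuda et al. [2001], and Strauss et al. [2001]:
$$ R_P(r) \equiv \frac{\int_{0.8r}^{1.25r} dr'\, 2\pi r'\, I(r') \,/\, \left[\pi\left(1.25^2 - 0.8^2\right) r^2\right]}{\int_0^{r} dr'\, 2\pi r'\, I(r') \,/\, \left[\pi r^2\right]} \tag{8} $$
where $I(r)$ is the azimuthally averaged surface brightness profile.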
The Petrosian radius $r_P$ is defined as the radius at which $R_P(r_P)$ equals some specified value $R_{P,\mathrm{lim}}$, set to 0.2 in our case. The Petrosian flux in any band is then defined as the flux within a certain number $N_P$ (equal to 2.0 in our case) of r-band Petrosian radii:
$$ F_P \equiv \int_0^{N_P r_P} 2\pi r'\, dr'\, I(r') \tag{9} $$
The aperture $2 r_P$ is large enough to contain nearly all of the flux for typical galaxy profiles, but small enough that the sky noise in $F_P$ is small. Thus, even substantial errors in $r_P$ cause only small errors in the Petrosian flux (typical statistical errors near the spectroscopic flux limit of r* ≈ 17.7 are < 5%), although these errors are correlated.
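The Petrosian quantities can be illustrated numerically. The sketch below is not the frames implementation: it assumes a finely sampled, noise-free profile I(r), takes the Petrosian ratio to be the mean surface brightness in the annulus 0.8r to 1.25r divided by the mean surface brightness within r, and uses simple trapezoid integration and linear interpolation. The function name `petrosian` and its arguments are hypothetical.

```python
import numpy as np

def petrosian(r, intensity, ratio_lim=0.2, n_p=2.0):
    """Return (r_P, F_P) for a sampled azimuthally averaged profile I(r).

    r must start at 0 and increase. Returns None if the Petrosian ratio
    never drops through ratio_lim on the sampled range.
    """
    r = np.asarray(r, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    # Cumulative flux F(<r) = integral of 2 pi r' I(r') dr' (trapezoid rule).
    integrand = 2.0 * np.pi * r * intensity
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)
    cum = np.concatenate(([0.0], np.cumsum(steps)))

    def flux_within(x):
        return np.interp(x, r, cum)

    # Evaluate the ratio only where the annulus 0.8r..1.25r lies on the grid.
    rv = r[(r > 0) & (1.25 * r <= r[-1])]
    annulus = (flux_within(1.25 * rv) - flux_within(0.8 * rv)) / (
        np.pi * (1.25**2 - 0.8**2) * rv**2)
    mean_inside = flux_within(rv) / (np.pi * rv**2)
    ratio = annulus / mean_inside

    # Downward crossings of the limit; keep the outermost one.
    crossings = np.nonzero((ratio[:-1] > ratio_lim) & (ratio[1:] <= ratio_lim))[0]
    if crossings.size == 0:
        return None
    i = crossings[-1]
    t = (ratio[i] - ratio_lim) / (ratio[i] - ratio[i + 1])
    r_p = rv[i] + t * (rv[i + 1] - rv[i])
    return r_p, float(flux_within(n_p * r_p))

if __name__ == "__main__":
    radii = np.linspace(0.0, 20.0, 4001)      # in units of the scale length
    print(petrosian(radii, np.exp(-radii)))   # pure exponential profile
```

For a pure exponential profile with unit scale length, this sketch gives a Petrosian radius of roughly 3.5 scale lengths and a Petrosian flux of about 99% of the total.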
The Petrosian radius in each band is the parameter petroRad, and the Petrosian magnitude in each band (calculated, remember, using only petroRad for the r band) is the parameter petroMag.
In practice, there are a number of complications associated with this definition, because noise, substructure, and the finite size of objects can cause objects to have no Petrosian radius, or more than one. Those with more than one are flagged as MANYPETRO; the largest one is used. Those with none have NOPETRO set. Most commonly, these objects are faint (r* > 20.5 or so); the Petrosian ratio becomes unmeasurable before dropping to the limiting value of 0.2; these have PETROFAINT set and have their ``Petrosian radii'' set to the default value of the larger of 3 arcsec or the outermost measured point in the radial profile. Finally, a galaxy with a bright stellar nucleus, such as a Seyfert galaxy, can have a Petrosian radius set by the nucleus alone; in this case, the Petrosian flux misses most of the extended light of the object. This happens quite rarely, but one dramatic example in the EDR data is the Seyfert galaxy NGC 7603 = Arp 092, at α(2000) = 23:18:56.6, δ(2000) = +00:14:38.
How well does the Petrosian magnitude perform as a reliable and complete measure of galaxy flux? Theoretically, the Petrosian magnitudes defined here should recover essentially all of the flux of an exponential galaxy profile and about 80% of the flux for a de Vaucouleurs profile. As shown by Blanton et al. [2001a], this fraction is fairly constant with axis ratio, while as galaxies become smaller (due to worse seeing or greater distance) the fraction of light recovered becomes closer to that fraction measured for a typical PSF, about 95% in the case of the SDSS. This implies that the fraction of flux measured for exponential profiles decreases while the fraction of flux measured for de Vaucouleurs profiles increases as a function of distance. However, for galaxies in the spectroscopic sample (r* < 17.7), these effects are small; the Petrosian radius measured by frames is extraordinarily constant in physical size as a function of redshift ().
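Just as the PSF magnitudes are optimal measures of the fluxes of stars, the optimal measure of the flux of a galaxy would use a matched galaxy model. With this in mind, the code fits two models to the two-dimensional image of each object in each band: a pure de Vaucouleurs profile,
$$ I(r) = I_0 \exp\left\{-7.67\left[(r/r_e)^{1/4}\right]\right\} \tag{10} $$
and a pure exponential profile,
$$ I(r) = I_0 \exp\left(-1.68\, r/r_e\right) \tag{11} $$
where $r_e$ is the effective radius of the model.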
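The numerical coefficients in the two model profiles can be checked with a short calculation. The sketch below assumes untruncated, circular profiles of the form I(r) = I0 exp[-k (r/r_e)^(1/n)], with k = 7.67, n = 4 for the de Vaucouleurs model and k = 1.68, n = 1 for the exponential model. For integer 2n the enclosed light fraction has a closed form, which is what the hypothetical helper `light_fraction` evaluates.

```python
import math

def light_fraction(r_over_re, k, n):
    """Fraction of total light within r for I(r) = I0 exp[-k (r/r_e)^(1/n)].

    Uses the closed form of the regularized incomplete gamma function
    P(2n, x) = 1 - exp(-x) * sum_{j < 2n} x^j / j!, valid for integer 2n.
    """
    x = k * r_over_re ** (1.0 / n)
    terms = sum(x**j / math.factorial(j) for j in range(2 * n))
    return 1.0 - math.exp(-x) * terms

if __name__ == "__main__":
    print("de Vaucouleurs, r = r_e:", light_fraction(1.0, 7.67, 4))
    print("exponential,    r = r_e:", light_fraction(1.0, 1.68, 1))
```

Under these assumptions both fractions come out very close to one half, so with these coefficients r_e is the half-light radius of each model.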
At bright magnitudes (r* < 18), the model magnitudes are a poor measure of the total flux of the galaxy, due to the fact that the fits are restricted to the central parts of objects [Strateva et al., 2001]. This issue will be addressed in future data releases.
These fitting procedures yield the quantities: r_deV and r_exp, the effective radii of the models; ab_deV and ab_exp, the axis ratio of the best fit models; and phi_deV and phi_exp, the position angles of the ellipticity (in degrees East of North). Note that these quantities correctly model the effects of the PSF. Also recorded are the likelihoods associated with each model from the χ² fit, deV_L and exp_L, and the total magnitudes associated with each fit, deVMag and expMag. Errors for each of these quantities (which are based only on photon statistics) are also reported. We apply aperture corrections to make these model magnitudes equal the PSF magnitudes in the case of an unresolved object.
In order to measure unbiased colors of galaxies, we measure their flux through equivalent apertures in all bands. We choose the model (exponential or de Vaucouleurs) of higher likelihood in the r filter, and apply that model (i.e., allowing only the amplitude to vary) in the other bands after convolving with the appropriate PSF in each band. The resulting magnitudes are termed modelMag. The resulting estimate of galaxy color will be unbiased in the absence of color gradients. Systematic differences from Petrosian colors are in fact often seen due to color gradients, in which case the concept of a global galaxy color is somewhat ambiguous. For faint galaxies, the model colors have appreciably higher signal-to-noise ratio than do the Petrosian colors.
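Allowing only the amplitude of a fixed model to vary reduces the fit in each band to a one-parameter linear least-squares problem. The sketch below illustrates that idea only. It assumes the r-band model has already been convolved with the PSF of the band in question and rendered as an array `template`, and that pixel noise is independent with standard deviation `sigma`. All names are hypothetical and this is not the frames code.

```python
import numpy as np

def choose_model(dev_l_r, exp_l_r):
    """Pick the model with the higher likelihood in the r band."""
    return "deV" if dev_l_r > exp_l_r else "exp"

def fit_amplitude(image, template, sigma):
    """Weighted least-squares amplitude A minimizing sum((image - A*template)^2 / sigma^2)."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    weights = 1.0 / np.square(np.asarray(sigma, dtype=float))
    return np.sum(weights * image * template) / np.sum(weights * template**2)

if __name__ == "__main__":
    template = np.array([[0.0, 1.0], [2.0, 3.0]])
    print(choose_model(0.7, 0.2), fit_amplitude(2.5 * template, template, np.ones((2, 2))))
```

Because the shape of the aperture is the same in every band, the ratio of fitted amplitudes between two bands gives a color measured through equivalent apertures.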
Due to the way in which model fits are carried out, there is some weak discretization of model parameters, especially r_exp and r_deV. Also note that very occasionally, the quoted axis ratios are given negative values. These two problems will be addressed in subsequent data releases.
Reddening corrections in magnitudes at the position of each object, reddening, are computed following Schlegel, Finkbeiner, & Davis [1998]. These corrections are not applied to the magnitudes in the databases. Conversions from E(B-V) to total extinction $A_\lambda$, assuming a z=0 elliptical galaxy spectral energy distribution, are tabulated in Table .
Faced with this array of different magnitude measurements to choose from, which one is appropriate in which circumstances? We cannot give any guarantees of what is appropriate for the science you want to do, but here we present some examples, where we use the general guideline that one usually wants to maximize some combination of signal-to-noise ratio, fraction of the total flux included, and freedom from systematic variations with observing conditions and distance:
Finally, we note that azimuthally-averaged radial profiles are provided, as described below, and can easily be used to create circular aperture magnitudes of any desired type. For instance, to study a large dynamic range of galaxy fluxes, one could measure alternative Petrosian magnitudes with parameters tuned such that the Petrosian flux includes a small fraction of the total flux, but yields higher signal-to-noise ratio measurements at faint magnitudes.
The frames pipeline also provides several characterizations of the shape and morphology of an object.
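The frames pipeline provides a simple star/galaxy separator in its type parameters (provided separately for each band) and its objc_type parameters (one value per object); these are set to 3 (galaxy) or 6 (star). In particular, Lupton et al. [2001a] show that the following simple cut works at the 95% confidence level for our data to r* = 21 and even somewhat fainter:
$$ \mathrm{psfMag} - m_{\mathrm{model}} > 0.145, \qquad m_{\mathrm{model}} = \begin{cases} \mathrm{deVMag} & \text{if } \mathrm{deV\_L} > \mathrm{exp\_L} \\ \mathrm{expMag} & \text{otherwise} \end{cases} \tag{12} $$
Objects that satisfy this cut are classified as galaxies; all others are classified as stars.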
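A cut of this kind is straightforward to apply to catalog quantities. The sketch below assumes the criterion is psfMag minus the magnitude of the higher-likelihood model (deVMag or expMag) exceeding 0.145 mag. The threshold and the function name are assumptions made for illustration, and the return values follow the type codes 3 (galaxy) and 6 (star) quoted above.

```python
GALAXY = 3  # type code for galaxies
STAR = 6    # type code for stars

def classify_band(psf_mag, dev_mag, exp_mag, dev_l, exp_l, threshold=0.145):
    """Classify one band from its PSF magnitude and model magnitudes.

    The model magnitude is taken from whichever model has the higher
    likelihood; the 0.145 mag threshold is an assumed default.
    """
    model_mag = dev_mag if dev_l > exp_l else exp_mag
    return GALAXY if psf_mag - model_mag > threshold else STAR

if __name__ == "__main__":
    print(classify_band(18.0, 17.5, 17.6, 0.8, 0.1))    # extended source
    print(classify_band(18.0, 17.95, 17.98, 0.1, 0.2))  # point-like source
```

An extended object has more flux in its model fit than in the PSF fit, so its model magnitude is brighter (numerically smaller) than its PSF magnitude and the difference is positive.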
Experimentation has shown that simple variants on this scheme, such as defining galaxies as those objects classified as such in any two of the three high signal-to-noise ratio bands (namely, g, r, and i), work better in some circumstances. This scheme occasionally fails to distinguish pairs of stars with separation small enough (< 2'') that the deblender does not split them; it also occasionally classifies Seyfert galaxies with particularly bright nuclei as stars.
Additional information may be used to refine the star-galaxy separation, depending on the scientific application. For example, Scranton et al. [2001] advocate applying a Bayesian prior to the above difference between the PSF and exponential magnitudes, depending on seeing and using prior knowledge about the counts of galaxies and stars with magnitude.
The frames pipeline extracts an azimuthally-averaged radial surface brightness profile. In the catalogs, it is given as the average surface brightness in a series of annuli (whose outer radii are listed in Table ). This quantity is in units of ``maggies'' per square arcsec, where a maggie is a linear measure of flux; one maggie has an AB magnitude of 0 (thus a surface brightness of 20 mag/square arcsec corresponds to $10^{-8}$ maggies per square arcsec). In the class Profile (Table ), the number of annuli for which there is a measurable signal is listed as nprof, the mean surface brightness is listed as profMean, and the error is listed as profErr. This error includes both photon noise, and the small-scale ``bumpiness'' in the counts as a function of azimuthal angle.
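The conversion between maggies and conventional magnitudes follows directly from the definition that one maggie corresponds to magnitude 0. A minimal sketch, with hypothetical function names:

```python
import math

def mag_to_maggies(mag):
    """Linear flux in maggies for a conventional magnitude."""
    return 10.0 ** (-0.4 * mag)

def maggies_to_mag(maggies):
    """Conventional magnitude for a positive flux in maggies."""
    return -2.5 * math.log10(maggies)

if __name__ == "__main__":
    print(mag_to_maggies(20.0))   # surface brightness of 20 mag per square arcsec
    print(maggies_to_mag(1.0))    # one maggie
```

The same functions apply to surface brightness, with maggies per square arcsec on one side and magnitudes per square arcsec on the other. The logarithmic form is only defined for positive flux.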
When converting the profMean values to a local surface brightness, it is not the best approach to assign the mean surface brightness to some radius within the annulus and then linearly interpolate between radial bins. Do not use smoothing splines, as they will not go through the points in the cumulative profile and thus (obviously) will not conserve flux. What frames does, e.g., in determining the Petrosian ratio, is to fit a taut spline to the cumulative profile and then differentiate that spline fit, after transforming both the radii and cumulative profiles with asinh functions. We recommend doing the same here.
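The first step of this recipe, building the cumulative profile from the annular means, can be sketched as follows. The annulus radii used here are placeholders rather than the tabulated survey values, the function names are hypothetical, and the taut-spline fit itself is not implemented. The sketch only produces the asinh-transformed points to which such a spline would be fit.

```python
import numpy as np

def cumulative_profile(outer_radii, prof_mean):
    """Cumulative flux (maggies) within each annulus outer radius (arcsec).

    prof_mean holds the mean surface brightness of each annulus in
    maggies per square arcsec; the first annulus is the central disk.
    """
    outer = np.asarray(outer_radii, dtype=float)
    inner = np.concatenate(([0.0], outer[:-1]))
    areas = np.pi * (outer**2 - inner**2)
    return np.cumsum(areas * np.asarray(prof_mean, dtype=float))

def asinh_points(outer_radii, cumulative):
    """Transform radii and cumulative flux with asinh before spline fitting."""
    return np.arcsinh(np.asarray(outer_radii, dtype=float)), np.arcsinh(cumulative)

if __name__ == "__main__":
    radii = [1.0, 2.0, 3.0]          # placeholder outer radii in arcsec
    cum = cumulative_profile(radii, [1.0, 1.0, 1.0])
    print(cum, asinh_points(radii, cum))
```

The cumulative values at the annulus boundaries are exact by construction, so an interpolant that passes through them conserves flux. They also serve directly as circular aperture fluxes at those radii.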
The frames pipeline also reports the radii containing 50% and 90% of the Petrosian flux for each band, petroR50 and petroR90 respectively. The usual characterization of surface-brightness in the target selection pipeline of the SDSS is the mean surface brightness within petroR50.
It turns out that the ratio of petroR50 to petroR90, the so-called ``inverse concentration index'', is correlated with morphology [Shimasaku et al., 2001,Strateva et al., 2001]. Galaxies with a de Vaucouleurs profile have an inverse concentration index of around 0.3; exponential galaxies have an inverse concentration index of around 0.43. Thus, this parameter can be used as a simple morphological classifier.
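A simple classifier along these lines can be sketched as below. Assigning an object to whichever reference value (0.3 or 0.43) lies closer to its inverse concentration index is an illustrative choice, not a calibrated survey cut, and the function names are hypothetical.

```python
def inverse_concentration(petro_r50, petro_r90):
    """Ratio of the radii containing 50% and 90% of the Petrosian flux."""
    return petro_r50 / petro_r90

def nearest_profile(index, dev_ref=0.3, exp_ref=0.43):
    """Label an object by the closer of the two reference index values."""
    return "deV" if abs(index - dev_ref) < abs(index - exp_ref) else "exp"

if __name__ == "__main__":
    print(nearest_profile(inverse_concentration(3.0, 10.0)))
    print(nearest_profile(inverse_concentration(4.3, 10.0)))
```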
An important caveat when using these quantities is that they are not corrected for seeing. This causes the surface brightness to be underestimated, and the inverse concentration index to be overestimated, for objects of size comparable to the PSF. The amplitudes of these effects, however, are not yet well characterized.
In addition to the model and PSF magnitudes described above, the likelihoods deV_L, exp_L, and star_L are also calculated by frames. These are the probabilities of achieving the measured χ² for the de Vaucouleurs, exponential, and PSF fits, respectively. If one wishes to make use of this trinary scheme to classify objects, calculation of the fractional likelihoods is recommended:
$$ f(\mathrm{deV\_L}) = \frac{\mathrm{deV\_L}}{\mathrm{deV\_L} + \mathrm{exp\_L} + \mathrm{star\_L}} \tag{13} $$
and similarly for f(exp_L) and f(star_L).
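The fractional likelihoods are each likelihood divided by the sum of the three. A minimal sketch with a hypothetical function name, returning None when all three likelihoods are zero so that no division by zero occurs:

```python
def fractional_likelihoods(dev_l, exp_l, star_l):
    """Return (f_deV, f_exp, f_star), each likelihood divided by their sum."""
    total = dev_l + exp_l + star_l
    if total <= 0.0:
        return None  # no model has a usable likelihood
    return dev_l / total, exp_l / total, star_l / total

if __name__ == "__main__":
    print(fractional_likelihoods(0.2, 0.1, 0.1))
```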
The model fits yield an estimate of the axis ratio and position angle of each object, but it is useful to have model-independent measures of ellipticity. In the data released here, frames provides two further measures of ellipticity, one based on second moments, the other based on the ellipticity of a particular isophote. The model fits do correctly account for the effect of the seeing, while the methods presented here do not.
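The first method measures flux-weighted second moments, defined as:
$$ M_{xx} \equiv \left\langle \frac{x^2}{r^2} \right\rangle, \qquad M_{yy} \equiv \left\langle \frac{y^2}{r^2} \right\rangle, \qquad M_{xy} \equiv \left\langle \frac{xy}{r^2} \right\rangle $$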
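Moments of this kind can be computed directly from a pixel image. The sketch below assumes the normalized form in which each moment is the flux-weighted average of x²/r², y²/r², or xy/r² about a known centroid, and makes no correction for seeing. The function name and the handling of the central pixel (which is skipped, since r = 0 there) are illustrative choices.

```python
import numpy as np

def second_moments(image, xc, yc):
    """Flux-weighted averages of x^2/r^2, y^2/r^2 and xy/r^2 about (xc, yc)."""
    image = np.asarray(image, dtype=float)
    yy, xx = np.indices(image.shape, dtype=float)
    x = xx - xc
    y = yy - yc
    r2 = x**2 + y**2
    keep = r2 > 0                      # skip the pixel at r = 0
    flux = image[keep]
    total = flux.sum()
    mxx = np.sum(flux * x[keep] ** 2 / r2[keep]) / total
    myy = np.sum(flux * y[keep] ** 2 / r2[keep]) / total
    mxy = np.sum(flux * x[keep] * y[keep] / r2[keep]) / total
    return mxx, myy, mxy

if __name__ == "__main__":
    yy, xx = np.indices((41, 41), dtype=float)
    blob = np.exp(-((xx - 20) ** 2 / (2 * 6.0**2) + (yy - 20) ** 2 / (2 * 2.0**2)))
    print(second_moments(blob, 20.0, 20.0))
```

With this normalization the first two moments sum to one, so a round object gives 0.5 for each, and an object elongated along x gives a first moment larger than the second.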
A second measure of ellipticity is given by measuring the ellipticity of the 25 magnitudes per square arcsecond isophote (in all bands). In detail, frames measures the radius of a particular isophote as a function of angle and Fourier expands this function. It then extracts from the coefficients the centroid, major and minor axis, position angle, and average radius of the isophote in question12. It also reports the derivative of each of these quantities with respect to isophote level, necessary to recompute these quantities if the photometric calibration changes13.
Throughout this section, we have listed various known problems with the processing, many of which will be fixed in future releases of the data. We now list problems that were not otherwise mentioned earlier.