Cross-match

Target selection in SDSS-V relies on a large number of parent catalogs, which form the pool from which we select the targets to be observed. Any object observed in SDSS-V’s FPS program belongs to one or more of these parent catalogs. Examples of parent catalogs include Gaia DR2/3, Legacy Survey DR8/10, and Pan-STARRS. With few exceptions (e.g., pre-public release eROSITA catalogs), all the catalogs from which we drew sources are publicly available and supported by a publication.

Given the range and scope of the parent catalogs, we expect them to have many sources in common. This leads to the need for a process of “cross-matching”, in which each physical object is assigned a single unique identifier regardless of how many parent catalogs contain it. The cross-matching process is described in detail in Sánchez-Gallego et al. (in prep.).

The cross-matching process

The cross-matching process begins with the ingestion of all the necessary parent catalogs into a dedicated schema in our internal PostgreSQL targeting database (catalogdb). The result of this process is a list of unique sources, each one assigned an identifier (catalogid), and the corresponding associations between each catalogid and the parent catalogs.

The cross-match adds input catalogs one by one to the full catalog, and for each new catalog we perform a cross-match in three phases. In phase 1, we link the new input catalog to the full catalog through external ids contained in the tables. In phase 2, for any remaining unmatched entries in the input catalog, we perform a spatial cross-match using coordinates and proper motions. In phase 3, we ingest the remaining entries that have not been cross-matched as new entries in the full catalog. The figure below shows how the cross-match phases work for each table that is ingested.

In phases 1 and 2, sources in the input catalog are matched to entries already present in the mos_catalog table and assigned the existing catalogid, and an entry is added to the mos_catalog_to_tablename join table. Phase 1 uses foreign key associations that may exist between the input catalog and previously ingested catalogs.

Phase 2 handles objects with no existing associations by means of a spatial cross-match. We perform a cone search for each new target against the list of already processed sources in mos_catalog, using a radius that matches the spatial resolution of the catalog being processed (1 arcsec in most cases, which matches our on-sky fiber size). As a result, multiple sources may be associated with the new target. In these cases the mos_catalog_to_tablename table contains entries for all the cone-search matches, and the closest match is marked with best=True.
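The cone search and the best=True flag can be illustrated with a minimal Python sketch. This is not the catalogdb implementation: the function names, the dictionary layout, the target_id and distance keys, and all coordinates are assumptions made for illustration, and the sketch ignores the proper-motion correction by assuming both positions refer to the same epoch.

```python
import math

ARCSEC_PER_DEG = 3600.0


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + math.cos(dec1) * math.cos(dec2) * sin_dra**2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * ARCSEC_PER_DEG


def cone_search(target, catalog, radius_arcsec=1.0):
    """Return one join-table row per catalog source inside the cone.

    The closest match, if any, is flagged with best=True.
    """
    rows = []
    for src in catalog:
        sep = angular_sep_arcsec(target["ra"], target["dec"], src["ra"], src["dec"])
        if sep <= radius_arcsec:
            rows.append({
                "catalogid": src["catalogid"],
                "target_id": target["target_id"],  # hypothetical key name
                "distance": sep,                   # hypothetical key name
                "best": False,
            })
    if rows:
        min(rows, key=lambda r: r["distance"])["best"] = True
    return rows


# Invented example: two existing sources lie within 1 arcsec of the new target.
catalog = [
    {"catalogid": 1, "ra": 10.0, "dec": 20.0},
    {"catalogid": 2, "ra": 10.0, "dec": 20.0 + 0.5 / 3600.0},
    {"catalogid": 3, "ra": 10.1, "dec": 20.0},
]
target = {"target_id": 501, "ra": 10.0, "dec": 20.0 + 0.1 / 3600.0}

matches = cone_search(target, catalog)
for row in matches:
    print(row)
```

In this example two rows are returned, for catalogid 1 and catalogid 2, and only the row for catalogid 1 (the closer source) carries best=True. A production cross-match would use a spatial index in the database rather than a loop over every source.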

Phase 3 handles objects with no existing associations and no successful spatial cross-match, adding them as new entries in mos_catalog. Each entry (from, for example, “tableN”) ingested in phase 3 is added to the mos_catalog table with a unique identifier called catalogid and with the parameter lead set to “tableN”, which means that the coordinates, proper motions, and parallax value (if available) are taken from tableN. An entry is also added to a join table, which in this case would be called mos_catalog_to_tableN, that relates objects in a specific table to their counterparts in the mos_catalog table.
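The three phases can be put together in a small in-memory sketch. Everything here is an illustrative assumption rather than the actual pipeline: the function ingest_table, the id_links mapping that stands in for foreign-key columns, the table names tableA and tableB, and the flat-sky separation that ignores proper motions and the RA wrap at 0/360 degrees. Flagging id-linked and newly created rows with best=True is also an assumption of this sketch.

```python
import itertools
import math


def separation_arcsec(a, b):
    """Flat-sky separation in arcsec; adequate only for very small angles."""
    mean_dec = math.radians(0.5 * (a["dec"] + b["dec"]))
    dra = (a["ra"] - b["ra"]) * math.cos(mean_dec)
    ddec = a["dec"] - b["dec"]
    return math.hypot(dra, ddec) * 3600.0


def ingest_table(table, rows, catalog, joins, id_links, new_ids, radius_arcsec=1.0):
    """Add one input table to the running catalog in three phases.

    id_links maps a target_id in this table to (other_table, other_target_id).
    """
    def join_row(catalogid, row, best):
        return {"catalogid": catalogid, "table": table,
                "target_id": row["target_id"], "best": best}

    # Phase 1: follow external ids to rows of previously ingested tables.
    known = {(j["table"], j["target_id"]): j["catalogid"] for j in joins if j["best"]}
    unmatched = []
    for row in rows:
        catalogid = known.get(id_links.get(row["target_id"]))
        if catalogid is None:
            unmatched.append(row)
        else:
            joins.append(join_row(catalogid, row, True))

    # Phase 2: spatial match of the leftovers against existing catalog sources.
    still_unmatched = []
    for row in unmatched:
        hits = []
        for src in catalog:
            sep = separation_arcsec(row, src)
            if sep <= radius_arcsec:
                hits.append((sep, src["catalogid"]))
        hits.sort()
        if not hits:
            still_unmatched.append(row)
        for rank, (_, catalogid) in enumerate(hits):
            joins.append(join_row(catalogid, row, rank == 0))

    # Phase 3: everything still unmatched becomes a new catalog entry.
    for row in still_unmatched:
        catalogid = next(new_ids)
        catalog.append({"catalogid": catalogid, "ra": row["ra"],
                        "dec": row["dec"], "lead": table})
        joins.append(join_row(catalogid, row, True))


catalog, joins = [], []
new_ids = itertools.count(1)

ingest_table("tableA", [
    {"target_id": 11, "ra": 10.0, "dec": 20.0},
    {"target_id": 12, "ra": 30.0, "dec": -5.0},
], catalog, joins, {}, new_ids)

ingest_table("tableB", [
    {"target_id": 21, "ra": 10.00001, "dec": 20.0},          # carries an id from tableA
    {"target_id": 22, "ra": 30.0, "dec": -5.0 + 0.5 / 3600},  # 0.5 arcsec from source 12
    {"target_id": 23, "ra": 50.0, "dec": 0.0},                # not seen before
], catalog, joins, {21: ("tableA", 11)}, new_ids)

table_b = {j["target_id"]: j["catalogid"] for j in joins if j["table"] == "tableB"}
print(table_b)
print(catalog)
```

In this invented example, source 21 inherits catalogid 1 through its id link, source 22 is matched spatially to catalogid 2, and source 23 is added as catalogid 3 with lead set to tableB.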

Flowchart of how the cross-match process works. In phase 1, tables are linked using foreign key columns containing ids from other tables; in phase 2, stars are matched spatially using a distance cutoff after correcting for proper motions; and in phase 3, we ingest the unmatched entries into the output `catalog` table. In this example the first table to be ingested was TIC-v8, but that changes with different versions of the cross-match process. Credit: José Sánchez-Gallego

The cross-match process is sensitive to the order in which the parent catalogs are processed, as well as to the cone-search radius used during the spatial cross-match phase. In general we process all-sky, high spatial resolution catalogs first (TIC v8, Gaia DR2/3, Legacy Survey DR8/10), with lower resolution catalogs being processed later.

Cross-match history

The first cross-match, internally labeled 0.1.0, was performed for the SDSS-V plate program, and created the first version of the SDSS mos_catalog table.

For the first robotic FPS operations, a second cross-match was performed, labeled 0.5.0. This cross-match added additional catalogs needed for FPS targeting. We released this version of the cross-match in DR18.

In 2023, we performed a final 1.0.0 cross-match for SDSS-V, which incorporates new catalogs such as Gaia DR3 and will be released in a future data release.

For DR19, both the 0.1.0 and 0.5.0 cross-matches exist as separate sets of entries in the mos_catalog table. We indicate the cross-match for each entry with the version_id, which can be joined with mos_catalogdb_version to obtain the corresponding plan name (version_id 21 joins to 0.1.0 and version_id 25 joins to 0.5.0).
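The version_id join can be sketched as follows, using an in-memory SQLite database as a stand-in for the PostgreSQL tables. The column names id and plan in mos_catalogdb_version are assumptions made for illustration, and the catalogid values are invented; only the table names, the version_id column, and the 21/25 mapping come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names `id` and `plan` are assumed for this sketch.
    CREATE TABLE mos_catalogdb_version (id INTEGER PRIMARY KEY, plan TEXT);
    CREATE TABLE mos_catalog (catalogid INTEGER PRIMARY KEY, version_id INTEGER);

    INSERT INTO mos_catalogdb_version VALUES (21, '0.1.0'), (25, '0.5.0');
    -- Invented catalogid values.
    INSERT INTO mos_catalog VALUES (1001, 21), (2001, 25), (2002, 25);
""")

# Resolve each entry's version_id to the cross-match plan name.
rows = conn.execute("""
    SELECT c.catalogid, v.plan
    FROM mos_catalog AS c
    JOIN mos_catalogdb_version AS v ON v.id = c.version_id
    ORDER BY c.catalogid
""").fetchall()

for catalogid, plan in rows:
    print(catalogid, plan)
```

Filtering on v.plan (or directly on c.version_id) then restricts a query to the entries of a single cross-match.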

With multiple cross-matches, it becomes necessary to associate objects across them. We created sdss_id for this, described below.

sdss_id

Because our targeting needs have changed during SDSS-V, we have re-run the cross-match process several times for different sets of parent catalogs, as described above. By design, each cross-match process doesn’t “know” about previous cross-matches or the catalogids assigned in them. As a result, a given physical source will be assigned a new catalogid in each version of the cross-match.

To address this issue, we perform an additional step in which an overall unique identifier, sdss_id, is assigned to each physical source and linked to all the catalogids associated with that source. The details of the sdss_id process are described here.
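The relationship between sdss_id and catalogid can be pictured as a simple lookup structure. The sketch below only illustrates the linkage; it does not implement the process that assigns sdss_id values, and the link rows, identifier values, and the counterparts helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical link rows: (sdss_id, cross-match version, catalogid).
links = [
    (1, "0.1.0", 1001),
    (1, "0.5.0", 2001),
    (2, "0.5.0", 2002),
]

catalogids_by_sdss_id = defaultdict(list)
sdss_id_by_catalogid = {}
for sdss_id, version, catalogid in links:
    catalogids_by_sdss_id[sdss_id].append((version, catalogid))
    sdss_id_by_catalogid[catalogid] = sdss_id


def counterparts(catalogid):
    """Return every (version, catalogid) pair that shares this source's sdss_id."""
    return catalogids_by_sdss_id[sdss_id_by_catalogid[catalogid]]


print(counterparts(1001))
print(counterparts(2002))
```

Here the source with sdss_id 1 appears in both cross-matches under different catalogids, while the source with sdss_id 2 appears only in the 0.5.0 cross-match.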