DESI Spectroscopic Data Release
- Dark Energy Spectroscopic Instrument (DESI) data releases are comprehensive spectroscopic datasets with calibrated flux, wavelength, and redshift measurements for precision astrophysical studies.
- Over 1.7 million high-quality spectra from stars, galaxies, and quasars are processed using a robust pipeline ensuring <0.03 Å wavelength accuracy and >99% classification purity.
- The release supports diverse applications including Lyman-α forest analyses, Milky Way archaeology, and large-scale structure studies with open public access via modern data tools.
The Dark Energy Spectroscopic Instrument (DESI) data releases comprise well-documented, multi-faceted spectroscopic datasets enabling precision cosmology, stellar and galaxy evolution, and Milky Way archaeology. DESI’s modular data products and pipeline outputs facilitate a broad range of analyses, all delivered with rigorously quantified performance and public accessibility.
1. Data Release Scope and Survey Overview
The DESI Early Data Release (EDR) encompasses spectra acquired during a five-month Survey Validation phase (December 2020–May 2021), spanning commissioning, target validation, operational development, and the One-Percent Survey, supplemented by secondary programs and special calibration tiles. EDR contains spectra for:
| Class | Good Unique Spectra |
|---|---|
| Milky Way Survey (Star) | 466,447 |
| Bright Galaxy Survey (BGS) | 428,758 |
| Luminous Red Galaxy (LRG) | 227,318 |
| Emission Line Galaxy (ELG) | 437,664 |
| Quasar (QSO) | 76,079 |
| Secondary Programs | 137,148 |
Total: 1,712,004 good spectra, comprising stars (496,128), galaxies (1,125,635), and quasars (90,241) (Collaboration et al., 2023).
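As a quick consistency check on the counts quoted above, the following sketch sums both breakdowns. The dictionary keys are illustrative labels only. The star, galaxy, and quasar counts add up exactly to the quoted total, while the six table rows add up to a larger number. The text does not explain the difference; one plausible reading (an assumption) is that some objects are counted under more than one target class.

```python
# Counts copied from the table and the summary sentence above.
by_target_class = {
    "MWS": 466_447, "BGS": 428_758, "LRG": 227_318,
    "ELG": 437_664, "QSO": 76_079, "Secondary": 137_148,
}
by_spectral_type = {"stars": 496_128, "galaxies": 1_125_635, "quasars": 90_241}
quoted_total = 1_712_004

sum_by_type = sum(by_spectral_type.values())
sum_by_target = sum(by_target_class.values())

print("star/galaxy/quasar sum matches total:", sum_by_type == quoted_total)
print("table rows minus total:", sum_by_target - quoted_total)
```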
The release includes flux- and wavelength-calibrated 1D spectra, intermediate and raw products, redshift and classification catalogs, and value-added catalogues across the Milky Way, galaxy, and quasar target classes (Collaboration et al., 2023, Koposov et al., 8 Jul 2024, Koposov et al., 20 May 2025).
2. Spectroscopic Data Pipeline and Calibration
The DESI spectroscopic pipeline processes CCD exposures from 10 spectrographs (30 cameras in total). Each spectrograph has three arms: blue (3600–5930 Å, R≈2000–3200), red (5600–7720 Å, R≈3200–4100), and NIR-Z (7470–9800 Å, R≈4100–5100) (Guy et al., 2022).
Pipeline stages include:
- CCD Preprocessing: bias, gain, cosmic-ray masking.
- Wavelength Calibration: arc-lamp exposures fit with Legendre polynomials; barycentric correction.
- Flux Calibration: F-type standard stars yield throughput vectors, which are applied to all target spectra. PSF corrections account for fiber aperture and seeing.
- Sky Subtraction: 10% of fibers are allocated to blank sky; high-resolution sky models are fit and subtracted iteratively, including 2D spatial polynomial gradients.
- Extraction: 2D PSFs are modeled in a Gauss–Hermite basis; “spectro-perfectionism” extraction yields decorrelated 1D spectra with a resolution matrix.
- Redshift and Classification: PCA-based template fits via Redrock, emission-line afterburners, and custom fitting (QuasarNET, Mg II).
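The wavelength-calibration stage listed above can be illustrated with a minimal sketch that fits a Legendre series to arc-line positions using NumPy. This is not the DESI pipeline code: the function names, the detector size, the polynomial degree, and the synthetic "arc lines" are all assumptions chosen for illustration, and the barycentric correction is omitted.

```python
import numpy as np
from numpy.polynomial import legendre


def _rescale(pixel, n_pix):
    """Map pixel indices 0..n_pix-1 onto [-1, 1], the natural Legendre domain."""
    return 2.0 * np.asarray(pixel, dtype=float) / (n_pix - 1) - 1.0


def fit_wavelength_solution(pixel, wavelength, n_pix, degree=3):
    """Least-squares Legendre fit of wavelength as a function of pixel."""
    return legendre.legfit(_rescale(pixel, n_pix), np.asarray(wavelength, float), degree)


def evaluate_wavelength(pixel, coeffs, n_pix):
    """Evaluate the fitted wavelength solution at the given pixels."""
    return legendre.legval(_rescale(pixel, n_pix), coeffs)


# Synthetic arc lines: invented centroids and a made-up "true" solution (Angstrom).
n_pix = 4096
true_coeffs = np.array([4765.0, 1165.0, -3.0, 0.5])
line_pixels = np.linspace(50, 4000, 25)
line_waves = evaluate_wavelength(line_pixels, true_coeffs, n_pix)

coeffs = fit_wavelength_solution(line_pixels, line_waves, n_pix, degree=3)
residuals = evaluate_wavelength(line_pixels, coeffs, n_pix) - line_waves
print("max |residual| in Angstrom:", np.max(np.abs(residuals)))
```

Rescaling the pixel coordinate to [-1, 1] before fitting keeps the Legendre basis well conditioned; with real arc lamps the residuals would reflect centroiding noise rather than vanish.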
Quality assurance monitors PSF stability (day–night drift <0.3 pixels), wavelength accuracy (<0.03 Å per arm), and radial velocity scatter (≤0.8 km s⁻¹), with overall catalog efficiency and purity >99% for all major target classes (Guy et al., 2022).
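To put the quoted resolving powers and wavelength accuracy in more familiar units, the sketch below uses the standard relations Δλ = λ/R and Δv = c·Δλ/λ. Pairing the lower wavelength bound of each arm with the lower bound of its R range is an assumption made only for illustration, and the function names are hypothetical.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def resolution_element(wavelength_a, resolving_power):
    """Width of one resolution element in Angstrom, from R = lambda / delta_lambda."""
    return wavelength_a / resolving_power


def velocity_equivalent(delta_lambda_a, wavelength_a):
    """Velocity (km/s) corresponding to a wavelength offset at a given wavelength."""
    return C_KM_S * delta_lambda_a / wavelength_a


# (arm, wavelength in Angstrom, resolving power): blue-edge values quoted above,
# paired low-with-low as an illustrative assumption.
for arm, lam, r in [("blue", 3600.0, 2000.0), ("red", 5600.0, 3200.0), ("NIR-Z", 7470.0, 4100.0)]:
    print(f"{arm}: resolution element = {resolution_element(lam, r):.2f} A")

for lam in (3600.0, 9800.0):
    print(f"0.03 A at {lam:.0f} A = {velocity_equivalent(0.03, lam):.2f} km/s")
```

Under these assumptions a 0.03 Å offset corresponds to roughly 2.5 km s⁻¹ at 3600 Å and roughly 0.9 km s⁻¹ at 9800 Å.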
3. Lyman-α Forest Data Products and Methodology
The DESI EDR Lyman-α forest catalog leverages 88,511 quasar spectra from Survey Validation (SV) and the first two months of the main survey (M2), corresponding to redshift 2.1 ≤ z ≤ 3.79 and ~0.5 Gpc³ survey volume (Ramírez-Pérez et al., 2023). Each spectrum is coadded on a linear grid (Δλ = 0.8 Å).
Pipeline workflow:
- Masking: cosmic rays (0.1%), CCD defects, DLA regions (CNN+GP), BALs (16% of EDR+M2), and sky/ISM lines.
- Flux Recalibration: calibration wiggles corrected in emission-line–free regions (CIII 1600–1850 Å); DESI requires only a single recalibration step.
- Continuum Fitting: iterative (5×) determination of the mean flux using both a universal template and per-forest slopes, together with a model for the pixel variance.
- Weighting: diagonal weights with an empirically optimized modulation, reducing auto-correlation error bars by ~25%.
HEALPix-organized VAC FITS files contain observed wavelengths, per-forest metadata, arrays, weights, and continua, supporting direct measurement of 1D/3D Lyα forest correlations and BAO features. DESI achieves >20% error reduction compared to SDSS/eBOSS by weighting optimization alone (Ramírez-Pérez et al., 2023).
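The workflow above can be sketched in a few lines, assuming the conventional definition of the flux-transmission field, δ = f / (F̄·C) − 1, where F̄·C is the expected flux from the continuum fit. The weight formula below (inverse of pipeline variance plus a scaled intrinsic variance) and the parameter `eta_lss` are illustrative assumptions, not the expressions or values used in the DESI analysis; all data are synthetic.

```python
import numpy as np


def linear_grid(lambda_min, lambda_max, step=0.8):
    """Linear wavelength grid in Angstrom with the 0.8 A spacing quoted above."""
    n = int(round((lambda_max - lambda_min) / step)) + 1
    return lambda_min + step * np.arange(n)


def flux_transmission_delta(flux, expected_flux):
    """delta = f / (mean transmission * continuum) - 1."""
    return np.asarray(flux, float) / np.asarray(expected_flux, float) - 1.0


def diagonal_weights(var_pipeline, var_lss, eta_lss=1.0):
    """Illustrative diagonal weights; eta_lss is a hypothetical modulation factor."""
    return 1.0 / (np.asarray(var_pipeline, float) + eta_lss * np.asarray(var_lss, float))


# Synthetic forest: flat expected flux with 10% Gaussian fluctuations.
wave = linear_grid(3600.0, 3608.0)
rng = np.random.default_rng(0)
expected = np.full(wave.size, 0.8)
flux = expected * (1.0 + 0.1 * rng.standard_normal(wave.size))

delta = flux_transmission_delta(flux, expected)
weights = diagonal_weights(np.full(wave.size, 0.01), np.full(wave.size, 0.02), eta_lss=2.0)
weighted_mean_delta = np.sum(weights * delta) / np.sum(weights)
print(wave.size, weighted_mean_delta)
```

Raising `eta_lss` down-weights pixels where the intrinsic term dominates relative to the noise term, which is the kind of modulation that can be tuned empirically against the resulting correlation-function errors.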
4. Quasar Catalogs, Broad Absorption Lines, and Redshift Determination
DESI EDR quasar samples are delivered as flux-calibrated, sky-subtracted spectra, with redshift and classification catalogs. BAL quasars are catalogued (29,985 in EDR+M2, 14.3% of quasars with AI > 0 in CIV), with absorption metrics (Balnicity Index BI, Absorption Index AI) computed from PCA-fitted continua. Automated BAL identification achieves purity ≈99%, completeness ≈45% at median SNR (Filbert et al., 2023).
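For readers unfamiliar with the Balnicity Index, the sketch below implements the classic definition from the BAL literature: the integral of [1 − f(v)/0.9] over blueshift velocities from 3000 to 25,000 km s⁻¹, counted only once the normalized flux f has stayed below 0.9 continuously for more than 2000 km s⁻¹. The discretization, function name, and synthetic troughs are assumptions; the source does not state that the DESI catalog uses exactly this implementation.

```python
import numpy as np


def balnicity_index(velocity, norm_flux, v_min=3000.0, v_max=25000.0, min_width=2000.0):
    """Discrete Balnicity Index in km/s.

    velocity  : blueshift velocities in km/s (positive toward the observer)
    norm_flux : flux divided by the fitted continuum
    """
    v = np.asarray(velocity, dtype=float)
    f = np.asarray(norm_flux, dtype=float)
    order = np.argsort(v)
    v, f = v[order], f[order]
    dv = np.gradient(v)                       # local bin widths
    in_range = (v >= v_min) & (v <= v_max)

    bi = 0.0
    run_width = 0.0                           # width of the current continuous trough
    for depth, width in zip((1.0 - f / 0.9)[in_range], dv[in_range]):
        if depth > 0.0:
            run_width += width
            if run_width > min_width:         # contribute only past the first 2000 km/s
                bi += depth * width
        else:
            run_width = 0.0                   # trough interrupted: start over
    return bi


# Synthetic normalized spectra: one broad trough and one too narrow to count.
velocity = np.arange(0.0, 30000.0, 100.0)
broad = np.where((velocity >= 5000) & (velocity < 10000), 0.45, 1.0)
narrow = np.where((velocity >= 5000) & (velocity < 6500), 0.45, 1.0)
print(balnicity_index(velocity, broad), balnicity_index(velocity, narrow))
```

The 5000 km s⁻¹ wide trough yields a BI of about 1500 km s⁻¹ (the last 3000 km s⁻¹ at half depth), while the 1500 km s⁻¹ wide trough yields zero. The Absorption Index follows the same idea with a lower minimum width and a velocity range extending to zero.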
Redshift corrections for BALs involve masking absorption troughs and refitting templates, which shifts the fitted redshifts on average. The rate of catastrophic redshift errors is reduced from 6.7% to <2% after masking. Incorporating accurate BAL redshifts increases the number of usable quasar tracers by ~19% and improves Lyα forest correlation function errors by ≈12% without introducing systematics (Filbert et al., 2023).
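Redshift differences of this kind are usually quoted as velocities via Δv = c·(z_new − z_ref)/(1 + z_ref). The sketch below applies that relation and counts outliers above a threshold. The threshold of 3000 km s⁻¹ and the example redshifts are placeholders chosen for illustration, not values from the DESI analysis, and the function names are hypothetical.

```python
import numpy as np

C_KM_S = 299_792.458  # speed of light in km/s


def velocity_offset(z_new, z_ref):
    """Velocity offset in km/s between a refit redshift and a reference redshift."""
    z_new = np.asarray(z_new, dtype=float)
    z_ref = np.asarray(z_ref, dtype=float)
    return C_KM_S * (z_new - z_ref) / (1.0 + z_ref)


def catastrophic_fraction(z_new, z_ref, threshold_km_s):
    """Fraction of objects whose |velocity offset| exceeds the given threshold."""
    dv = velocity_offset(z_new, z_ref)
    return float(np.mean(np.abs(dv) > threshold_km_s))


# Made-up redshifts: a small shift, no shift, and one large outlier.
z_ref = np.array([2.0, 2.5, 3.0])
z_fit = np.array([2.001, 2.5, 3.05])

dv = velocity_offset(z_fit, z_ref)
frac = catastrophic_fraction(z_fit, z_ref, threshold_km_s=3000.0)  # placeholder threshold
print(dv, frac)
```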
5. Stellar Value-Added Catalogues and Galactic Structure Applications
The Milky Way Survey value-added catalogs (VACs) for DESI EDR and DR1 provide atmospheric parameters, radial velocities, and abundances for 400,000 (EDR) and 4 million (DR1) stars, with full epoch-by-epoch radial velocity measurements available for >1 million stars (Koposov et al., 8 Jul 2024, Koposov et al., 20 May 2025).
- Parameter Measurement: Two independent pipelines—RVSpecFit (forward model, PHOENIX neural nets) and FERRE (χ² fitting, Kurucz grids)—yield precisions of:
- Radial velocity: ≃ 1 km s⁻¹
- ≃ 0.3 dex
- ≃ 0.15 dex (EDR); dex post-calibration (DR1)
- dex, reliable for
- Calibration: Systematic trends in the measured parameters are removed using a temperature- and gravity-dependent correction, with separate coefficients for giants and dwarfs (Koposov et al., 20 May 2025).
- Catalogue Structure: Main FITS includes RVTAB (velocities, parameters), SPTAB (abundances), FIBERMAP (photometry, astrometry, flags), and Gaia cross-match. Recipe for high-purity samples involves RVS_WARN=0, spectral type STAR, PRIMARY=True, and S/N cuts.
- Galactic Components: the thin and thick disks and the stellar halo are mapped; the sample includes >50,000 stars at distances >10 kpc, >3,000 at >50 kpc, thousands of extremely metal-poor stars, and spectroscopic members of >4 dwarf galaxies, >13 globular clusters, and >24 streams (Koposov et al., 20 May 2025).
- Scientific Impact: Chemo-dynamic mapping, rare object populations, and mass constraints from halo tracers; DESI DR1 increases faint-star statistics (G = 17.5–21) by a factor of 10 over prior surveys.
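The high-purity selection recipe described under Catalogue Structure can be expressed as a boolean mask. The sketch below uses a small synthetic table rather than the real FITS file. `RVS_WARN` and `PRIMARY` are named in the text; the column names `SPECTYPE` and `SN_R` and the S/N threshold of 5 are hypothetical stand-ins for the spectral-type and signal-to-noise columns, which should be checked against the catalog documentation.

```python
import numpy as np

# Synthetic five-row stand-in for the stellar catalog.
catalog = {
    "RVS_WARN": np.array([0, 0, 4, 0, 0]),
    "SPECTYPE": np.array(["STAR", "GALAXY", "STAR", "STAR", "STAR"]),  # hypothetical name
    "PRIMARY": np.array([True, True, True, False, True]),
    "SN_R": np.array([12.0, 30.0, 8.0, 15.0, 3.0]),                    # hypothetical name
}


def high_purity_mask(cat, min_sn=5.0):
    """Rows with no fit warning, stellar classification, primary flag, and adequate S/N."""
    return (
        (cat["RVS_WARN"] == 0)
        & (cat["SPECTYPE"] == "STAR")
        & cat["PRIMARY"]
        & (cat["SN_R"] > min_sn)
    )


mask = high_purity_mask(catalog)
print(mask.sum(), "of", mask.size, "rows pass")
```

Only the first row passes here: the others fail on classification, warning flag, primary flag, and S/N respectively. The same mask can be applied to any array-like column container, including a table read from FITS.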
6. Large-Scale Structure Catalogs and Cosmological Utility
DESI releases full and clustering galaxy catalogs (BGS, LRG, ELG, QSO) for large-scale structure (LSS) studies. Completeness and FKP weighting schemes optimize clustering analyses. One-Percent Survey catalogs report completeness values in the 86–98% range across tracers and cover ~170 deg² (Collaboration et al., 2023). Selection cuts combine redshift-fit confidence (Δχ² > 9), [OII] flux significance, and optimized FKP weights.
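FKP weights follow the standard form w = 1 / (1 + n̄(z)·P₀), where n̄(z) is the tracer number density and P₀ is a fiducial power-spectrum amplitude. The sketch below evaluates it for made-up densities; the value of `p0` is a placeholder, not the value adopted for any DESI tracer.

```python
import numpy as np


def fkp_weight(nbar, p0):
    """FKP weight 1 / (1 + nbar * P0); nbar in (h/Mpc)^3 and P0 in (Mpc/h)^3."""
    return 1.0 / (1.0 + np.asarray(nbar, dtype=float) * p0)


# Illustrative number densities and a placeholder P0.
nbar = np.array([1e-4, 5e-4, 1e-3])
weights = fkp_weight(nbar, p0=1.0e4)
print(weights)
```

Denser regions receive lower weight because their clustering measurement is limited by sample variance rather than shot noise, so the weighting balances the two contributions.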
Correlation function and BAO measurements leverage the improved Lyα pipeline and quasar redshift solutions. DESI EDR+M2 Lyα forests already demonstrate 20% smaller auto-correlation errors than eBOSS DR16 in identical volume, with the full DESI survey poised to deliver sub-percent-level BAO precision (Ramírez-Pérez et al., 2023, Ravoux et al., 2023).
7. Access, Tools, and Future Releases
All DESI data products are publicly accessible via HTTPS (https://data.desi.lbl.gov/public), Globus/NERSC, and database/SQL interfaces (Astro Data Lab, PostgreSQL). File organization uses per-tile and HEALPix directories for efficient queries. Tutorials and APIs (specprodDB, SQLAlchemy) are provided for programmatic access (Collaboration et al., 2023).
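As an illustration of the HEALPix directory organization, the sketch below builds a URL for a coadded-spectra file. Only the base URL comes from the text above. The path layout (pixels grouped into subdirectories by integer division by 100), the file-name pattern, and the example release, production, survey, and program names are assumptions that should be verified against the release documentation before use. No network request is made.

```python
BASE_URL = "https://data.desi.lbl.gov/public"  # from the text above


def healpix_coadd_url(release, specprod, survey, program, healpix):
    """Build a coadd URL under an ASSUMED layout:
    <release>/spectro/redux/<specprod>/healpix/<survey>/<program>/<group>/<pixel>/...
    """
    group = healpix // 100  # assumed grouping of pixels into subdirectories
    return (
        f"{BASE_URL}/{release}/spectro/redux/{specprod}/healpix/"
        f"{survey}/{program}/{group}/{healpix}/"
        f"coadd-{survey}-{program}-{healpix}.fits"
    )


# Example arguments are illustrative placeholders.
url = healpix_coadd_url("edr", "fuji", "sv3", "dark", 10016)
print(url)
```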
Future DESI releases will scale to larger stellar samples in DR2 (anticipated 2027) and incorporate enhanced coadds, recalibrated pipelines, and expanded secondary target coverage, further improving survey utility for Milky Way structure and precision cosmology (Koposov et al., 20 May 2025).