Clustering Redshifts Technique

Updated 29 November 2025
  • Clustering redshifts are an observational method that reconstructs redshift distributions by cross-correlating photometric samples with spectroscopic references, bypassing traditional photo-z limitations.
  • The technique employs angular cross-correlation functions with optimized scale selection to mitigate non-linear biases, achieving sub-percent precision in tomographic redshift calibration.
  • Robust pipelines integrate simulation-driven mock catalogs, bias corrections, and inversion modelling to support high-precision surveys like Euclid, LSST, and DESI.

Clustering redshifts, often termed "clustering-z" or "clustering-based redshift inference," are an observational method for reconstructing the redshift distributions of extragalactic sources from their spatial cross-correlations with reference samples of known redshift. The approach does not rely on photometric redshift estimators, template SED assumptions, or training-set coverage; it instead uses the principle that, to first order, only populations overlapping in redshift exhibit a non-zero angular cross-correlation. Clustering-redshift techniques are now critical for calibrating the tomographic bins of cosmological surveys, with demonstrated sub-percent precision and robust performance for next-generation experiments such as Euclid, LSST, and DESI.

1. Theoretical Formalism

Given a photometric ("unknown") sample $p$ with angular positions but unknown redshifts, and a spectroscopic ("reference") sample $s$ with secure redshifts sliced into narrow bins centered at $z_i$, the observable is the angular cross-correlation function,

$$w_{ps}(\theta) \equiv \langle \Delta_{p}(\hat{n})\, \Delta_{s}(\hat{n}+\theta) \rangle$$

where $\Delta_{x}(\hat{n}) = (N_{x}(\hat{n}) - \bar{N}_{x}) / \bar{N}_{x}$.

Under the Limber and flat-sky approximations, the cross-correlation can be expressed as

$$w_{ps}(\theta) = \int dz\, b_{p}(z)\, n_{p}(z)\, b_{s}(z)\, n_{s}(z)\, \xi_{m}(r_{\perp} = \chi(z)\theta;\, z)$$

where

  • $n_{p}(z)$ and $n_{s}(z)$ are the redshift distributions,
  • $b_{p}(z)$ and $b_{s}(z)$ are scale-averaged galaxy biases,
  • $\xi_{m}(r, z)$ is the matter correlation function,
  • $\chi(z)$ is the comoving distance.

For a spectroscopic slice narrow in redshift, $n_{s}(z) \approx \delta_D(z - z_i)$, leading to

$$w_{ps}(\theta; z_i) \approx b_{p}(z_i)\, b_{s}(z_i)\, n_{p}(z_i)\, \xi_{m}(r_{\perp}; z_i)$$

and thus

$$n_{p}(z_i) \propto \frac{w_{ps}(\theta; z_i)}{b_{p}(z_i)\, b_{s}(z_i)\, \xi_{m}(r_{\perp}; z_i)}$$

The estimator is typically implemented using the Landy–Szalay formula applied to data–data, data–random, and random–random pairs. To maximize S/N, measurements are integrated over an annulus in projected comoving separation, with weighting $W(r_{\perp}) \propto r_{\perp}^{-1}$ or similar.
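As a minimal sketch of this measurement step, the snippet below evaluates the cross-correlation form of the Landy–Szalay estimator from pre-computed, normalized pair counts and then averages it over an annulus with a $1/r_{\perp}$ weight. The function names, the binning, and the default 1.5–10 Mpc range are illustrative assumptions, not the implementation of any cited pipeline.

```python
import numpy as np

def landy_szalay_cross(d1d2, d1r2, r1d2, r1r2):
    """Cross-correlation Landy-Szalay estimator per separation bin.

    Inputs are pair counts per bin, each already divided by the total
    number of possible pairs for that combination of catalogs.
    """
    d1d2, d1r2, r1d2, r1r2 = (np.asarray(a, dtype=float)
                              for a in (d1d2, d1r2, r1d2, r1r2))
    return (d1d2 - d1r2 - r1d2 + r1r2) / r1r2

def annulus_average(w, r_edges, r_min=1.5, r_max=10.0):
    """Average w over bins fully inside [r_min, r_max], weighting by 1/r.

    r_edges are the bin edges in projected comoving Mpc (assumed units).
    """
    w = np.asarray(w, dtype=float)
    r_edges = np.asarray(r_edges, dtype=float)
    r_mid = np.sqrt(r_edges[:-1] * r_edges[1:])   # geometric bin centers
    dr = np.diff(r_edges)
    keep = (r_edges[:-1] >= r_min) & (r_edges[1:] <= r_max)
    weights = dr[keep] / r_mid[keep]              # W(r) dr with W = 1/r
    return np.sum(weights * w[keep]) / np.sum(weights)
```

Bins below `r_min` are dropped entirely, which mirrors the small-scale cut used to suppress 1-halo contributions.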

2. Calibrated Clustering-Redshift Pipeline

The pipeline consists of several key steps:

  • Mock Catalog Generation: Simulations such as Flagship2 are used to construct both the photometric sample (e.g., $i_E < 24.5$ for Euclid) and spectroscopic tracers (BOSS-like, DESI-like, Euclid NISP-S) coherently embedded in the same large-scale structure.
  • Tomographic and Spectroscopic Binning: The photometric sample is split into $N$ photo-z bins (for example, 10 bins with uniform $n$ over $0.2 < z_p < 1.6$). Each is cross-correlated with spectroscopic slices of width $\Delta z = 0.05$ in true redshift.
  • Angular Correlation Measurement and Small-Scale Cuts: Cross-correlations are measured over comoving projected radii, e.g., 0.5–10 Mpc. Scales below 1.5 Mpc are excluded to mitigate non-linear 1-halo contributions, which manifest as deviations of the correlation coefficient $r_{ps}(r_{\perp}; z)$ from unity.
  • Photometric Sample Bias Measurement (M3 Method): Each photo-z bin is subdivided into broader slices ($\Delta z_p = 0.1$), within which $b_p(z)$ is assumed constant. The photometric auto-correlation is measured and corrected using Limber-integrated predictions for the matter correlation (see Eq. 26 in Doumerg et al., 15 May 2025), and a low-order polynomial is fit to interpolate $b_p(z)$ across the bin. This controls systematic uncertainties in the photometric bias to ≤1% per bin.
  • Normalization and Inversion: The recovered $n_p(z_i)$ are normalized so that $\int n_p(z)\, dz$ either matches the total sample size or equals unity. Both parametric (shift/stretch) and non-parametric (Gaussian Process with suppression) models are fit to $n_p(z)$ to quantify its mean and width with uncertainties.
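The inversion and normalization steps can be sketched as follows. This is an illustrative reduction of the estimator $n_p(z_i) \propto w_{ps} / (b_p\, b_s\, \xi_m)$ to array arithmetic; the function name and the input `wbar_mm` (a scale-averaged matter correlation per slice, assumed to come from a theory prediction) are hypothetical.

```python
import numpy as np

def invert_clustering_z(z, wbar_ps, b_p, b_s, wbar_mm):
    """Recover n_p(z_i) from scale-averaged cross-correlations.

    z        : centers of the spectroscopic slices (increasing)
    wbar_ps  : annulus-averaged photometric x spectroscopic correlation
    b_p, b_s : bias of the photometric and spectroscopic samples per slice
    wbar_mm  : annulus-averaged matter correlation per slice (assumed input)
    Returns n_p(z_i) normalized to unit integral over z.
    """
    z = np.asarray(z, dtype=float)
    raw = (np.asarray(wbar_ps, dtype=float)
           / (np.asarray(b_p, dtype=float)
              * np.asarray(b_s, dtype=float)
              * np.asarray(wbar_mm, dtype=float)))
    # Trapezoid-rule normalization so that the integral of n_p over z is 1.
    norm = np.sum(0.5 * (raw[1:] + raw[:-1]) * np.diff(z))
    return raw / norm
```

Noisy measurements can make individual `raw` values negative; the sketch leaves them untouched, since handling them is the job of the model-fitting step.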

3. Achieved Statistical and Systematic Precision

On application to realistic survey mocks, the clustering-redshift pipeline achieves:

  • For $z_p < 1.6$, the mean redshift in each tomographic bin is constrained to $\sigma(\langle z \rangle) \lesssim 0.002\,(1+z)$, meeting or exceeding the stringent calibration requirements for Stage-IV lensing and BAO analyses.

  • The fractional uncertainty in the standard deviation of each $n(z)$ is $\sigma(\sigma_z)/\sigma_z < 0.1$.
  • Systematic biases are dominated by:
    • 1-Halo Effects: Below 1.5 Mpc, satellite-central galaxy pairs introduce non-linear bias. Excluding these scales restores $r_{ps} \approx 1$.
    • “m-bin” (Dirac-slice) Approximation: Neglecting neighboring slices introduces an offset of order 0.2–1% in $n_p(z)$. Matrix correction schemes can further reduce this effect.
    • Magnification and RSD: Magnification induces a shift in $\langle z \rangle$ of $\lesssim 0.0005\,(1+z)$ for $\Delta z = 0.05$; RSD have a smaller impact in Euclid-like bins.
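To make the two summary statistics concrete, the sketch below computes the mean and width of a tabulated $n(z)$ and compares a recovered mean with a known one against a $0.002\,(1+z)$ tolerance. Treating the requirement as a bound on the offset between recovered and true means is an assumption suited to mock validation, where the truth is known; the function names are hypothetical.

```python
import numpy as np

def _trapz(y, x):
    """Trapezoid-rule integral of y(x)."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def nz_moments(z, n):
    """Return (mean, standard deviation) of a tabulated n(z)."""
    z = np.asarray(z, dtype=float)
    n = np.asarray(n, dtype=float)
    norm = _trapz(n, z)
    mean = _trapz(z * n, z) / norm
    var = _trapz((z - mean) ** 2 * n, z) / norm
    return mean, np.sqrt(var)

def meets_mean_requirement(mean_est, mean_true, tol=0.002):
    """True if |mean_est - mean_true| <= tol * (1 + mean_true)."""
    return abs(mean_est - mean_true) <= tol * (1.0 + mean_true)
```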

4. Galaxy Bias: Degeneracies, Mitigation, and Perspectives

A central limitation of clustering-redshift techniques is the perfect degeneracy between $n_p(z)$ and the photometric galaxy bias $b_p(z)$: the cross-correlation amplitude scales as $b_p\, b_s\, n_p$, so only the product $b_p\, n_p$ is constrained once $b_s$ is known. In the pipeline:

  • $b_s(z)$ is measured directly from the spectroscopic sample’s auto-correlation.
  • $b_p(z)$ is estimated within each sub-bin via clustering auto-correlations, assuming mild redshift evolution. Higher-order corrections, including a 3-bin bias matrix formulation, are under development.
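A minimal sketch of these two bias estimates, assuming linear bias so that a scale-averaged auto-correlation equals $b^2$ times the corresponding matter prediction for the same slice. The function names and the input `wbar_mm` are hypothetical, and the polynomial degree is a free choice.

```python
import numpy as np

def bias_from_autocorr(wbar_auto, wbar_mm):
    """Linear bias per slice, b = sqrt(wbar_auto / wbar_mm).

    wbar_mm is the matter prediction integrated over the same slice and
    the same range of scales as the measurement (assumed input).
    """
    ratio = np.asarray(wbar_auto, dtype=float) / np.asarray(wbar_mm, dtype=float)
    return np.sqrt(ratio)

def interpolate_bias(z_sub, b_sub, z_eval, degree=1):
    """Fit a low-order polynomial to sub-bin biases and evaluate it at z_eval."""
    coeffs = np.polyfit(z_sub, b_sub, degree)
    return np.polyval(coeffs, z_eval)
```

The degree must stay below the number of sub-bins, otherwise the fit is underdetermined.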

Uncertainty in $b_p(z)$ propagates linearly to the normalized $n_p(z)$ and the mean redshift $\langle z \rangle$, making it the dominant systematic in most regimes. Mitigation relies on fine tomographic slicing, external bias constraints, and direct measurement from the survey itself.
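As an idealized worked example of this propagation (an illustration under stated assumptions, not a result from the cited works), suppose the ratio of true to assumed photometric bias varies linearly across a bin, $b_p^{\rm true}(z)/b_p^{\rm est}(z) = A\,[1 + \epsilon\,(z - \langle z \rangle)]$, with $n_p(z)$ normalized to unity and variance $\sigma_z^2$. The constant $A$ cancels in the normalization, leaving:

```latex
\begin{aligned}
\hat{n}_p(z) &= n_p(z)\,\bigl[1 + \epsilon\,(z - \langle z \rangle)\bigr], \\
\int \hat{n}_p(z)\,dz &= 1 + \epsilon \int n_p(z)\,(z - \langle z \rangle)\,dz = 1, \\
\langle z \rangle_{\rm est} &= \int z\,\hat{n}_p(z)\,dz
  = \langle z \rangle + \epsilon\,\sigma_z^{2}.
\end{aligned}
```

A constant bias error therefore leaves the mean untouched, and only unmodeled evolution across the bin shifts it. For instance, $\epsilon = 0.2$ and $\sigma_z = 0.1$ give a shift of $0.002$.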

5. Systematics, Validation, and Best Practices

Robust calibration requires careful treatment of several systematics:

  • Scale Selection: Exclusion of small, non-linear (1-halo) scales ensures unbiased large-scale clustering.
  • Spectroscopic Tracer Coverage: Sufficient sky density ($n_s \gtrsim 10^{-5}$ arcmin$^{-2}$ per $\delta z = 0.01$) is required for sub-percent mean-redshift errors over wide areas (Scottez et al., 2017).
  • Tomographic Slicing: Fine subdivisions (by photo-z or color) reduce bias variation within each bin.
  • Survey Masks and Systematics: Survey masks, random catalogs, and corrections for angular selection function and completeness are essential for unbiased estimators.
  • Model Fitting: Both parametric (shift/stretch) and non-parametric (e.g., GP) models should be fit to the measured $n_p(z)$ to avoid negative and noisy solutions and to provide robust mean/width extraction.

End-to-end validation on mocks and comparison against available spectroscopic samples are essential to demonstrate error control at the required level.
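One common form of the shift/stretch parametric model can be sketched as below: a fiducial $n(z)$ is displaced by `shift` and rescaled about its mean by `stretch`, and the pair is chosen by a brute-force chi-square scan. The parametrization, the grid search, and all names are assumptions for illustration; a production fit would typically use the full covariance and a proper sampler or optimizer.

```python
import numpy as np

def _trapz(y, x):
    """Trapezoid-rule integral of y(x)."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def shift_stretch(z, n_fid, shift, stretch):
    """n_model(z) = n_fid((z - zbar - shift) / stretch + zbar) / stretch."""
    zbar = _trapz(z * n_fid, z) / _trapz(n_fid, z)
    z_src = (z - zbar - shift) / stretch + zbar
    # Outside the tabulated range the fiducial n(z) is taken to be zero.
    return np.interp(z_src, z, n_fid, left=0.0, right=0.0) / stretch

def fit_shift_stretch(z, n_meas, sigma, n_fid, shifts, stretches):
    """Return the (shift, stretch) pair on the grid with the lowest chi-square."""
    best_chi2, best_shift, best_stretch = np.inf, None, None
    for dz in shifts:
        for s in stretches:
            resid = (n_meas - shift_stretch(z, n_fid, dz, s)) / sigma
            chi2 = np.sum(resid ** 2)
            if chi2 < best_chi2:
                best_chi2, best_shift, best_stretch = chi2, dz, s
    return best_shift, best_stretch
```

With this parametrization the model mean moves by `shift` and the width scales by `stretch`, so the two fitted numbers map directly onto the mean and width statistics quoted for each bin.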

6. Practical Applications and Extensions

  • Euclid, LSST, DESI: Clustering-redshift pipelines have been shown to calibrate tomographic redshift bins to within $\sigma(\langle z \rangle) < 0.002\,(1+z)$ (Doumerg et al., 15 May 2025, Naidoo et al., 2022).
  • Cosmological Constraints: Clustering-z calibrated $n(z)$ distributions feed into weak lensing, galaxy-galaxy lensing, and large-scale structure analyses, restoring much of the constraining power of spectroscopic surveys (Kovetz et al., 2016).
  • Beyond $z = 1.6$: Extension to higher redshift will rely on QSO and Lyman-break galaxy tracers (e.g., DESI/eBOSS/4MOST–WST, LSST Deep) (Doumerg et al., 15 May 2025).
  • Joint Approaches: Hybrid inference combining clustering-z, self-calibration (auto/cross within photo-z), and photometric information further improves accuracy and robustness (Zheng et al., 18 Sep 2024, Sánchez et al., 2018).

7. Future Improvements and Research Directions

Advances underway include:

  • Higher-Order Corrections: Implementing full neighbor-slice (“m-bin”) matrix corrections to eliminate residual binning-induced bias.
  • Explicit Magnification/RSD Modeling: Measuring magnification slopes and including them in the estimator (e.g., $\alpha = 2.5\,s_\mu - 1$); joint modeling with cosmological lensing and redshift-space distortions.
  • Covariance Modeling: Coherent modeling of full covariance matrices over large angular scales for robust cosmological parameter propagation.
  • Multi-Tracer Diagnostics: Simultaneous cross-correlation with multiple spectroscopic populations to detect small-scale conformity and tracer-dependent bias effects.
  • Individual-Galaxy PDFs: Employing clustering-z in cells of color-magnitude space for object-level $p(z)$ inference; hybrid schemes with machine learning and hierarchical Bayesian inference (Morrison et al., 2016, Sánchez et al., 2018).
  • Deeper and Fainter Samples: Adapting the methodology for surveys with unmatched photometric/spectroscopic depth, including radio, NIR, and dropout-selected samples (Rahman et al., 2015, Rahman et al., 2015).
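For the magnification item above, a rough sketch of how a slope could be measured from a catalog: it assumes that $s_\mu$ denotes the logarithmic slope of the cumulative number counts at the magnitude limit, $s = d\log_{10} N(<m)/dm$, which is the usual convention but is not spelled out in the text. The function names and the finite-difference step are illustrative.

```python
import numpy as np

def count_slope(mags, m_lim, dm=0.1):
    """Estimate s = d log10 N(<m) / dm at m_lim by a centered difference.

    The catalog must extend at least dm fainter than m_lim, otherwise the
    upper count is truncated and the slope is underestimated.
    """
    mags = np.asarray(mags, dtype=float)
    n_hi = np.count_nonzero(mags < m_lim + dm)
    n_lo = np.count_nonzero(mags < m_lim - dm)
    return (np.log10(n_hi) - np.log10(n_lo)) / (2.0 * dm)

def magnification_alpha(s):
    """alpha = 2.5 s - 1, the combination quoted in the text."""
    return 2.5 * s - 1.0
```

With this convention a slope of $s = 0.4$ gives $\alpha = 0$, meaning that the dilution of sky area and the boost in flux cancel and magnification leaves the number counts unchanged.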

Clustering-redshift calibration, with rigorous bias control and end-to-end simulation validation, forms a cornerstone of high-precision calibration for cosmological large-scale structure and lensing experiments in the coming decade (Doumerg et al., 15 May 2025, Naidoo et al., 2022, Cawthon et al., 2020).
