Dark Standard Sirens

Updated 28 May 2026

Dark standard sirens are gravitational-wave sources from mergers without electromagnetic counterparts that enable direct luminosity distance measurements for cosmology.
Because no redshift is observed directly, analyses rely on statistical methods, including galaxy-catalog associations and population modeling, to infer the redshift distribution of the sources while controlling selection biases.
Upcoming GW detector networks promise percent-level precision on cosmological parameters, offering stringent tests of dark energy, dark matter interactions, and modified gravity.

Dark standard sirens are gravitational-wave (GW) sources—typically compact binary coalescences such as binary black hole (BBH), binary neutron star (BNS), or neutron star–black hole (NSBH) mergers—for which an electromagnetic (EM) counterpart is not detected. In the absence of an EM signal, direct redshift measurement of the source is unattainable; instead, cosmological inference proceeds statistically by cross-referencing the GW-inferred luminosity distance with galaxy redshift catalogs or host population models. The statistical association between GW-inferred distances and galaxy redshifts allows dark sirens to function as standardizable distance indicators (“standard sirens”) for constraining cosmic expansion, testing gravity, and probing the dark sector—including the nature of dark energy and dark matter–dark energy interactions.

1. Theoretical Foundation and Motivation

Dark standard sirens exploit the property that GW strain amplitudes encode the absolute luminosity distance, $d_L(z)$ , independently of the cosmic distance ladder or other astrophysical calibrators. The propagation of GWs through cosmological distances is governed in General Relativity (GR) by a wave equation whose amplitude decay is set by the expansion history. In alternative gravity or dark energy models, the relation between $d_L(z)$ and $z$ may be modified by time-varying “friction” terms (e.g., varying Planck mass, modified propagation, or non-minimal couplings) so that the GW luminosity distance, $d_L^{\rm GW}(z)$ , may differ from the electromagnetic distance, $d_L^{\rm EM}(z)$ , providing a direct probe of modifications to GR (Mukherjee et al., 2020, Belgacem et al., 2018, Wolf et al., 2019).
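
To make the distinction between $d_L^{\rm EM}(z)$ and $d_L^{\rm GW}(z)$ concrete, the following minimal Python sketch computes the EM luminosity distance in flat $\Lambda$CDM and rescales it by a modified-propagation factor. The $(\Xi_0, n)$ form of that factor, the function names, and the default values ($H_0 = 70$ km/s/Mpc, $\Omega_m = 0.3$) are assumptions chosen for illustration and are not specified in the text above; GR is recovered for $\Xi_0 = 1$.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def e_of_z(z, omega_m):
    """Dimensionless Hubble rate E(z) = H(z)/H0 for flat LambdaCDM."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def d_l_em(z, h0=70.0, omega_m=0.3, steps=2000):
    """EM luminosity distance in Mpc (trapezoidal integral of 1/E)."""
    if z == 0.0:
        return 0.0
    dz = z / steps
    integral = 0.5 * (1.0 / e_of_z(0.0, omega_m) + 1.0 / e_of_z(z, omega_m))
    for i in range(1, steps):
        integral += 1.0 / e_of_z(i * dz, omega_m)
    integral *= dz
    return (1.0 + z) * (C_KM_S / h0) * integral


def d_l_gw(z, xi0=1.0, n=2.0, h0=70.0, omega_m=0.3):
    """GW luminosity distance under an assumed (Xi0, n) parameterization:
    d_L^GW / d_L^EM = Xi0 + (1 - Xi0) / (1 + z)**n.
    """
    ratio = xi0 + (1.0 - xi0) / (1.0 + z) ** n
    return ratio * d_l_em(z, h0=h0, omega_m=omega_m)
```

In this parameterization the two distances coincide at low redshift and their ratio tends to $\Xi_0$ at high redshift, which is why sources spread over a wide redshift range are needed to constrain it.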

Dark sirens are of particular significance because BBH mergers are detected far more often than BNS mergers with identifiable EM counterparts; consequently, the dark-siren sample achievable with ground-based GW detector networks (LIGO, Virgo, KAGRA, and future third-generation facilities such as the Einstein Telescope and Cosmic Explorer), space-based observatories (e.g., LISA), and pulsar timing arrays (e.g., SKA-PTA) will rapidly exceed bright-siren samples, enabling percent-level constraints on cosmological parameters (Matos, 2024, Jin et al., 2023, Dang et al., 25 Dec 2025, Bachega et al., 2019, Yan et al., 2019).

2. Statistical Methodologies for Cosmological Inference

A dark siren yields a measurement of $d_L$, with its associated uncertainty, but not of the source redshift $z$. Inferring cosmology from such a sample therefore requires marginalizing over all possible host redshifts. Two broad approaches are adopted:

(a) Galaxy-catalog association methods:

The GW localization volume is intersected with 3D galaxy catalogs (providing sky positions and redshifts). Probability weights are assigned to potential hosts based on properties such as proximity in sky location ($\mathbf{\Omega}$), agreement between the catalog redshift and the GW-inferred distance at a trial cosmology, and galaxy luminosity or stellar mass as proxies for the merger rate (Alfradique et al., 24 Mar 2025, Dang et al., 25 Dec 2025, Naveed et al., 16 May 2025, Turski et al., 19 May 2025, Matos, 2024, Jin et al., 2023). The joint likelihood for $H_0$ (or a broader set of cosmological parameters) from $N$ events is

$$\mathcal{L}(H_0) = \prod_{j=1}^{N} \frac{1}{\beta(H_0)} \sum_{i} w_i \, p_j\!\left(d_L(z_i; H_0)\right)$$

where $w_i$ are host weights, $p_j\!\left(d_L(z_i; H_0)\right)$ is the GW posterior of event $j$ evaluated at the luminosity distance implied by the candidate galaxy's redshift $z_i$, and $\beta(H_0)$ is the selection normalization.
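
The weighted sum over candidate hosts can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the pipeline of any cited analysis: each GW distance posterior is taken to be Gaussian, the low-redshift Hubble law $d_L \approx cz/H_0$ replaces the full distance-redshift relation, the prior on $H_0$ is flat, and the selection normalization `beta` defaults to 1 (a real analysis must model it, since neglecting it biases $H_0$). All names and numbers are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def gaussian(x, mu, sigma):
    """Normal density, used here as a toy GW distance posterior."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(h0, events, beta=lambda h0: 1.0):
    """Sum over events of log[ sum_i w_i p_j(d_L(z_i; H0)) / beta(H0) ]."""
    logl = 0.0
    for d_obs, sigma, galaxies in events:
        total_w = sum(w for _, w in galaxies)
        host_sum = 0.0
        for z, w in galaxies:
            d_model = C_KM_S * z / h0  # low-redshift Hubble law (assumption)
            host_sum += (w / total_w) * gaussian(d_obs, d_model, sigma)
        # Floor avoids log(0) when no candidate host is compatible.
        logl += math.log(max(host_sum, 1e-300) / beta(h0))
    return logl


def h0_posterior(h0_grid, events, beta=lambda h0: 1.0):
    """Posterior on an H0 grid under a flat prior, normalized to sum to 1."""
    logl = [log_likelihood(h0, events, beta) for h0 in h0_grid]
    peak = max(logl)
    weights = [math.exp(v - peak) for v in logl]
    norm = sum(weights)
    return [w / norm for w in weights]


# Toy data: (observed d_L in Mpc, 1-sigma error, [(galaxy z, host weight), ...]).
# Each distance is generated with H0 = 70 from one galaxy in its own list.
TRUE_H0 = 70.0
events = [
    (C_KM_S * 0.030 / TRUE_H0, 15.0, [(0.020, 1.0), (0.030, 1.0), (0.050, 1.0)]),
    (C_KM_S * 0.040 / TRUE_H0, 20.0, [(0.010, 1.0), (0.040, 1.0), (0.060, 1.0)]),
    (C_KM_S * 0.050 / TRUE_H0, 25.0, [(0.025, 1.0), (0.050, 1.0)]),
]
h0_grid = [40.0 + i for i in range(61)]  # 40, 41, ..., 100 km/s/Mpc
posterior = h0_posterior(h0_grid, events)
best_h0 = h0_grid[posterior.index(max(posterior))]
```

Any single event is compatible with several values of $H_0$, one per candidate host; only the value shared by all events survives in the product, which is why the constraint tightens as events accumulate. Replacing the equal weights with luminosity or stellar-mass weights changes only the `(z, weight)` pairs.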

(b) Population/statistical methods:

Where catalogs are incomplete or inaccessible at high $z$, one constructs a prior on the source redshift (or, at a trial cosmology, on the luminosity distance) using parametric or non-parametric population models for the host galaxies and merger rates, sometimes marginalizing over population hyper-parameters (e.g., via hierarchical Bayesian inference) (Yu et al., 2022, Wang et al., 2024, Ferri et al., 2024, Zhang et al., 2019). An alternative, catalog-independent approach leverages the angular cross-correlation of GW events (in $d_L$ shells) with galaxy surveys (in $z$ slices); the correlation is maximized when $d_L$ and $z$ correspond to the true cosmological mapping ("Peak Sirens") (Ferri et al., 2024).

Both approaches require accurate characterization of GW detection probabilities (selection functions) and detailed modeling of catalog completeness and measurement uncertainties.
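
For the population approach in (b), the marginalization over the unknown redshift can be written schematically as follows. The notation is an assumption introduced here for illustration: $x_j$ denotes the GW data of event $j$, $\Lambda$ the population hyper-parameters, $R(z; \Lambda)$ the source-frame merger rate density, $V_c$ the comoving volume, and $\beta$ the selection normalization (the detectable fraction of the population).

$$
\mathcal{L}(x_j \mid H_0, \Lambda) = \frac{1}{\beta(H_0, \Lambda)} \int \mathrm{d}z \; p\!\left(x_j \mid d_L(z; H_0)\right) p(z \mid \Lambda),
\qquad
p(z \mid \Lambda) \propto \frac{R(z; \Lambda)}{1+z}\,\frac{\mathrm{d}V_c}{\mathrm{d}z}
$$

The factor $1/(1+z)$ converts the source-frame rate to the detector frame. In general the redshift prior also depends on the trial cosmology through the volume element, and a misspecified $R(z; \Lambda)$ shifts that prior and therefore the inferred $H_0$.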

3. Catalog Completeness, Selection Effects, and Systematics

Galaxy catalogs (e.g., GLADE+, HETDEX, DESI) are magnitude-limited and incomplete at high redshift, leading to systematic biases in the statistical redshift association (Turski et al., 19 May 2025, Naveed et al., 16 May 2025, Alfradique et al., 24 Mar 2025, Wang et al., 2024). The completeness as a function of redshift is modeled using Schechter luminosity functions, $\phi(L)\,\mathrm{d}L = \phi^{*}\,(L/L^{*})^{\alpha}\,e^{-L/L^{*}}\,\mathrm{d}L/L^{*}$, with redshift evolution of the Schechter parameters becoming critical at […] (Turski et al., 19 May 2025). Catalog incompleteness is mitigated via:

Out-of-catalog corrections using the luminosity function and host probabilities for galaxies fainter than the survey threshold.
Selection functions encoding the detectability of both host galaxies and GW events.
Volume-limited subcatalogs and bright-galaxy subsamples (BCGs, LRGs) that maximize completeness at depth (Naveed et al., 16 May 2025).
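
How quickly a magnitude-limited catalog loses completeness with distance can be sketched from a Schechter function alone. The example below estimates the luminosity-weighted completeness, i.e. the fraction of total galaxy luminosity above the flux limit at a given distance. The function names, the faint-end cutoff `x_floor`, the neglect of K-corrections and of redshift evolution, and the parameter values in the usage lines are all illustrative assumptions, not properties of any catalog named above.

```python
import math

X_MAX = 50.0  # upper integration limit in units of L*; the integrand is negligible beyond


def schechter_lum_integral(x_min, alpha, steps=4000):
    """Integrate x * x**alpha * exp(-x) dx from x_min to X_MAX (x = L/L*),
    using the trapezoidal rule in ln(x)."""
    ln_lo, ln_hi = math.log(x_min), math.log(X_MAX)
    h = (ln_hi - ln_lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = math.exp(ln_lo + i * h)
        f = x ** (alpha + 2.0) * math.exp(-x)  # extra power of x from dx = x dln(x)
        total += f * (0.5 if i in (0, steps) else 1.0)
    return total * h


def luminosity_completeness(d_l_mpc, m_lim, m_star, alpha, x_floor=1e-4):
    """Fraction of total luminosity in galaxies brighter than the flux limit."""
    # Faintest detectable absolute magnitude at this distance (no K-correction).
    abs_mag_lim = m_lim - 5.0 * math.log10(d_l_mpc * 1.0e6 / 10.0)
    x_min = max(10.0 ** (-0.4 * (abs_mag_lim - m_star)), x_floor)
    if x_min >= X_MAX:
        return 0.0
    return schechter_lum_integral(x_min, alpha) / schechter_lum_integral(x_floor, alpha)


# Hypothetical survey: limiting magnitude 17.5, M* = -20.5, faint-end slope -1.07.
near = luminosity_completeness(50.0, 17.5, -20.5, -1.07)
mid = luminosity_completeness(200.0, 17.5, -20.5, -1.07)
far = luminosity_completeness(1000.0, 17.5, -20.5, -1.07)
```

One minus this fraction is the weight that an out-of-catalog correction has to assign to undetected hosts, so it grows steeply once the flux limit corresponds to luminosities above $L^{*}$.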

Misestimation of the merger rate evolution, or misweighting of host probabilities (e.g., assuming equal probability for all galaxies vs. stellar-mass or star-formation-rate weighting), introduces percent-level biases in $H_0$, especially as statistical errors are reduced to the […] regime with […] events (Alfradique et al., 24 Mar 2025, Yu et al., 2022).

4. Precision Forecasts and Robustness

Forecasts for current and next-generation GW detector networks give the following levels of precision on $H_0$ with dark sirens:

O4/O5 (Advanced LVK): With stellar-mass weighting and near-complete catalogs ([…], […]), an $H_0$ precision of […] for 100 BNS events, degrading to 6% with equal weighting (Alfradique et al., 24 Mar 2025, Matos, 2024).
HETDEX+VIRUS silver/golden sirens ([…]): For 25 events, a precision of […]; for golden sirens only ([…] events), […] (Dang et al., 25 Dec 2025).
Voyager/NEMO networks: $\sim$90 events in 10 years yield […] precision (dark sirens only), or […] when bright sirens are included (Jin et al., 2023).
Einstein Telescope/Cosmic Explorer (3G): […] events per year, with projected percent to sub-percent statistical errors on $H_0$, conditional on knowledge of the redshift evolution of the galaxy mass function to […] (Wang et al., 2024). At this level, systematics such as population modeling errors or catalog incompleteness can dominate over statistical uncertainties (Yu et al., 2022, Wang et al., 2024).
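
The interplay between shrinking statistical errors and a fixed systematic floor can be illustrated with a short sketch. It assumes independent events with identical single-event fractional errors, so that the statistical term scales as $1/\sqrt{N}$, and adds a systematic term in quadrature. The function names and the numbers (30% per event, 1% systematic floor) are hypothetical and are not taken from the cited forecasts.

```python
import math


def total_error_pct(single_event_pct, n_events, systematic_pct):
    """Combined fractional error (in percent): stat/sqrt(N) and sys in quadrature."""
    return math.sqrt(single_event_pct ** 2 / n_events + systematic_pct ** 2)


def crossover_events(single_event_pct, systematic_pct):
    """Smallest N at which the statistical error no longer exceeds the systematic one."""
    return math.ceil((single_event_pct / systematic_pct) ** 2)


# Hypothetical inputs: 30% per-event error on H0 and a 1% systematic floor.
n_cross = crossover_events(30.0, 1.0)
err_at_cross = total_error_pct(30.0, n_cross, 1.0)
err_at_10x = total_error_pct(30.0, 10 * n_cross, 1.0)
```

Beyond the crossover, adding events barely reduces the total error: in this toy setup a tenfold larger sample improves it by only about a quarter, which is why the forecasts above are conditional on control of population and catalog systematics.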

“Peak Sirens” cross-correlation approaches show $\sim$4–7% errors on $H_0$ in LVK O5, improving to 0.6–1% with ET+2CE, and 3% on […], even under nonlinear structure, lensing, and masking uncertainties (Ferri et al., 2024).

5. Dark Sector Physics and Modified Gravity Constraints

Dark sirens enable competitive—and in some models, superior—constraints on dark matter–dark energy interactions and modified gravity:

Interacting DM–DE models: Addition of $\sim$1000 GW events can reduce 1$\sigma$ constraints on the coupling parameter by factors of […] compared to CMB alone, achieving […] accuracy for […] (Yang et al., 2019, Bachega et al., 2019, Li et al., 2023, Bonilla et al., 2021, Cai et al., 2017).
Tests of GW propagation: Modifications parameterized through the ratio $d_L^{\rm GW}(z)/d_L^{\rm EM}(z)$ can be measured to […] (or below 1% with 3G) using […] dark sirens (Mukherjee et al., 2020, Belgacem et al., 2018, Wolf et al., 2019).
Gaussian-process methods enable model-independent detection or constraints on dynamic or interacting dark-sector physics, leveraging the Hubble diagram reconstructed from many dark sirens (Bonilla et al., 2021, Cai et al., 2017).

6. Methodological Innovation and Future Prospects

Recent directions include:

Population-based ("statistical dark siren") methods that do not depend on identifying host galaxies, but instead match observed $d_L$ distributions to theoretical BBH population models, calibrated with hierarchical Bayesian or Fisher-matrix/Cramér–Rao bound techniques (Yu et al., 2022).
Incorporation of neutron star tidal effects (“Love sirens”) for redshift extraction without EM counterparts (Li et al., 2023).
Hybrid “bright galaxy subset” strategies using only luminous galaxies to optimize catalog completeness at high redshift (Naveed et al., 16 May 2025).
Purely model-independent standard-ruler construction via cross-correlation of GW event positions and galaxy maps (“Peak Sirens”), resilient to population and catalog systematics (Ferri et al., 2024).

With the expected volume of detections from 3G GW detectors and Stage IV optical and radio surveys (LSST, DESI, Euclid, SKA), dark standard sirens will become foundational for precision cosmology and for probing the physics of cosmic acceleration. Achieving sub-percent accuracy, however, will demand stringent control of galaxy population modeling, selection effects, and systematic uncertainties inherent to both GW and catalog surveys (Wang et al., 2024, Alfradique et al., 24 Mar 2025, Yu et al., 2022).