DES-Dovekie: Segmentation & Calibration
- DES-Dovekie covers two frameworks: one applies mixture models to semi-supervised segmentation of Dovekie vocalizations, and the other performs precise photometric calibration in cosmology.
- It leverages von Mises–Fisher distributions to construct directional embeddings and achieve high performance metrics even under noisy conditions.
- In cosmology, the method rigorously propagates filter uncertainties into supernova parameter inferences, improving calibration for DES and Pantheon+ surveys.
DES-Dovekie refers to two rigorous, technically distinct frameworks at the intersection of signal segmentation in bioacoustics and precision calibration in cosmic distance scale studies. In avian signal processing, DES (Directional Embedding-based Semi-supervised segmentation) has been adapted for robust segmentation of Dovekie (Alle alle) vocalizations using mixture models of von Mises–Fisher distributions. Separately, in cosmology, Dovekie denotes a cross-calibration solution that quantifies and controls photometric calibration uncertainties in the DES 5YR and related supernova surveys, crucially propagating these uncertainties into cosmological parameter inference. Both usages exemplify the application of advanced statistical methodology and open-source infrastructure to challenging real-world measurement domains.
1. Directional Embedding-based Semi-supervised Segmentation for Dovekie Vocalizations
The DES framework for bird vocalization segmentation is constructed as a data-efficient, semi-supervised, two-pass pipeline, designed to discriminate bird calls from complex acoustic backgrounds. Its extension to Dovekie calls involves the following sequential stages:
- Reference Directional Model via moVMF:
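- Construct “super-frames” by concatenating $w$ consecutive STFT magnitude frames, forming unit-norm vectors in $\mathbb{R}^{d}$, where $d$ is the super-frame dimension.
- Fit a $Z$-component mixture of von Mises–Fisher distributions (moVMF) to the distribution of super-frames representing Dovekie vocalizations. This is formalized as $p(\mathbf{x}) = \sum_{z=1}^{Z} \pi_z\, C_d(\kappa_z) \exp(\kappa_z \boldsymbol{\mu}_z^{\top} \mathbf{x})$, where $\pi_z$ are mixture weights, $\boldsymbol{\mu}_z$ are unit-norm mean directions, $\kappa_z$ are concentrations, and $C_d(\kappa_z)$ is the von Mises–Fisher normalizing constant. The parameters are estimated via an EM algorithm maximizing the likelihood over examples from labeled bird frames, and poorly concentrated components (low $\kappa_z$) are pruned.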
- Directional Embedding (DE) Construction:
- The learned mean directions $\boldsymbol{\mu}_z$ define the columns of a dictionary matrix $\mathbf{D}$.
- For any new recording, super-frames are projected onto $\mathbf{D}$: $\mathbf{Y} = \mathbf{D}^{\top} \mathbf{X}$, where $\mathbf{X}$ contains unit-norm super-frames as columns. High-magnitude DE responses indicate the presence of calls.
- Two-pass Semi-supervised Segmentation:
- Pass 1: Automatically infer training labels by computing the mutual information (MI) between softmax-normalized consecutive DE vectors. Low-MI frames are labeled “bird” and high-MI frames “background,” forming a balanced labeled set of size $2Q$ (typical values of $Q$ are listed in the hyperparameter table in Section 2).
- Pass 2: Train a discriminative classifier (a support vector machine with a cubic polynomial kernel) on these auto-labeled DEs and classify all frames in the recording.
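The super-frame and directional-embedding steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `make_superframes` and `directional_embedding` are hypothetical, the spectrogram is random placeholder data, and the dictionary `D` is filled with random unit vectors standing in for the mean directions that an moVMF fit would supply.

```python
import numpy as np

def make_superframes(spec, w):
    """Stack w consecutive magnitude frames into unit-norm super-frames.

    spec: array of shape (n_bins, n_frames), one STFT magnitude frame per column.
    Returns an array of shape (n_bins * w, n_frames - w + 1).
    """
    n_bins, n_frames = spec.shape
    # order="F" concatenates the w frames one after another
    cols = [spec[:, t:t + w].reshape(-1, order="F") for t in range(n_frames - w + 1)]
    X = np.stack(cols, axis=1)
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    return X / np.maximum(norms, 1e-12)  # guard against all-zero frames

def directional_embedding(D, X):
    """Project unit-norm super-frames (columns of X) onto the dictionary D."""
    return D.T @ X

rng = np.random.default_rng(0)
spec = np.abs(rng.normal(size=(8, 20)))   # placeholder spectrogram, not real audio
X = make_superframes(spec, w=3)           # shape (24, 18)

# Placeholder dictionary: 10 random unit-norm directions (assumption; a real
# dictionary would hold the moVMF mean directions).
D = rng.normal(size=(24, 10))
D /= np.linalg.norm(D, axis=0, keepdims=True)

DE = directional_embedding(D, X)          # shape (10, 18)
```

Because both the super-frames and the dictionary columns have unit norm, every entry of `DE` is a cosine similarity in $[-1, 1]$, which is what makes high-magnitude responses interpretable as closeness to a reference call direction.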
Dovekie-specific adaptation involves tuning the temporal context $w$ to the call length, selecting up to 15 moVMF components, and calibrating $Q$ to the call/background balance in field data.
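The Pass 1 selection of a balanced training set of size $2Q$ can be sketched as follows. This is an illustrative sketch only: the function name `auto_label` is hypothetical, the per-frame MI scores are taken as a precomputed input (random placeholders here) because the exact MI estimator is not specified in this text, and the label convention (1 for bird, 0 for background) is an assumption.

```python
import numpy as np

def auto_label(de, scores, q):
    """Pick the q lowest-score frames as 'bird' and the q highest as 'background'.

    de:     array of shape (Z, n_frames), directional embeddings.
    scores: array of shape (n_frames,), per-frame MI values (assumed precomputed).
    Returns (features, labels) with shapes (2q, Z) and (2q,).
    """
    if 2 * q > scores.shape[0]:
        raise ValueError("need at least 2*q frames to build a balanced set")
    order = np.argsort(scores)
    bird_idx = order[:q]          # low MI  -> labeled bird (1)
    background_idx = order[-q:]   # high MI -> labeled background (0)
    idx = np.concatenate([bird_idx, background_idx])
    features = de[:, idx].T
    labels = np.concatenate([np.ones(q, dtype=int), np.zeros(q, dtype=int)])
    return features, labels

rng = np.random.default_rng(1)
de = rng.normal(size=(10, 200))   # placeholder embeddings
scores = rng.random(200)          # placeholder MI scores
features, labels = auto_label(de, scores, q=50)
```

The resulting arrays could then be passed to any classifier with a cubic polynomial kernel for Pass 2, for example scikit-learn's `SVC(kernel="poly", degree=3)`.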
2. Performance Metrics and Benchmarking for DES-Dovekie
The framework is validated using frame-level metrics (precision, recall, and $F_1$-score) evaluated over 79,000 vocalizations from seven bird species. For Dovekie, analogous procedures are followed, yielding average clean-condition $F_1$-scores in the range 0.77–0.83 on test species under cross-validation. Robustness to additive acoustic noise is demonstrated by a relative drop of only 3.6% as the SNR decreases from 20 to 0 dB, compared with drops of 12–17% for unsupervised methods. Cross-species generalization incurs only a minor loss (up to 0.81 versus 0.80–0.83), supporting the generality of the approach.
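For reference, frame-level precision, recall, and $F_1$ can be computed from binary per-frame labels as in the sketch below. The function name `frame_metrics` and the toy label arrays are illustrative assumptions; they are not data from the evaluation described above.

```python
import numpy as np

def frame_metrics(y_true, y_pred):
    """Return (precision, recall, F1) for binary frame labels (1 = bird)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))    # bird frames correctly detected
    fp = int(np.sum(~y_true & y_pred))   # background frames flagged as bird
    fn = int(np.sum(y_true & ~y_pred))   # bird frames missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy example: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
precision, recall, f1 = frame_metrics(y_true, y_pred)
```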
Key hyperparameters from the application to Dovekie are:
| Parameter | Value/Range | Description |
|---|---|---|
| STFT window | 20 ms, 50% overlap | Baseline time–frequency analysis configuration |
| w (context) | 3–5 | Number of frames in a super-frame (adjust for call length) |
| Z (moVMF) | 10 (after pruning) | Number of reference directions |
| Q | 1000–3000 | Frames per class for Pass 1 auto-labeling |
| Classifier | SVM, cubic kernel | Classifies DEs in Pass 2 |
A plausible implication is that this framework achieves competitive performance with minimal human-labeled data and can be quickly adapted to new species with modest parameter tuning.
3. Dovekie Cross-Calibration for DES 5YR Cosmology
In supernova cosmology, Dovekie implements an open-source photometric cross-calibration method that addresses color-dependent zeropoint errors across surveys (CfA3/4, CSP, SDSS, SNLS, PS1, Foundation, DES). The method models each survey filter’s effective transmission and allows perturbations to it. Observed magnitude residuals are then modeled in terms of these perturbations.
Filter-shape uncertainties are parameterized either as rigid wavelength shifts ($\Delta\lambda$) or as photon- versus energy-weighted transformations. Dovekie’s innovation lies in quantifying uncertainties on both zeropoints and passband shapes for all filters, informed by direct DA white dwarf observations (Boyd et al., 2025) as well as standard stars (CALSPEC, NGSL2).
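To see why a rigid filter shift produces a color-dependent magnitude error, the sketch below computes synthetic magnitudes of a blue and a red toy spectrum through a Gaussian passband before and after a +30 Å shift. Everything here is an illustrative assumption rather than Dovekie's code: the function `synth_mag`, the Gaussian passband, the power-law spectra, and the arbitrary zeropoint (only magnitude differences are meaningful).

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def synth_mag(wave, flux, filt_wave, filt_trans, shift=0.0, photon_weighted=True):
    """Synthetic magnitude (arbitrary zeropoint) through a rigidly shifted filter.

    shift is in the same units as wave (Angstrom here); photon_weighted=True
    weights the integrand by wavelength, False gives energy weighting.
    """
    trans = np.interp(wave, filt_wave + shift, filt_trans, left=0.0, right=0.0)
    weight = trans * wave if photon_weighted else trans
    return -2.5 * np.log10(_trapezoid(flux * weight, wave) / _trapezoid(weight, wave))

wave = np.arange(3000.0, 7001.0, 1.0)
filt_trans = np.exp(-0.5 * ((wave - 5000.0) / 300.0) ** 2)  # toy passband
blue = (wave / 5000.0) ** -3.0   # toy blue spectrum, flux falling with wavelength
red = wave / 5000.0              # toy red spectrum, flux rising with wavelength

dm_blue = (synth_mag(wave, blue, wave, filt_trans, shift=30.0)
           - synth_mag(wave, blue, wave, filt_trans))
dm_red = (synth_mag(wave, red, wave, filt_trans, shift=30.0)
          - synth_mag(wave, red, wave, filt_trans))
```

The blue source gets fainter and the red source gets brighter under the same shift, so the induced error cannot be absorbed by a single zeropoint offset. That is the motivation for fitting passband shapes and zeropoints together.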
The pipeline proceeds through:
- Filter slope estimation (color–magnitude regression using synthetic and real catalogs)
- Tertiary-star and DA WD likelihood estimation
- Simulation-based bias correction on fitted slopes
- Joint MCMC inference of zeropoint offsets using the No-U-Turn Sampler
This enables rigorous error propagation into cosmological inferences from supernova Hubble diagrams.
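The first pipeline stage, a color–magnitude regression, can be illustrated with an ordinary least-squares fit on simulated residuals. This is a deliberately simplified stand-in: the actual pipeline uses tertiary-star and DA WD likelihoods with simulation-based bias correction and NUTS sampling, none of which is reproduced here. All numbers below (slope, offset, scatter, color range) are synthetic assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic tertiary-star sample (assumed values, not survey data).
color = rng.uniform(0.2, 1.2, size=500)   # stellar color in mag
true_slope, true_offset = 0.010, 0.004    # mag per mag of color, and mag
scatter = 0.005                           # per-star photometric noise in mag

# Residual between observed and synthetic magnitudes, linear in color.
resid = true_offset + true_slope * color + rng.normal(0.0, scatter, size=500)

# np.polyfit returns coefficients from highest degree to lowest.
slope, offset = np.polyfit(color, resid, deg=1)
```

A nonzero fitted slope signals a mismatch between the assumed and true passband, while the fitted offset plays the role of a zeropoint error.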
4. Quantitative Impact on DES and Pantheon+ Samples
Calibration refinements to filter shapes and zeropoints drive concrete improvements:
- Out of 52 filters, 19 required shifts up to Å; e.g., SNLS were shifted +30 Å.
- New systematic photometric uncertainty for Pantheon+ Flat-CDM: , a reduction versus previous (0.023).
- Typical zeropoint errors now reach 0.003 mag for PS1/DES/SDSS/SNLS.
- Sensitivity of distance modulus to calibration slope: .
- Systematic shift in in flat CDM: .
- Initial and uncertainties: , .
Model retraining using SALT3 surfaces propagates these calibration changes, resulting in up to $\sim 3\times$ amplification of small calibration errors in the final Hubble diagram because of the sensitivity of the color–luminosity relation. Even 5 mmag calibration tilts can induce 15 mmag distance offsets after the Tripp estimator is applied.
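The amplification can be traced through the Tripp estimator. The short derivation below is a sketch under two assumptions: a color–luminosity coefficient $\beta \approx 3$, which is a typical fitted value and is not quoted in this text, and a calibration tilt that enters purely as an error $\delta c$ in the fitted color.

$$
\mu = m_B + \alpha x_1 - \beta c - M
\quad\Longrightarrow\quad
\delta\mu \approx -\beta\,\delta c,
\qquad
|\delta\mu| \approx 3 \times 5\ \text{mmag} = 15\ \text{mmag}.
$$

Here $m_B$ is the peak apparent magnitude, $x_1$ the light-curve stretch, $c$ the color, and $M$ the absolute magnitude. The factor of three is consistent with the 5 mmag to 15 mmag figures quoted above.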
5. Pipeline Infrastructure and Survey Extension
The Dovekie software repository (https://github.com/bap37/Dovekie/) provides curated ASCII bandpasses, tertiary and DA WD catalogs, YAML-based survey configuration, Gaia XP ingestion, full simulation tools, and the Markov Chain Monte Carlo calibration pipeline. SALT3 retraining is supported via configuration files, enabling systematic propagation.
To add a new survey:
- Import its filter response files.
- Provide tertiary catalog matched to PS1 photometry.
- Update the YAML configuration; optionally provide DA WD observations.
- Rerun the Dovekie pipeline to derive new shifts, zeropoints, and covariance matrices.
- Propagate calibration through retrained SALT3 light-curve surfaces into cosmology.
This survey-agnostic, reproducible architecture supports community-wide cross-survey harmonization and robust error accounting in precision cosmology.
6. Broader Context and Significance
In both fields, DES-Dovekie exemplifies the robust application of modern mixture-model inference, discriminative embedding construction, and rigorous uncertainty quantification. In bioacoustics, the transferability and resilience of the DES pipeline enable rapid adaptation to diverse species, including Dovekie, with minimal manual effort and strong noise robustness. In cosmology, intra- and inter-survey calibration accuracy directly determines the reliability of inferences about cosmic acceleration and the nature of dark energy; Dovekie provides the first public, extensible solution with fully-propagated uncertainties and direct DA WD anchoring. A plausible implication is that further progress in both areas will rely increasingly on end-to-end, modular, and fully-documented frameworks of this character, supporting open, collaborative scientific workflows.