Gravity Spy: Glitch Classification in GW Data

Updated 24 August 2025

Gravity Spy is a comprehensive framework that integrates citizen science with machine learning to classify transient glitches in gravitational-wave detector data.
It uses deep convolutional neural networks and probabilistic aggregation of volunteer annotations for efficient classification of diverse glitch morphologies.
The project improves detector characterization by enabling real-time glitch identification, guiding data quality enhancements and supporting astrophysical discoveries.

The Gravity Spy Project is a comprehensive framework for the classification and characterization of transient noise artifacts—known as “glitches”—in gravitational-wave detector data, primarily from the Advanced LIGO and Virgo observatories. By merging the efforts of citizen-science volunteers with advanced machine-learning (ML) techniques, Gravity Spy addresses both the scale and complexity of glitch identification, enabling more reliable gravitational-wave searches and improved detector characterization.

1. Citizen Science–Machine Learning Integration

Gravity Spy employs a dual approach that combines human pattern recognition with ML automation. Volunteers, recruited through the Zooniverse platform, classify Omega scan spectrogram images of glitches into known or novel morphological classes. In tandem, deep convolutional neural networks (CNNs) are trained on the pooled expert- and volunteer-provided labels to assign probability scores to glitches across up to 20 predefined classes (Zevin et al., 2016).

The workflow incorporates a symbiotic loop:

Volunteers provide large, manually labeled training sets and excel at identifying new glitch phenomena.
ML models, notably CNNs, ingest and learn from these classifications, processing millions of images rapidly and flagging ambiguous cases for further human review.
The ML system informs and trains volunteers by routing images according to classification confidence, deploying beginning, intermediate, and advanced workflows to optimize volunteer progression.

A combined classifier fuses ML confidence vectors with citizen annotations through probabilistic aggregation, formally given by:

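$\tilde{y}_i = \arg\max_{j} \frac{p(y_i^{cr}=j\,|\,\hat{y}_i^1, \cdots, \hat{y}_i^{R_i}) + \mathbf{p}^{ML}_i(j)}{\sum_{k=1}^{C} \left(p(y_i^{cr}=k\,|\,\hat{y}_i^1, \cdots, \hat{y}_i^{R_i}) + \mathbf{p}^{ML}_i(k)\right)}$

Here $\hat{y}_i^1, \cdots, \hat{y}_i^{R_i}$ are the $R_i$ volunteer labels collected for glitch $i$, $p(y_i^{cr}=j\,|\,\cdot)$ is the resulting crowd posterior for class $j$, $\mathbf{p}^{ML}_i(j)$ is the CNN probability for class $j$, and $C$ is the number of classes. The denominator does not depend on $j$, so it only normalizes the fused scores. This fusion improves retirement efficiency and lets the classifier adapt as detector behavior evolves (Zevin et al., 2016, Zevin et al., 2023).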
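The fusion rule above can be illustrated with a minimal Python sketch. The function name `combine_labels` and the three-class example values are hypothetical; the sketch assumes the crowd posterior has already been computed elsewhere and simply adds it to the ML probability vector, normalizes, and takes the arg max.

```python
def combine_labels(p_crowd, p_ml):
    """Fuse a crowd posterior and an ML probability vector for one glitch.

    p_crowd[j] stands for p(y_i^cr = j | volunteer labels) and p_ml[j] for
    p_i^ML(j). Returns (predicted class index, normalized fused scores).
    """
    if len(p_crowd) != len(p_ml):
        raise ValueError("both vectors must cover the same C classes")
    summed = [c + m for c, m in zip(p_crowd, p_ml)]
    total = sum(summed)  # normalizer; it does not change the arg max
    fused = [s / total for s in summed]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Hypothetical three-class example: the crowd is split, the CNN is not.
label, fused = combine_labels([0.5, 0.4, 0.1], [0.1, 0.8, 0.1])
```

In this toy case the crowd slightly favors class 0, but the confident ML vote moves the fused decision to class 1.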

2. Machine Learning Architectures and Algorithms

Gravity Spy’s primary classification engine is a deep CNN trained on Omega scan inputs representing combined views across four time durations (typically 0.5–4.0 s), capturing the diverse morphologies of glitches. The CNN architecture includes:

Stacked convolutional layers with max-pooling and ReLU activations ( $\text{ReLU}(x) = \max(0, x)$ ),
Fully connected layers,
A softmax output producing probability vectors over all classes.

The softmax output for image $i$ is:

$o_i^c = \frac{e^{w_c^T x_i}}{\sum_{c'=1}^{C} e^{w_{c'}^T x_i}}$

where $x_i$ is the penultimate layer output for image $i$, $w_c$ is the weight vector for class $c$, and $C$ is the total number of classes.

The cross-entropy loss minimized during training is:

$\text{loss} = -\sum_{j=1}^{N} \sum_{c=1}^{C} y_j^c \log o_j^c$

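where $N$ is the number of training images and $y_j^c$ is the true label indicator, equal to 1 if image $j$ belongs to class $c$ and 0 otherwise.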
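As a small numerical illustration of the softmax and cross-entropy expressions above, the following sketch evaluates both in plain Python. The logits stand in for the products $w_c^T x$; the sample values, the function names, and the `eps` guard are illustrative assumptions and not part of the Gravity Spy code.

```python
import math


def softmax(logits):
    """Turn class scores w_c^T x into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_labels, probabilities, eps=1e-12):
    """Sum of -y_j^c * log(o_j^c) over all samples j and classes c."""
    loss = 0.0
    for y_row, o_row in zip(one_hot_labels, probabilities):
        for y, o in zip(y_row, o_row):
            loss -= y * math.log(max(o, eps))  # eps guards against log(0)
    return loss


# Two hypothetical samples with three classes each.
logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
labels = [[1, 0, 0], [0, 0, 1]]
probs = [softmax(row) for row in logits]
loss = cross_entropy(labels, probs)
```

Only the probability assigned to the true class of each sample contributes to the loss, since every other $y_j^c$ is zero.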

Input representation uses the Q-transform, with normalized energy:

$Z = \frac{|X|^2}{\langle |X|^2 \rangle}$

where $|X|$ denotes Q-transform magnitude and $\langle |X|^2 \rangle$ is the mean-squared value under stationary noise (Zevin et al., 2016).
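The normalization can be sketched for a handful of complex Q-transform coefficients. This is only an illustration: the coefficient values are made up, and the noise mean-square is passed in as a parameter because the text does not say how it is estimated.

```python
def normalized_energy(tiles, noise_mean_square):
    """Return Z = |X|^2 / <|X|^2> for each complex Q-transform coefficient."""
    if noise_mean_square <= 0:
        raise ValueError("noise_mean_square must be positive")
    return [abs(x) ** 2 / noise_mean_square for x in tiles]


# Hypothetical coefficients for one frequency row; the third tile is loud.
row = [1 + 1j, 1 - 1j, 6 + 8j, -1 + 1j]
z = normalized_energy(row, noise_mean_square=2.0)
```

Tiles consistent with the noise level map to values near 1, while the loud tile stands out with a normalized energy of about 50.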

Recent advances include intermediate (feature-level) multi-view fusion, inception residual blocks, label smoothing, and attention-based weighting. The attention mechanism on fused features uses weights $\alpha_i$ assigned via gated matrices $V$ and $U$ :

$\alpha_i = \frac{\exp\{w^{\top}[\tanh(Vz_i^{\top}) \odot \sigma(Uz_i^{\top})]\}}{\sum_j \exp\{w^{\top}[\tanh(Vz_j^{\top}) \odot \sigma(Uz_j^{\top})]\}}$

The final feature vector is then the weighted sum $\mathcal{M}_{\text{Att}} = \sum_{i} \alpha_i z_i$ (Wu et al., 23 Jan 2024).
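The gated attention weighting can be sketched in plain Python as follows. The toy dimensions (four feature vectors of length 3, hidden size 2) and all numerical values are hypothetical, and the helper names are not taken from the cited work; the sketch only mirrors the formula for $\alpha_i$ and the weighted sum.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def gated_attention(views, V, U, w):
    """Return (alphas, fused) for feature vectors z_i.

    score_i = w^T [tanh(V z_i) * sigmoid(U z_i)], alphas = softmax(scores),
    fused = sum_i alpha_i * z_i.
    """
    scores = []
    for z in views:
        gate = [math.tanh(a) * sigmoid(b)
                for a, b in zip(matvec(V, z), matvec(U, z))]
        scores.append(sum(wk * gk for wk, gk in zip(w, gate)))
    shift = max(scores)  # stabilizes the softmax
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    fused = [sum(a * z[k] for a, z in zip(alphas, views))
             for k in range(len(views[0]))]
    return alphas, fused


# Hypothetical toy sizes: 4 vectors, feature dimension 3, hidden dimension 2.
views = [[0.2, 0.1, 0.0], [1.0, 0.5, -0.3], [0.0, 0.0, 0.0], [0.4, -0.2, 0.9]]
V = [[0.5, -0.1, 0.3], [0.2, 0.4, -0.6]]
U = [[0.1, 0.7, -0.2], [-0.3, 0.2, 0.5]]
w = [1.0, -0.5]
alphas, fused = gated_attention(views, V, U, w)
```

The weights are positive and sum to 1, so the fused vector is a convex combination of the inputs; identical inputs receive equal weight.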

3. Unsupervised and Similarity-based Glitch Discovery

Beyond supervised CNN classification, Gravity Spy incorporates unsupervised and similarity-learning methodologies to discover novel glitches:

Similarity learning employs the DIRECT algorithm, which uses a VGG16 network followed by a dense layer to map images into a 200-dimensional space and is trained with a contrastive loss:

$L = \sum_{i=1}^N [y^i \cdot \text{dist}(f_\theta(x_1^i), f_\theta(x_2^i)) + (1-y^i) \cdot \max\{0, m - \text{dist}(f_\theta(x_1^i), f_\theta(x_2^i))\} ]$

where $\text{dist}(\cdot,\cdot)$ is the cosine distance, $m$ is a margin, and $y^i = 1$ marks a similar pair while $y^i = 0$ marks a dissimilar one. Volunteers use an interactive similarity search tool to cluster unlabeled glitches into collections, accelerating discovery of rare events (Coughlin et al., 2019).
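The contrastive loss can be evaluated by hand on a few embedding pairs. In this sketch the 2-D embeddings stand in for the 200-dimensional outputs of $f_\theta$, and the margin value 0.5 is an arbitrary choice for illustration, not a value reported in the source.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def contrastive_loss(pairs, margin):
    """Sum the loss over (embedding_1, embedding_2, y) triples.

    y = 1 marks a similar pair (penalized by its distance); y = 0 marks a
    dissimilar pair (penalized only while closer than the margin).
    """
    loss = 0.0
    for emb_1, emb_2, y in pairs:
        d = cosine_distance(emb_1, emb_2)
        loss += y * d + (1 - y) * max(0.0, margin - d)
    return loss


# Hypothetical 2-D embeddings standing in for the 200-dimensional outputs.
pairs = [
    ([1.0, 0.0], [2.0, 0.0], 1),  # similar and parallel: contributes 0
    ([1.0, 0.0], [0.0, 1.0], 0),  # dissimilar, distance 1 >= margin: 0
    ([1.0, 0.0], [1.0, 0.0], 0),  # dissimilar but identical: contributes m
]
loss = contrastive_loss(pairs, margin=0.5)
```

Only the third pair is penalized: it is labeled dissimilar yet its embeddings coincide, so it contributes the full margin.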

Unsupervised clustering approaches leverage VAEs for dimensionality reduction and invariant information clustering (IIC) to maximize mutual information between perturbed image pairs:

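$\max_\Phi I(\Phi(x), \Phi(x')), \quad I(\Phi(x), \Phi(x')) = \sum_{i=1}^C \sum_{j=1}^C P_{ij} \ln \frac{P_{ij}}{P_i P_j}$

where $P_{ij}$ is the joint probability that an image is assigned to class $i$ and its perturbed counterpart to class $j$, and $P_i$ and $P_j$ are the corresponding marginal probabilities (Sakai et al., 2021, Sakai et al., 2022).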
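To make the mutual-information objective concrete, the sketch below estimates $P_{ij}$ from paired soft assignments and evaluates the double sum. Building $P$ as the averaged outer product of the two assignment vectors and then symmetrizing it follows common IIC practice and is an assumption here, as are the function name and the toy inputs.

```python
import math


def iic_mutual_information(probs_a, probs_b, eps=1e-12):
    """Mutual information between soft cluster assignments of paired images.

    probs_a[n] and probs_b[n] are the class-probability vectors for an image
    and its perturbed copy. P is their averaged outer product, symmetrized.
    """
    n, c = len(probs_a), len(probs_a[0])
    joint = [[0.0] * c for _ in range(c)]
    for pa, pb in zip(probs_a, probs_b):
        for i in range(c):
            for j in range(c):
                joint[i][j] += pa[i] * pb[j] / n
    joint = [[(joint[i][j] + joint[j][i]) / 2 for j in range(c)]
             for i in range(c)]
    row = [sum(joint[i]) for i in range(c)]                       # P_i
    col = [sum(joint[i][j] for i in range(c)) for j in range(c)]  # P_j
    mi = 0.0
    for i in range(c):
        for j in range(c):
            p = joint[i][j]
            if p > eps:  # terms with P_ij = 0 contribute nothing
                mi += p * math.log(p / (row[i] * col[j]))
    return mi


# Confident, consistent assignments over two balanced clusters.
consistent = [[1.0, 0.0], [0.0, 1.0]]
mi_consistent = iic_mutual_information(consistent, consistent)
# Uninformative assignments carry no information about the pairing.
uniform = [[0.5, 0.5], [0.5, 0.5]]
mi_uniform = iic_mutual_information(uniform, uniform)
```

With two balanced clusters the objective reaches its maximum of $\ln 2$ when assignments are confident and agree across each pair, and drops to zero when every image gets the same uniform assignment.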

Experiments on Gravity Spy datasets have shown that unsupervised clusters align well with expert labels and can reveal latent subclasses, with clustering accuracy up to 90.9% (Sakai et al., 2022).

4. Operational Deployment and Data Quality Findings

Gravity Spy’s ML models are continuously retrained and expanded as new glitch morphologies are discovered.

As of O3, the CNN model classified 233,981 glitches in Hanford and 379,805 in Livingston across 23 classes, with confidence thresholds informing class purity (Glanzer et al., 2022).
Glitch rates differ significantly between detectors: for instance, Livingston exhibits vastly higher rates of Fast Scattering glitches, attributed to site-specific environmental conditions.
Visualization of SNR distributions and weekly rates illustrates the need for site-tailored data quality studies and mitigation strategies.

The Gravity Spy catalog has been instrumental in guiding LIGO and Virgo detector characterization, with its classifications supporting efforts to reduce dominant glitch classes via commissioning (Glanzer et al., 2022, Zevin et al., 2023).

5. Specialized Signal-vs-Glitch Classification

For candidate event validation, Gravity Spy has evolved dedicated signal-vs-glitch classifiers:

Decision-tree architectures (GSpyNetTree) dispatch candidate events by total mass to specialized CNNs for low-mass, high-mass, and extremely-high-mass (EHM) ranges. The EHM classifier can stretch spectral features using a Mercator projection for finer discrimination (Alvarez-Lopez et al., 2023, Jarov et al., 2023).
Augmented and balanced training sets (including random time offset augmentation) have improved GW signal detection accuracy from 52% (original model) to 97%, with glitch classification reaching 99%.
These methods have demonstrated high reliability in O3b, correctly classifying 100% of confirmed astrophysical events as signals and flagging 75% of retracted candidates as non-astrophysical (Jarov et al., 2023).
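The mass-based dispatch in the first item above can be sketched as a simple routing function. The source does not state the mass boundaries used by GSpyNetTree, so they are left as caller-supplied parameters here; the function name, the returned labels, and the example boundaries are placeholders for illustration only.

```python
def select_classifier(total_mass, low_high_boundary, high_ehm_boundary):
    """Pick which mass-specific classifier should score a candidate.

    Both boundaries (in solar masses) are caller-supplied; the source does
    not state the values used by GSpyNetTree.
    """
    if not 0 < low_high_boundary < high_ehm_boundary:
        raise ValueError("boundaries must be positive and increasing")
    if total_mass < low_high_boundary:
        return "low_mass"
    if total_mass < high_ehm_boundary:
        return "high_mass"
    return "extremely_high_mass"


# Placeholder boundaries chosen only to exercise the three branches.
choices = [select_classifier(m, 50.0, 250.0) for m in (10.0, 120.0, 400.0)]
```

Each candidate is scored by exactly one specialized CNN, which lets each network train on the spectrogram morphologies typical of its mass range.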

A current limitation is that overlapping signal–glitch scenarios can challenge these models; multi-label classifier frameworks are under development to address such cases (Alvarez-Lopez et al., 2023).

6. Expansion Across Observatories and Taxonomy Evolution

Gravity Spy’s taxonomy and methods have expanded to other detectors, notably KAGRA:

In the O3GK analysis, the hierarchical veto (Hveto) algorithm was used to statistically correlate main channel glitches with auxiliary subsystems, employing Q-transform-based spectrograms for visual examination (Akutsu et al., 26 Jun 2025).
Morphological classification adapted from Gravity Spy has identified four traditional classes and two novel types—“dot” and “line”—unique to KAGRA, based on time–frequency morphology. This extension of the taxonomy provides crucial context for coordinated glitch studies across the global GW detector network.

Visual similarity and frequency-domain coherence analysis link glitches across channels, informing subsystem diagnostics and data quality improvement. The discovery of KAGRA-specific glitch types suggests necessary updates to Gravity Spy training sets and methodologies (Akutsu et al., 26 Jun 2025).

7. Future Directions and Advanced Volunteer Involvement

Gravity Spy continues to evolve in both technical and community-science dimensions:

Gravity Spy 2.0 will introduce volunteer tasks focused on cross-channel similarity, curated collections, and network analysis linking glitches to subsystem causes (Zevin et al., 2023).
Advanced interfaces, training workflows, and toolkits—including similarity search and ensemble clustering—promote discovery at a level comparable to domain experts.
Ongoing improvements in classifier architecture—such as inception residual blocks, feature-level fusion, label smoothing, and attention modules—promise enhanced accuracy, interpretability, and detector sensitivity (Wu et al., 23 Jan 2024).

A plausible implication is that the increasing sophistication of Gravity Spy’s approach will enable real-time online flagging of novel glitches and deep involvement of volunteers in causal analysis, supporting the long-term goal of maximizing gravitational-wave discovery potential through improved data quality and noise mitigation.

In sum, Gravity Spy exemplifies a multifaceted strategy for glitch classification and discovery in gravitational-wave data, uniting citizen science and advanced ML to address evolving challenges in detector characterization, data quality, and astrophysical event validation (Zevin et al., 2016, Coughlin et al., 2019, Sakai et al., 2021, Bahaadini et al., 2022, Zevin et al., 2023, Wu et al., 23 Jan 2024, Akutsu et al., 26 Jun 2025).