CNN-RNN Earthquake Detector (CRED)
- The paper presents CRED, a 12-layer deep residual network combining convolutional neural networks and Bi-LSTM layers for detecting weak seismic events.
- It processes three-component STFT spectrograms to achieve over 99% recall and precision exceeding 96% while maintaining low false positive rates.
- The approach generalizes across regions and outperforms traditional methods like STA/LTA and template matching, enabling scalable real-time seismic monitoring.
The CNN-RNN Earthquake Detector (CRED) is a deep learning-based framework for earthquake signal detection that leverages the joint capacity of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) within a residual architecture. Developed to enhance the detection of small, weak, and variable earthquake signals in noisy recordings, CRED is distinguished by its use of bi-directional long short-term memory (Bi-LSTM) units and deep convolutional layers to analyze the time-frequency characteristics of three-component seismograms at single stations. The method has demonstrated high sensitivity, robust performance under variable noise, and operational efficiency for large-scale seismic monitoring tasks (Mousavi et al., 2018).
1. Network Architecture
CRED is implemented as a 12-layer residual deep neural network with distinct convolutional and recurrent blocks:
- Input: Three-component STFT (short-time Fourier transform) spectrograms derived from 30 s seismograms sampled at 100 Hz (band-pass filtered from 1–45 Hz).
- Convolutional Residual Blocks: The first six layers comprise three residual stages. Each stage contains two residual blocks, each with two convolutional layers (preceded by Batch-Normalization and ReLU), with shortcut connections of the form $y = \mathcal{F}(x) + x$, where $\mathcal{F}(x)$ encapsulates the block's convolutional transformations.
- Stage 1: 8 filters (3×3 kernels), preceded by a down-sampling convolution (8 filters, 9×9 kernel, stride 2×2)
- Stage 2: 16 filters (3×3), preceded by 16 filters (5×5, stride 2×2)
- Stage 3: 32 filters per block (3×3), doubling from Stage 2
- “Same” padding is used, so spatial resolution reduction is confined to the stride-2 layers.
- Sequence Redistribution: The final convolutional output tensor [Time×Freq×Channels] is reshaped into a sequence along the temporal axis for recurrent processing.
- Recurrent (Bi-LSTM) Residual Blocks: Two residual blocks, each with two stacked bi-directional LSTM layers (64 units per direction), implemented with identity shortcuts. A final unidirectional LSTM (128 units) addresses causal requirements in real-time processing.
- LSTM gates for time step $t$ (standard formulation, with $\sigma$ the sigmoid and $\odot$ element-wise multiplication):
  $f_t = \sigma(W_f[h_{t-1}, x_t] + b_f)$, $i_t = \sigma(W_i[h_{t-1}, x_t] + b_i)$, $o_t = \sigma(W_o[h_{t-1}, x_t] + b_o)$,
  $\tilde{C}_t = \tanh(W_C[h_{t-1}, x_t] + b_C)$, $C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$, $h_t = o_t \odot \tanh(C_t)$
- Classification Layers: Two fully-connected layers: the first with 128 units (ReLU, dropout), followed by a 1-unit sigmoid layer outputting a detection probability for each time sample.
- Parameter Count: Approximately 256,000 trainable parameters.
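As a rough sanity check on this layout, the shape bookkeeping through the three stride-2 stages and the sequence redistribution can be sketched in plain Python (the 151×41 spectrogram dimensions here are illustrative assumptions, not values from the paper):

```python
import math

def same_conv_out(n, stride):
    # With "same" padding, a strided convolution outputs ceil(n / stride) samples.
    return math.ceil(n / stride)

def cred_feature_shape(t, f, stage_filters=(8, 16, 32)):
    """Trace the tensor shape through the three residual stages, then
    redistribute [Time x Freq x Channels] into a (Time, Freq*Channels)
    sequence for the Bi-LSTM blocks, as described above."""
    channels = 1
    for filters in stage_filters:
        # Each stage opens with a stride-2 down-sampling convolution;
        # the residual blocks inside the stage preserve spatial shape.
        t, f = same_conv_out(t, 2), same_conv_out(f, 2)
        channels = filters
    return t, f * channels  # (sequence length, per-step feature dimension)

print(cred_feature_shape(151, 41))  # assumed input: 151 time frames x 41 freq bins
```

Each stride-2 stage halves both axes, so only the redistribution step, not the residual blocks themselves, determines the sequence length seen by the recurrent layers.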
2. Training Methodology
CRED was trained on 500,000 labeled three-component seismograms of 30 seconds duration each, collected in Northern California (1987–2017), equally split between earthquakes (magnitude 0–5, with S/P picks from NCEDC) and noise traces. Noise traces encompassed both ambient and non-ambient noise, with data cleaning performed as in Mousavi & Langston (2017). Preprocessing consisted of detrending, mean normalization, band-pass filtering (1–45 Hz), resampling to 100 Hz, and STFT computation. Labels were binary vectors with ones from the P-arrival to P + 3d (where d = S − P), zeros elsewhere.
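The preprocessing chain described above can be sketched with SciPy for a single component (a minimal sketch; the STFT window length and overlap are illustrative assumptions, not the paper's exact settings):

```python
import numpy as np
from scipy import signal

def preprocess(trace, fs, target_fs=100.0):
    """One-component preprocessing: detrend, mean-normalize, band-pass
    1-45 Hz, resample to 100 Hz, and compute an STFT spectrogram."""
    x = signal.detrend(trace)                     # remove linear trend
    x = x - x.mean()                              # mean normalization
    sos = signal.butter(4, [1.0, 45.0], btype="bandpass", fs=fs, output="sos")
    x = signal.sosfiltfilt(sos, x)                # zero-phase 1-45 Hz band-pass
    n_target = int(round(len(x) * target_fs / fs))
    x = signal.resample(x, n_target)              # resample to 100 Hz
    # STFT window/overlap below are illustrative assumptions.
    _, _, z = signal.stft(x, fs=target_fs, nperseg=80, noverlap=60)
    return np.abs(z)                              # amplitude spectrogram [freq x time]

rng = np.random.default_rng(0)
spec = preprocess(rng.normal(size=30 * 200), fs=200.0)  # 30 s recorded at 200 Hz
print(spec.shape)
```

In practice each of the three components is processed this way and the resulting spectrograms are stacked as channels of the network input.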
The dataset was divided 80%/10%/10% into training, validation, and test partitions (50,000 traces in each of validation and test). The binary cross-entropy loss
$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$$
was minimized using the Adam optimizer (default settings: learning rate $10^{-3}$, $\beta_1 = 0.9$, $\beta_2 = 0.999$), with batch size 128 and up to 62 epochs (training halted after ~42 epochs once validation accuracy stopped improving). Regularization comprised dropout in the dense layers, Batch-Normalization in the convolutional blocks, and the decay implicit in BN.
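The per-sample objective is ordinary binary cross-entropy; a minimal NumPy version makes its behavior concrete (illustrative helper, not the paper's code):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over per-sample label/probability vectors."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))

y = np.array([1.0, 0.0, 1.0, 0.0])            # binary labels (event / no event)
good = binary_cross_entropy(y, np.array([0.99, 0.01, 0.95, 0.02]))
bad = binary_cross_entropy(y, np.array([0.10, 0.90, 0.20, 0.80]))
print(good < bad)  # confident correct predictions incur far lower loss
```

Because the labels are per-sample, the loss rewards localizing the event window in time, not just classifying the whole trace.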
3. Detection Performance and Benchmarking
Evaluation used threshold-based metrics:
- Precision $= \mathrm{TP}/(\mathrm{TP} + \mathrm{FP})$
- Recall $= \mathrm{TP}/(\mathrm{TP} + \mathrm{FN})$
- $F_1$-score $= 2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}/(\mathrm{Precision} + \mathrm{Recall})$
On a held-out Northern California test set (50,000 traces), CRED achieved precision ≈ 99.9%, recall ≈ 99.9%, and $F_1$ ≈ 0.999 at the reported detection threshold. At this operating point, the confusion matrix is:

| TP | FP | FN | TN |
|------|-----|-----|-------|
| 25226 | 24 | 26 | 24745 |

yielding a false-positive rate of ≈0.1% and a false-negative rate of ≈0.1%.
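The quoted rates follow directly from the confusion-matrix counts; a quick computation using the standard metric definitions:

```python
def detection_metrics(tp, fp, fn, tn):
    """Threshold-based detection metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)   # false-positive rate
    fnr = fn / (fn + tp)   # false-negative rate
    return precision, recall, f1, fpr, fnr

# Counts from the held-out Northern California test set above.
p, r, f1, fpr, fnr = detection_metrics(tp=25226, fp=24, fn=26, tn=24745)
print(round(p, 4), round(r, 4), round(f1, 4))
```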
Robustness was evaluated on semi-synthetic continuous data (500 real events and 500 Ricker wavelets embedded at 23 SNR levels up to 20 dB). Compared to STA/LTA and template matching:
- CRED: 100% TPR at higher SNRs and 0% FPR at all SNRs; TPR = 80% at 7 dB
- STA/LTA: TPR = 27% at 7 dB, with an FPR of 73% at high SNR
- Template matching: TPR = 3% at 7 dB and 82% at 20 dB, but with near-zero FPR
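For context, the STA/LTA baseline that CRED is benchmarked against can be sketched in a few lines (classic energy-ratio form; the window lengths and trigger threshold are typical illustrative choices, not the study's exact settings):

```python
import numpy as np

def sta_lta(x, fs, sta_win=0.5, lta_win=10.0):
    """Classic STA/LTA characteristic function: ratio of a short-term to a
    long-term trailing average of signal energy."""
    e = np.asarray(x, dtype=float) ** 2
    nsta, nlta = int(sta_win * fs), int(lta_win * fs)
    csum = np.concatenate(([0.0], np.cumsum(e)))
    ratio = np.zeros(len(e))
    for i in range(nlta - 1, len(e)):
        sta = (csum[i + 1] - csum[i + 1 - nsta]) / nsta
        lta = (csum[i + 1] - csum[i + 1 - nlta]) / nlta
        ratio[i] = sta / max(lta, 1e-12)
    return ratio

rng = np.random.default_rng(0)
fs = 100.0
trace = rng.normal(0.0, 1.0, int(60 * fs))            # 60 s of unit noise
trace[3000:3200] += 10.0 * rng.normal(0.0, 1.0, 200)  # impulsive "event" at t = 30 s
cf = sta_lta(trace, fs)
onset = int(np.argmax(cf > 4.0))                      # first threshold crossing
print(round(onset / fs, 2))
```

An event is declared at the first sample where the ratio exceeds the trigger threshold; the deep detector replaces this hand-crafted characteristic function with learned time-frequency features, which is what drives its advantage at low SNR.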
4. Application to Continuous Seismic Data
CRED was deployed on one month of continuous three-component data from station WHAR (Central Arkansas, August 2010, reflecting a hydraulic fracturing induced-seismicity sequence). Processing used a sliding window of 30 s (step 5 s): each window underwent STFT, CRED inference, and sample-wise probability output aggregation for event identification.
On a standard laptop (2.7 GHz Core i7, 16 GB RAM), the complete month (2 million windows) was processed in approximately 1 h 9 min (including STFT and forward pass). Batch inference on GPU hardware permits real-time or faster throughput, suggesting scalability for large-N seismic networks.
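The sliding-window scan can be sketched as follows (`model_fn` is a hypothetical stand-in for CRED inference, and averaging overlapping windows is one reasonable aggregation choice, not necessarily the paper's exact scheme):

```python
import numpy as np

def scan_continuous(trace, model_fn, fs=100.0, win_s=30.0, step_s=5.0):
    """Slide a 30 s window in 5 s steps over a continuous trace, run the
    detector on each window, and average the per-sample probabilities of
    overlapping windows into one continuous detection curve."""
    win, step = int(win_s * fs), int(step_s * fs)
    acc = np.zeros(len(trace))
    cnt = np.zeros(len(trace))
    for start in range(0, len(trace) - win + 1, step):
        p = model_fn(trace[start:start + win])   # per-sample probabilities
        acc[start:start + win] += p
        cnt[start:start + win] += 1
    return acc / np.maximum(cnt, 1)              # averaged detection curve

# Toy check with a dummy "model" that flags high-amplitude samples.
trace = np.zeros(int(120 * 100.0))               # 2 min of flat signal
trace[4000:4500] = 1.0                           # synthetic event
curve = scan_continuous(trace, lambda w: (np.abs(w) > 0.5).astype(float))
print(curve[4200], curve[0])
```

Runs of consecutive high-probability samples in the averaged curve are then grouped into discrete event detections.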
5. Comparative Analysis
Detection effectiveness was compared with STA/LTA, template matching, and FAST algorithms:
| Method | Detected Events (Aug 2010) | False Positives | Throughput |
|---|---|---|---|
| STA/LTA (ANSS) | ≈23 | High | Fast |
| Template Matching | ≈3732 | Low | Slow (long records) |
| FAST | 3266 | Not specified | Computationally expensive (~100× slower than CRED, non-parallel) |
| CRED | 1102 | 345 | 1 h 9 min/month (laptop); real-time (GPU) |
Of the 1102 CRED detections, 680 matched cataloged events, 77 were visually validated as new, and 345 were classified as false positives (precision ≈ 69%). Detected magnitudes extended down to $M_L$ −1.3 at distances of 1–3 km. The magnitude–frequency distribution demonstrates that CRED lowers the detection threshold relative to the template-matching and FAST approaches (Mousavi et al., 2018).
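The ≈69% precision figure follows directly from these counts:

```python
# Breakdown of the 1102 WHAR detections, per the counts above.
catalog_matches, new_validated, false_positives = 680, 77, 345
total = catalog_matches + new_validated + false_positives
precision = (catalog_matches + new_validated) / total
print(total, round(precision, 3))
```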
6. Generalization and Prospects
CRED, trained solely on Californian tectonic data (magnitudes 0–5, source distances ~50 km), effectively detected Arkansas microearthquakes (ML from −1.3 to 2.6, distances <5 km), even across substantial crustal differences and operational contexts. This suggests high model generalization and transferability, a notable result given the conventional challenges in cross-region seismic detection.
Potential future enhancements include:
- Retraining with more diverse, multi-region, and deeper datasets.
- Algorithmic “recursive learning” via incremental retraining incorporating newly-detected, validated events.
- Extension to multi-station array processing and integration of polarization features.
- Large-N, real-time deployment leveraging GPUs and edge computing.
- Combining CRED-based detection with phase picking (e.g., PhaseNet) and magnitude estimation for an end-to-end seismic analysis pipeline.
CRED’s architecture—merging deep residual CNNs for feature extraction and Bi-LSTM layers for modeling temporal dependencies—facilitates robust, accurate detection of low-amplitude seismic events with minimal false positives, outperforming STA/LTA in sensitivity and template matching in computational efficiency (Mousavi et al., 2018).