ML Approach for Earthquake Magnitude Estimation
- The paper introduces ML methods that integrate seismic catalogs and waveforms to achieve high-accuracy magnitude estimation, with residual standard deviations as low as 0.2 magnitude units.
- It details the use of deep architectures, including CNNs, LSTMs, GNNs, and transformers, to capture nonlinear seismic patterns and fuse multi-modal data.
- The study highlights practical applications in rapid earthquake warning systems while addressing challenges like magnitude completeness and data scarcity.
A machine-learning approach for earthquake magnitude estimation refers to the use of statistical or deep learning methods to infer earthquake magnitude—usually expressed as moment magnitude ($M_w$) or local magnitude ($M_L$)—from various data sources. Traditionally, magnitudes are estimated from direct analysis of seismic waveforms. Machine learning (ML) offers alternatives and enhancements by leveraging patterns in raw or derived seismic, geodetic, and remote sensing data, often capturing nonlinear relationships not addressed by standard physical models. Modern research demonstrates that ML-based models can supplement or surpass classical regression, catalog exploration, and physics-based inversion strategies when appropriately trained and validated.
1. Methodological Foundations
Machine learning for earthquake magnitude estimation employs a range of architectures, data representations, and learning objectives.
- Catalog-Based Regression: Random Forest (RF) regression models can predict macroscopic fault properties—including magnitude proxies—using time-binned features derived from earthquake catalogs. Features summarize counts and amplitudes above magnitude thresholds, capturing the statistical distribution of event sizes within a sliding or fixed window. Predictive performance depends critically on the magnitude of completeness ($M_c$), i.e., the minimum event size reliably cataloged. In laboratory settings, adding events below $M_c$ does not improve model accuracy, while omitting events above $M_c$ sharply degrades regression performance. Well-resolved laboratory catalogs supported accurate shear-stress and fault-state estimation (Lubbers et al., 2018).
- Supervised Deep Learning on Waveforms: Deep neural networks combining convolutional (CNN) and recurrent (LSTM/BiLSTM) layers can regress directly from raw multi-component seismograms to a continuous magnitude estimate. Notably, networks that do not normalize input amplitude preserve essential scaling information, and LSTM units' gating mechanisms provide robustness to unnormalized dynamic signals. Models demonstrate near-zero mean error and a standard deviation of 0.2 magnitude units when estimating magnitude from single-station waveforms without instrument correction (Mousavi et al., 2019); a minimal architectural sketch follows this list.
- Graph and Transformer Architectures: Transformer networks (TEAM-LM) aggregate features from variable sets of seismic stations using self-attention, positional encoding, and mixture density regression. Graph neural networks (GNNs) treat each seismic station and model parameter combination as a node in a product graph, passing messages to infer posterior estimates over magnitude and location. Both techniques excel in fusing distributed information, provide resilience against missing data, and support probabilistic magnitude estimation with uncertainty quantification (Münchmeyer et al., 2021, McBrearty et al., 2022).
- Feature-Based Shallow Regression: Gradient boosting decision tree ensembles, Extreme Learning Machines (ELM), and interpretable additive networks can leverage engineered features based on seismological laws (e.g., RTL features, Gutenberg–Richter statistics, energy scaling, recurrence times) for magnitude discrimination or regression. Feature selection via maximum-relevance, minimum-redundancy (mRMR) filtering yields non-redundant, physically meaningful predictors. Models achieve low root mean square error (RMSE 0.008–0.1) in robust regional applications (Proskura et al., 2019, Baveja et al., 2023).
- Satellite and Geodetic Data: ML models based on bitemporal Sentinel-1 imagery reformulate magnitude estimation as a joint regression and ranking (metric-learning) problem. Pairwise margin ranking losses supplement regression loss, improving discrimination between similar-magnitude events in low-data regimes and yielding a 30% reduction in Mean Absolute Error (MAE) compared to regression-alone baselines using modern transformer encoders (Cambrin et al., 25 Jul 2024). Deep CNNs and ResNet-like models have also been tailored to HR-GNSS (High-Rate Global Navigation Satellite System) displacement time series, with both single- and multi-station input tensors, delivering precise magnitude estimates even with unnormalized metric data (Cartaya et al., 2023, Quinteros-Cartaya et al., 26 Mar 2025).
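As a concrete illustration of the waveform-based approach referenced in the list above, the following PyTorch sketch pairs a small convolutional front end with an LSTM that regresses magnitude from unnormalized three-component traces. Inputs are assumed to be detrended and band-passed but not amplitude-normalized; layer sizes, kernel widths, and the 30 s window are illustrative assumptions, not the architecture published by Mousavi et al. (2019).

```python
import torch
import torch.nn as nn

class MagnitudeRegressor(nn.Module):
    """Illustrative CNN + LSTM magnitude regressor.

    Input: unnormalized 3-component waveforms of shape
    (batch, 3, n_samples); absolute amplitude is deliberately
    preserved because it carries magnitude information.
    """

    def __init__(self, n_filters: int = 32, hidden: int = 64):
        super().__init__()
        # Convolutional front end extracts local waveform features.
        self.conv = nn.Sequential(
            nn.Conv1d(3, n_filters, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
        )
        # LSTM aggregates the feature sequence; its gating helps cope
        # with the large dynamic range of unnormalized inputs.
        self.lstm = nn.LSTM(n_filters, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # continuous magnitude output

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.conv(x)                  # (batch, n_filters, T')
        z = z.transpose(1, 2)             # (batch, T', n_filters)
        _, (h, _) = self.lstm(z)          # final hidden state
        return self.head(h[-1]).squeeze(-1)

# Example: 30 s of 100 Hz three-component data per trace.
model = MagnitudeRegressor()
waveforms = torch.randn(8, 3, 3000)       # stand-in for real seismograms
magnitudes = model(waveforms)             # shape: (8,)
```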
2. Key Input Data, Feature Engineering, and Preprocessing
Seismic Catalogs and Waveforms
- Catalog Features: Statistical summaries (counts and cumulative amplitudes above a grid of magnitude thresholds) over fixed time windows are the typical inputs for RF and boosting architectures; a feature-construction sketch follows this list.
- Windowing choices and threshold discretization affect the feature set size (e.g., 35–320 dimensions), and the completeness threshold ($M_c$) directly constrains predictive power (Lubbers et al., 2018).
- Waveform-Based Input: For deep learning, the input is typically three-channel, preprocessed (detrended, filtered, but not amplitude-normalized) waveform data, windowed using fixed or dynamically-adapted intervals post-P-arrival or post-detection. In ResNet and CNN designs for HR-GNSS, input tensors are shaped as (stations × time samples × components), with zero-padding as required (Tsuchimochi, 13 Oct 2024).
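The catalog-feature construction described above, feeding the Random Forest regression of Section 1, can be sketched as follows. The threshold grid, window length, the energy-like proxy $10^{1.5M}$, and the synthetic target are illustrative assumptions, not the published configuration of Lubbers et al. (2018).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def catalog_features(times, mags, t_edges, thresholds):
    """Per time window: event counts and cumulative energy-like sums
    above each magnitude threshold, the style of feature described
    above."""
    n_thr = len(thresholds)
    feats = np.zeros((len(t_edges) - 1, 2 * n_thr))
    for i in range(len(t_edges) - 1):
        in_win = (times >= t_edges[i]) & (times < t_edges[i + 1])
        for j, thr in enumerate(thresholds):
            sel = in_win & (mags >= thr)
            feats[i, j] = sel.sum()                           # counts
            feats[i, n_thr + j] = (10.0 ** (1.5 * mags[sel])).sum()
    return feats

# Toy demonstration on a synthetic catalog.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0, 1000, 20000))
mags = rng.exponential(0.4, 20000) - 2.0       # roughly G-R-like sizes
t_edges = np.arange(0, 1001, 10)
X = catalog_features(times, mags, t_edges, [-2.0, -1.5, -1.0, -0.5])
y = np.sin(t_edges[:-1] / 100.0)   # stand-in target (e.g., fault state)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
```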
Satellite and Geodetic Data
- Remote Sensing: Features are extracted from consecutive satellite images (e.g., Sentinel-1), either directly or through temporal differencing, and processed using CNN or transformer encoders. The regression output is the estimated magnitude, and ranking loss is applied to pairs to enforce correct ordering of predicted magnitudes (Cambrin et al., 25 Jul 2024); a sketch of this combined loss follows this list.
- GNSS: Displacement time series (up to 390 seconds post-event) from multiple stations, possibly at varying distances from the epicenter, provide the basis for joint event detection and magnitude estimation workflows, supporting both rapid (real-time) and retrospective analyses (Quinteros-Cartaya et al., 26 Mar 2025).
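A minimal sketch of the joint regression-plus-ranking objective referenced above, using PyTorch's MarginRankingLoss. The batch-shift pairing scheme, the margin, and the weight alpha are illustrative assumptions rather than the published configuration.

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()
# target = +1 means the first input should be ranked above the second.
ranking = nn.MarginRankingLoss(margin=0.1)

def joint_loss(pred, target, alpha=0.5):
    """Regression loss plus a pairwise margin ranking term."""
    # Form pairs by shifting the batch by one (a simple pairing choice).
    p1, p2 = pred[:-1], pred[1:]
    t1, t2 = target[:-1], target[1:]
    order = torch.sign(t1 - t2)          # +1, -1, or 0 per pair
    keep = order != 0                    # drop equal-magnitude pairs
    rank_term = ranking(p1[keep], p2[keep], order[keep])
    return mse(pred, target) + alpha * rank_term

pred = torch.randn(16, requires_grad=True)
target = torch.rand(16) * 4 + 4          # magnitudes in [4, 8)
loss = joint_loss(pred, target)
loss.backward()
```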
Physics-Informed and Interpretable Features
- Seismological Laws: Inputs may include parameters and summary statistics derived from classical relations (see the sketch after this list):
- Gutenberg–Richter Law: $\log_{10} N(M) = a - bM$, where $N(M)$ counts events of magnitude at least $M$
- Rupture Energy: $\log_{10} E = 1.5M + 4.8$, with $E$ in joules
- Recurrence times and deviation from expected scaling (used in ELM frameworks) (Baveja et al., 2023)
- RTL Features: the product $RTL(x, t) = R(x, t)\,T(x, t)\,L(x, t)$ of distance-, time-, and rupture-length-weighted terms (Proskura et al., 2019)
- Interpretable additive neural networks isolate contributions from physically meaningful pathways (e.g., magnitude scaling, attenuation, site response) (Sreenath et al., 26 Aug 2025).
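As referenced in the list above, here is a minimal sketch of two such physics-informed features: the maximum-likelihood (Aki) b-value estimator for the Gutenberg–Richter law and the energy-magnitude relation. The 0.1 magnitude bin width and the synthetic catalog are illustrative assumptions.

```python
import numpy as np

def gutenberg_richter_b(mags, m_c, delta_m=0.1):
    """Maximum-likelihood b-value (Aki's estimator) for events at or
    above the completeness magnitude m_c; delta_m is the catalog's
    magnitude binning (0.1 assumed here)."""
    m = mags[mags >= m_c]
    return np.log10(np.e) / (m.mean() - (m_c - delta_m / 2))

def radiated_energy_joules(m):
    """Gutenberg-Richter energy-magnitude relation, E in joules."""
    return 10.0 ** (1.5 * m + 4.8)

mags = np.random.exponential(0.4, 5000) + 2.0   # synthetic catalog
b = gutenberg_richter_b(mags, m_c=2.5)
E = radiated_energy_joules(6.0)                  # ~6e13 J for M 6
```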
3. Model Performance and Validation
Model performance is consistently benchmarked using regression metrics (e.g., $R^2$, RMSE, MAE) or, for classifiers, accuracy, precision, recall, and F1-score.
Model/Study | Data Type | Input Modality | RMSE/MAE | Other Key Metrics
---|---|---|---|---
RF/catalog (Lubbers et al., 2018) | Lab AE catalog | Counts, amplitudes above thresholds | — | Ablation: sharp performance drop when events above $M_c$ are omitted
Deep CNN+LSTM (Mousavi et al., 2019) | Waveform | 3-ch 30 s traces, no normalization | 0.2 (std) | Near-zero mean error
CNN/ResNet (Quinteros-Cartaya et al., 26 Mar 2025) | HR-GNSS | Multi-station displacement tensor | 0.05–0.07 (MAE) | Accuracy scales with window length
Metric Learning (Cambrin et al., 25 Jul 2024) | Sentinel-1 | Bi-temporal imagery | 30% MAE reduction vs. regression-only baseline | Margin ranking loss
Additive+HazBinLoss (Sreenath et al., 26 Aug 2025) | Strong-motion | Engineered seismological features | MSE/MAE competitive with traditional GMMs | Improved high-hazard bins
The best models consistently outperform traditional regression or threshold-based methods, particularly in low-data, high-magnitude, or rapid-response scenarios. The use of model uncertainty quantification (e.g., through mixture density outputs or ensemble calibration) is increasingly recommended for deployment in real-time or decision-critical applications (Münchmeyer et al., 2021).
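A minimal sketch of a Gaussian mixture density output head of the kind recommended here for uncertainty quantification; the feature dimension and number of mixture components are illustrative assumptions, not the TEAM-LM configuration.

```python
import torch
import torch.nn as nn

class MixtureDensityHead(nn.Module):
    """Gaussian mixture output head for probabilistic magnitude
    estimation: predicts mixture weights, means, and scales."""

    def __init__(self, in_dim: int, n_components: int = 5):
        super().__init__()
        self.params = nn.Linear(in_dim, 3 * n_components)

    def forward(self, h):
        logits, mu, log_sigma = self.params(h).chunk(3, dim=-1)
        return logits.log_softmax(-1), mu, log_sigma.exp()

def mdn_nll(log_w, mu, sigma, y):
    """Negative log-likelihood of y under the predicted mixture."""
    dist = torch.distributions.Normal(mu, sigma)
    log_p = dist.log_prob(y.unsqueeze(-1))        # per-component
    return -torch.logsumexp(log_w + log_p, dim=-1).mean()

head = MixtureDensityHead(in_dim=64)
features = torch.randn(8, 64)                     # e.g., station embeddings
log_w, mu, sigma = head(features)
loss = mdn_nll(log_w, mu, sigma, torch.rand(8) * 4 + 4)
```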
4. Practical Applications and Limitations
Rapid Earthquake Early Warning and Monitoring
Machine learning magnitude estimators are increasingly integrated into real-time pipelines, e.g., using CNN-based earthquake detection (DetEQ) followed by magnitude regression from HR-GNSS (MagEs), with output delivered within seconds of detection (Quinteros-Cartaya et al., 26 Mar 2025). Such systems support immediate alerting, rapid initial source characterization, and detailed post-event cataloging (e.g., QuakeFlow with PhaseNet and GaMMA) (Zhu et al., 2022).
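A structural sketch of such a detection-then-regression pipeline follows. The stand-in detector and regressor are deliberately trivial placeholders, not the DetEQ/MagEs implementations; only the two-stage flow reflects the description above.

```python
import numpy as np

def detect_event(window: np.ndarray, threshold: float = 0.05) -> bool:
    """Stage 1: flag an event when peak displacement exceeds a
    threshold (a placeholder for a trained CNN detector)."""
    return np.abs(window).max() > threshold

def estimate_magnitude(window: np.ndarray) -> float:
    """Stage 2: placeholder regressor; a trained CNN/ResNet on
    multi-station displacement tensors would go here."""
    return 5.0 + 2.0 * np.log10(np.abs(window).max() / 0.05)

def process_stream(stream: np.ndarray, fs: float = 1.0, win_s: int = 30):
    """Slide a growing window over the stream; once detection fires,
    emit updated magnitude estimates as more data accumulate."""
    step = int(win_s * fs)
    for end in range(step, stream.size + 1, step):
        window = stream[:end]
        if detect_event(window):
            yield end / fs, estimate_magnitude(window)

# Toy stream: 60 s of quiet followed by 120 s of ground motion.
stream = np.concatenate([np.zeros(60), 0.1 * np.random.randn(120)])
for t, mag in process_stream(stream):
    print(f"t={t:.0f}s: running estimate M {mag:.2f}")
```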
Magnitude Completeness and Real-World Constraints
ML estimators relying on catalog features are fundamentally limited by the magnitude of completeness: omitting small events degrades model fidelity by removing statistical signals necessary for robust magnitude (or fault state) inference (Lubbers et al., 2018). Field conditions add noise, limit bandwidth, and reduce event detectability compared to laboratory setups, requiring careful completeness analysis and transfer learning for field application.
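A minimal sketch of the completeness analysis this implies, using the maximum-curvature method, which takes the mode of the non-cumulative frequency-magnitude distribution as $M_c$. The +0.2 correction is a common heuristic, not a universal constant, and the simulated detection roll-off is an illustrative assumption.

```python
import numpy as np

def mc_max_curvature(mags, bin_width=0.1, correction=0.2):
    """Estimate the magnitude of completeness as the modal bin of the
    non-cumulative frequency-magnitude distribution, plus a small
    empirical correction."""
    edges = np.arange(mags.min(), mags.max() + bin_width, bin_width)
    counts, _ = np.histogram(mags, bins=edges)
    return edges[np.argmax(counts)] + correction

# Synthetic catalog with incomplete detection of small events.
rng = np.random.default_rng(1)
true_mags = rng.exponential(0.4, 20000) + 1.0
p_detect = 1 / (1 + np.exp(-(true_mags - 2.0) / 0.15))  # roll-off near M 2
mags = true_mags[rng.random(true_mags.size) < p_detect]

m_c = mc_max_curvature(mags)
complete = mags[mags >= m_c]   # events usable for catalog-based features
```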
Transferability and Generalization
Performance varies across tectonic regimes and sensor networks. Models trained in one region (e.g., the Chile subduction zone) display robust performance on data from comparable regions but may degrade for non-subduction settings. A plausible implication is that regional transfer learning or domain-adaptive fine-tuning is generally necessary for operational deployment (Quinteros-Cartaya et al., 26 Mar 2025, Cartaya et al., 2023).
Interpretability and Risk Assessment
Advances in interpretable ML for ground motion and magnitude estimation (e.g., additive models with independent pathways and monotonicity constraints) offer transparent, physics-conforming predictions required for risk assessment and disaster planning. The introduction of hazard-informed loss functions (HazBinLoss) specifically targets the traditionally underrepresented large-magnitude, near-field events, enhancing model reliability for critical engineering decisions (Sreenath et al., 26 Aug 2025).
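A plausible form of a hazard-bin weighted loss in the spirit of HazBinLoss is sketched below: records in sparsely populated, high-hazard bins (large magnitude, short distance) receive larger weights. The bin edges and the inverse-count weighting are illustrative assumptions, not the published formulation (Sreenath et al., 26 Aug 2025).

```python
import torch

def hazard_weighted_mse(pred, target, mag, dist,
                        mag_edges=(4.0, 5.5, 6.5, 9.5),
                        dist_edges=(0.0, 30.0, 100.0, 300.0)):
    """MSE weighted by the inverse frequency of each record's
    (magnitude, distance) bin, up-weighting rare high-hazard records."""
    m_bin = torch.bucketize(mag, torch.tensor(mag_edges))
    d_bin = torch.bucketize(dist, torch.tensor(dist_edges))
    bin_id = m_bin * 10 + d_bin          # joint bin identifier
    _, inverse, counts = torch.unique(bin_id, return_inverse=True,
                                      return_counts=True)
    weights = 1.0 / counts[inverse].float()
    weights = weights / weights.mean()   # keep the loss scale stable
    return (weights * (pred - target) ** 2).mean()

pred = torch.randn(32, requires_grad=True)
target = torch.rand(32) * 4 + 4          # magnitudes in [4, 8)
mag, dist = target, torch.rand(32) * 300
loss = hazard_weighted_mse(pred, target, mag, dist)
```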
5. Open Challenges and Future Directions
Key challenges and anticipated developments include:
- Handling Data Scarcity: Low-data regimes, especially for large events, motivate the incorporation of regularization (e.g., ranking loss, data augmentation, synthetic training) and advanced architectures (self-attention, hybrid CNN-transformers) to maximize discriminative power (Cambrin et al., 25 Jul 2024).
- Catalog and Sensor Completeness: Achieving and maintaining low magnitude of completeness in natural catalogs remains a bottleneck for catalog-based ML models. Improved instrumentation and event detection may partly address this.
- Uncertainty Quantification: Widespread adoption of probabilistic output layers (e.g., mixture density networks) and rigorous calibration is recommended for operational deployment, especially for rare and high-impact scenarios (Münchmeyer et al., 2021).
- Spatial and Multimodal Integration: Progress is expected in architectures that natively handle variable sensor arrays, spatially-distributed inputs, and the fusion of geodetic (GNSS), seismic, and remote sensing data.
- Physics-Informed/Hybrid Learning: Embedding physical constraints—travel time consistency, monotonic path attenuation, explicit seismological priors—may improve generalization and provide physically plausible outputs, mitigating black-box behaviors and "fingerprinting" artifacts.
- Real-Time Deployment: Comprehensive, cloud-optimized pipelines (with horizontal scaling and auto-orchestration) sustain throughput compatible with high-volume continuous data and deliver low-latency magnitude estimates at scale (Zhu et al., 2022).
6. Interpretability, Transparency, and Trust
Recent research emphasizes inherently interpretable ML models with additive, explicitly decomposable pathways, in contrast to black-box neural networks. Models leveraging the HazBinLoss demonstrate that weighting high-hazard, low-frequency records during training significantly mitigates underestimation of large, damaging earthquake magnitudes in risk assessment studies while maintaining competitive performance with established ground motion models (Sreenath et al., 26 Aug 2025). Full decomposability and monotonicity constraints assure that each feature's effect on the final magnitude estimate remains transparent, fostering trust among engineering and decision-making communities.
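A minimal sketch of this additive, decomposable design: each input feature flows through its own small pathway, and the final estimate is the sum of per-pathway contributions, which can be inspected directly. Pathway sizes are illustrative assumptions; monotonicity (e.g., attenuation with distance) could additionally be enforced by sign-constraining pathway weights, which is omitted here for brevity.

```python
import torch
import torch.nn as nn

class AdditivePathways(nn.Module):
    """Inherently interpretable additive model: one subnetwork per
    feature, outputs summed, so each feature's contribution to the
    magnitude estimate remains transparent."""

    def __init__(self, n_features: int, hidden: int = 16):
        super().__init__()
        self.pathways = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.Tanh(),
                          nn.Linear(hidden, 1))
            for _ in range(n_features)
        )

    def forward(self, x):
        # Per-feature contributions; summing them gives the prediction.
        contribs = [p(x[:, i:i + 1]) for i, p in enumerate(self.pathways)]
        return torch.cat(contribs, dim=-1)       # (batch, n_features)

model = AdditivePathways(n_features=3)
x = torch.randn(8, 3)                 # e.g., magnitude, log distance, site term
contribs = model(x)
prediction = contribs.sum(-1)         # total estimate
# Each column of `contribs` is one pathway's transparent contribution.
```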
Machine learning techniques for earthquake magnitude estimation increasingly blend advanced statistical architectures with physically-informed model design, allowing flexible, accurate, and interpretable inference across a range of data modalities. The current frontier involves tackling the challenges of data imbalance, generalization across regions, uncertainty quantification, and interpretability, with ongoing progress toward robust, real-time monitoring systems for seismic hazard mitigation.