
Phase Neural Operator (PhaseNO)

Updated 22 December 2025
  • Phase Neural Operator (PhaseNO) is a neural-operator framework that learns mappings between seismic waveforms and phase probabilities across sensor networks.
  • It employs a U-shaped architecture combining Fourier neural operator layers for global temporal convolution with graph neural operator layers for spatial message-passing.
  • Evaluations show PhaseNO achieves higher recall and detection rates than traditional methods, with effective adaptation in microseismic and low-SNR scenarios.

The Phase Neural Operator (PhaseNO) is a neural-operator-based framework designed to learn mappings between function spaces in seismic data analysis, most notably for multi-station phase picking of P- and S-wave arrivals. PhaseNO generalizes deep neural network approaches to the operator regime, enabling simultaneous analysis across entire sensor networks and robust adaptation to new geometries, noise regimes, and task-specific conventions (Sun et al., 2023, Kong et al., 17 Oct 2025, Abdullin et al., 15 Dec 2025).

1. Mathematical Formulation

PhaseNO models phase picking as an operator learning problem. For seismic networks, let $\Omega \subset \mathbb{R}^2$ denote the horizontal geographic domain containing $N$ stations at locations $x_i$, $i = 1, \ldots, N$, and $T = [0, T_{\max}]$ the time window. The operator targets the mapping

$$\mathcal{G}: u(x, t) \mapsto v(x, t) \in \mathbb{R}^2$$

where $u(x_i, t) \in \mathbb{R}^3$ represents the 3-component (Z, N, E) seismogram at station $i$, and $v(x_i, t) = [p_P(i, t), p_S(i, t)]$ are the instantaneous sample-level probabilities of P- and S-phase arrivals.

PhaseNO approximates G\mathcal{G} via a composition of:

  • Temporal Fourier Neural Operator (FNO) layers, each performing global convolution in the time domain with low-mode complex-valued learnable kernels $R^\ell(\xi)$, local linear channel mixing $W^\ell$, and a ReLU (or GELU) nonlinearity. The FNO update at each layer is

$$a^{\ell+1}(x, t) = \sigma \left[ W^\ell a^\ell(x, t) + \mathcal{F}^{-1}_\xi \left[ R^\ell(\xi) \circ \mathcal{F}_t \left( a^\ell(x, t) \right) \right] \right]$$

  • A Graph Neural Operator (GNO) layer acting on the station graph $G = (V, E)$, with node features $h_i(t) \in \mathbb{R}^d$ and edge features based on pairwise station distances, propagates spatial context as

$$h'_i(t) = \sigma \left[ W^0 h_i(t) + \sum_{j \in \mathcal{N}(i)} K(\|x_i - x_j\|) \cdot h_j(t) \right]$$

where $K(\cdot)$ is a learnable MLP kernel parameterized by inter-station distance.

A final pointwise MLP head projects $h'_i(t)$ to $v_i(t)$, assigning P, S, and noise probabilities via softmax normalization. The operator thus leverages global temporal correlations and inter-station coherence for robust phase detection; a minimal sketch of both layer types follows.
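
A minimal PyTorch sketch of the two layer types, in the form defined above; the channel counts, initialization scale, and kernel-MLP width are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn


class SpectralConv1d(nn.Module):
    """FNO layer: global temporal convolution via a truncated Fourier series."""

    def __init__(self, channels: int, n_modes: int):
        super().__init__()
        scale = 1.0 / channels
        # Complex-valued learnable kernel R(xi), kept on the lowest n_modes modes.
        self.weights = nn.Parameter(
            scale * torch.randn(channels, channels, n_modes, dtype=torch.cfloat)
        )
        self.w_local = nn.Conv1d(channels, channels, kernel_size=1)  # W branch
        self.n_modes = n_modes

    def forward(self, a: torch.Tensor) -> torch.Tensor:
        # a: (batch, channels, time)
        a_hat = torch.fft.rfft(a, dim=-1)
        out_hat = torch.zeros_like(a_hat)
        m = min(self.n_modes, a_hat.shape[-1])
        out_hat[..., :m] = torch.einsum(
            "bim,iom->bom", a_hat[..., :m], self.weights[..., :m]
        )
        spectral = torch.fft.irfft(out_hat, n=a.shape[-1], dim=-1)
        return torch.relu(self.w_local(a) + spectral)


class GNOLayer(nn.Module):
    """GNO layer: message passing weighted by a distance-kernel MLP K(.)."""

    def __init__(self, channels: int):
        super().__init__()
        self.w0 = nn.Linear(channels, channels)
        # K(.): scalar inter-station distance -> per-channel kernel weights.
        self.kernel = nn.Sequential(
            nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, channels)
        )

    def forward(self, h: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # h: (stations, channels, time); coords: (stations, 2), scaled to [0, 1]
        dist = torch.cdist(coords, coords)            # (N, N) pairwise distances
        k = self.kernel(dist.unsqueeze(-1))           # (N, N, channels)
        # Sum neighbor features over the (here fully connected) station graph.
        msgs = torch.einsum("ijc,jct->ict", k, h) / h.shape[0]
        local = self.w0(h.transpose(1, 2)).transpose(1, 2)
        return torch.relu(local + msgs)
```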

2. Network Architecture and Implementation

PhaseNO is structured as a U-shaped neural operator network with alternating temporal and spatial blocks (Sun et al., 2023, Abdullin et al., 15 Dec 2025). Key components:

  • Input Encoding: Each station's waveform (typically 120 s at 100 Hz, i.e., 12,000 samples for seismic and 3,000 for microseismic applications) is lifted to $d$ channels via a 1×1 convolution or linear MLP.
  • FNO Stack: 4–7 layers (seismic: 4; microseismic: 6; up to 7 in some regional benchmarks) perform global convolutions in the Fourier (time) domain, typically retaining only the lowest $F$ Fourier modes, complemented by local channel-mixing linear branches.
  • GNO Layer: One or several graph message-passing blocks, using fully connected or kNN graphs, with edge-feature-dependent weights parameterized by small MLPs.
  • Projection Head: Two-layer MLP (channels $d \to d \to 2$ or $3$), softmaxed for multi-class output (P, S, noise).
  • Residual Connections and Skip Concatenation: Each FNO/GNO block is residualized; skip connections concatenate features for U-Net–like context aggregation.
  • Multi-/Single-Station Modes: All weights are shared in multi-station mode; the PhaseNO₁ variant operates station-wise.

Parameter counts are typically in the 2–3 million range, adjustable by application (regional vs. local network).
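
Reusing the layer sketches from Section 1, a skeletal assembly consistent with this description might look as follows; the layer counts, channel width, and head layout are assumptions, and the last line shows how a parameter count of this order arises.

```python
class PhaseNOSketch(nn.Module):
    """Illustrative assembly: lift -> residual FNO stack -> GNO -> head."""

    def __init__(self, d: int = 64, n_fno: int = 4, n_modes: int = 64):
        super().__init__()
        self.lift = nn.Conv1d(3, d, kernel_size=1)  # 3-component input -> d
        self.fno = nn.ModuleList(SpectralConv1d(d, n_modes) for _ in range(n_fno))
        self.gno = GNOLayer(d)
        self.head = nn.Sequential(                  # channels d -> d -> 3
            nn.Conv1d(d, d, 1), nn.ReLU(), nn.Conv1d(d, 3, 1)
        )

    def forward(self, x: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # x: (stations, 3, time); stations act as the batch dimension.
        h = self.lift(x)
        for layer in self.fno:
            h = h + layer(h)          # residualized temporal blocks
        h = h + self.gno(h, coords)   # residualized spatial mixing
        return self.head(h).softmax(dim=1)  # P / S / noise probabilities


model = PhaseNOSketch()
print(sum(p.numel() for p in model.parameters()))  # order 10^6, cf. 2-3 M above
```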

3. Training Methodology and Loss Functions

Supervised training uses expert-labeled picks, formatted as either one-hot labels or smoothed triangular probability densities. Loss functions include:

  • Binary/Multiclass Cross-Entropy: For sample-level P/S/noise probability targets,

$$\ell_{CE}(\hat{y}, v) = -\sum_{c \in \{P, S, N\}} \hat{y}_c \log p_c$$

summed across all stations and times.

  • Regularization: $L^2$ penalty on all parameters, with weight decay $\lambda$.
  • Optimization: Adam or AdamW, standard learning rate schedules and batch sizes. Early stopping and ReduceLROnPlateau strategies are implemented for transfer learning.
  • Label Extraction: Pick times after inference are the local maxima in $p_P$ or $p_S$ above a threshold, typically 0.5 for seismic applications and tuned for highest F1 (see the sketch following this list).
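
A sketch of the loss and pick-extraction steps under these conventions, assuming PyTorch and SciPy; `model` refers to the skeleton sketched in Section 2, and the hyperparameter values are placeholders.

```python
import torch
from scipy.signal import find_peaks

def ce_loss(probs: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # probs, targets: (stations, 3, time), classes (P, S, N);
    # cross-entropy summed over all stations and time samples.
    return -(targets * torch.log(probs.clamp_min(1e-8))).sum()

# AdamW supplies the L2 weight-decay term directly; the scheduler mirrors
# the ReduceLROnPlateau strategy mentioned above.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)

def extract_picks(p_phase, threshold=0.5, min_gap=100):
    # Picks = local maxima of the per-sample P (or S) probability trace
    # above threshold; min_gap in samples (1 s at 100 Hz) separates peaks.
    peaks, _ = find_peaks(p_phase, height=threshold, distance=min_gap)
    return peaks  # sample indices of picks
```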

For microseismic adaptation (MicroPhaseNO), transfer learning fine-tunes all blocks using a small calibration set (e.g., 200 traces), realigning the model's output to campaign-specific labeling conventions (peak/trough vs. onset picks) and removing systematic timing bias (Abdullin et al., 15 Dec 2025).
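A minimal fine-tuning loop matching this recipe might look as follows; the checkpoint path, calibration loader, and epoch budget are hypothetical, and `model`/`ce_loss` come from the sketches above.

```python
# Fine-tune all blocks on a small campaign calibration set (~200 traces).
model.load_state_dict(torch.load("phaseno_pretrained.pt"))  # hypothetical path
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)

for epoch in range(50):  # small budget; early stopping would cap this
    for x, coords, targets in calibration_loader:  # hypothetical DataLoader
        optimizer.zero_grad()
        loss = ce_loss(model(x, coords), targets)
        loss.backward()
        optimizer.step()
```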

4. Data Preprocessing and Input Representation

  • Windowing: Fixed-duration windows are extracted around the event origin, with a 100 Hz sampling rate.
  • Normalization: Channel-wise standardization using global statistics from the training set.
  • Station Coordinates: Mapped to $[0, 1]$ and used as graph attributes for adjacency computation (but not explicitly concatenated to the features).
  • Label Encoding: One-hot for phase classification; smoothed triangles for regression-based variants.
  • Data Augmentation: Stacking multiple events in a window, adding virtual noise stations, random temporal shifts and amplitude scaling, and subsampling stations to encourage generalization under varying network geometry. A sketch of the normalization, coordinate, and label steps follows this list.
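
A NumPy sketch of several of these steps; the global statistics, bounding box, and label half-width are illustrative stand-ins for the training-set values.

```python
import numpy as np

def standardize(waveform, mean, std):
    # waveform: (3, time) at 100 Hz; channel-wise standardization with
    # global per-channel statistics computed on the training set.
    return (waveform - mean[:, None]) / std[:, None]

def scale_coords(xy, lo, hi):
    # Map station coordinates onto [0, 1] over the network's bounding box.
    return (xy - lo) / (hi - lo)

def triangular_label(n_samples, pick_idx, half_width=20):
    # Smoothed triangular probability density centered on the labeled pick.
    t = np.arange(n_samples)
    return np.clip(1.0 - np.abs(t - pick_idx) / half_width, 0.0, 1.0)
```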

5. Experimental Evaluation and Comparative Analysis

PhaseNO has been benchmarked against PhaseNet [Zhu et al. 2019] and EQTransformer [Mousavi et al. 2020] on both regional and local seismic networks (Sun et al., 2023, Kong et al., 17 Oct 2025, Abdullin et al., 15 Dec 2025).

Phase Picking Metrics (Local Network Test 1, $M < 1$)

| Model | Detected Picks | P-Precision | P-Recall | P-$f_1$ | S-Precision | S-Recall | S-$f_1$ |
|---|---|---|---|---|---|---|---|
| PhaseNO | 26,026 | 0.48 | 0.78 | 0.59 | 0.66 | 0.77 | 0.71 |
| PhaseNet | 12,982 | – | – | 0.63 | – | – | 0.66 |
| EQTransformer | 5,560 | – | – | 0.43 | – | – | 0.38 |

PhaseNO consistently exhibits higher recall and event detection rates, particularly on low-SNR events, and recovers up to 2–3× more weak arrivals than single-station networks. Association metrics (event-matched within $\Delta T = 1$ s) yield $f_1 \sim 0.60$–$0.79$ for PhaseNO, outperforming PhaseNet on 3 of 4 local datasets.
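
The event-matched scores above can be reproduced with a tolerance-based matcher; the greedy nearest-match rule below is an assumption about the exact matching procedure.

```python
def match_f1(pred_times, ref_times, dt=1.0):
    # Greedily match each predicted time to the closest unused reference
    # time within dt seconds, then compute precision, recall, and f1.
    used = [False] * len(ref_times)
    tp = 0
    for p in sorted(pred_times):
        best, best_err = None, dt
        for i, r in enumerate(ref_times):
            if not used[i] and abs(p - r) <= best_err:
                best, best_err = i, abs(p - r)
        if best is not None:
            used[best] = True
            tp += 1
    precision = tp / max(len(pred_times), 1)
    recall = tp / max(len(ref_times), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return precision, recall, f1
```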

Performance in Microseismic Adaptation

MicroPhaseNO, after transfer learning with a minimal campaign calibration set, increases F1 and accuracy by 10–30% and reduces pick uncertainty and bias by $\sim 3\times$ relative to both the original PhaseNO and conventional STA/LTA–AIC workflows (Abdullin et al., 15 Dec 2025).

| Model | Precision | Recall | F1 | ACC | Bias (s) | $\sigma_{pick}$ (s) |
|---|---|---|---|---|---|---|
| PhaseNO (original) | 0.75 | 0.88 | 0.80 | 0.82 | -0.12 | 0.45 |
| MicroPhaseNO | 0.88 | 0.94 | 0.90 | 0.91 | -0.02 | 0.15 |
| STA/LTA+AIC | 0.62 | 0.78 | 0.69 | 0.74 | +0.05 | 0.60 |

False positive rates are higher but predominantly at SNR < 10 dB; manual review attributes 40–80% of "new" events to real uncataloged seismicity. Adding stations rapidly increases precision and recall, with diminishing returns beyond 4 nodes.

6. Generalizations and Broader Applications

PhaseNO's neural operator paradigm has been extended beyond seismology. In S-matrix phase reconstruction for $2 \to 2$ scattering, PhaseNO (as an FNO) maps the amplitude modulus $B(z)$ to the phase $\phi(z)$, learning hidden integral constraints (unitarity) from sample-based training without exposure to the governing equations. The discretization-invariance property enables evaluation at arbitrary collocation grids ("zero-shot super-resolution") (Niarchos et al., 2024).
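
Discretization invariance can be checked directly: a spectral layer (reusing SpectralConv1d from the Section 1 sketch) applies unchanged to a finer collocation grid, a toy version of the zero-shot super-resolution property.

```python
import torch

layer = SpectralConv1d(channels=16, n_modes=8)
coarse = torch.randn(1, 16, 128)  # training-resolution grid
fine = torch.randn(1, 16, 512)    # 4x finer collocation grid
print(layer(coarse).shape, layer(fine).shape)  # same layer, both resolutions
```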

A two-headed output enables both regression of $\sin\phi(z)$ and estimation of a fidelity index for physical admissibility, applicable as a joint regression-classification framework. Ensemble averaging sharpens admissible boundaries in parameter space and reduces stochastic uncertainty.

7. Limitations, Future Work, and Implications

PhaseNO's computational cost scales quadratically with the number of nodes due to message passing on fully connected graphs. For large arrays, sensor chunking or sparse adjacency constructions are recommended (Sun et al., 2023, Kong et al., 17 Oct 2025). Pointwise uncertainty quantification via standard deviation or ensemble spread is feasible.
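
A sparse kNN adjacency of the kind recommended here can be built in a few lines, assuming SciPy; the neighbor count k is illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_edges(coords, k=8):
    # coords: (N, 2) station positions; returns (src, dst) index arrays for
    # directed edges from each station to its k nearest neighbors.
    tree = cKDTree(coords)
    _, idx = tree.query(coords, k=k + 1)  # nearest neighbor is the node itself
    src = np.repeat(np.arange(len(coords)), k)
    dst = idx[:, 1:].ravel()
    return src, dst
```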

Current generalization fails for finite expansions beyond the training cutoff (e.g., partial-wave $L > 3$ in a QFT context), and ambiguous-phase recovery remains dataset-dependent. Hybrid schemes integrating physics-informed losses may yield higher fidelity to integral constraints and sharper phase-admissible boundaries.

This suggests the neural operator approach embodied in PhaseNO may have broad utility for inverse problems constraining function-to-function mappings by nonlocal, implicit equations—across seismology, quantum field theory, and other domains. The campaign-adaptive transfer learning strategy indicates rapid deployment potential in microseismic monitoring with minimal calibration effort.

