- The paper introduces a meteorology-informed deep learning framework that fuses diverse environmental inputs to enhance convective radar nowcasting.
- It employs multi-scale attention and a dynamic loss function to better capture severe echo intensities and spatial details compared to previous models.
- Experimental results confirm superior skill in severe event detection and forecast persistence up to 120 minutes, with improved trade-offs between detection and false alarms.
IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Deterministic Radar Nowcasting
Introduction
The IMPA-Net framework targets radar-based deterministic nowcasting of convective precipitation. It addresses a persistent problem: deep learning models underpredict high-intensity echoes because training distributions are imbalanced and objectives are pixel-wise error measures. Existing methods often struggle to preserve severe convective cores and spatial structure at increasing lead times. IMPA-Net's contribution is a unified meteorology-informed design spanning the input-representation, architecture, and objective-function levels for 0–2 hour nowcasting. The framework integrates a parameter-free Spatial Mixer for local input fusion, a multi-scale attention-based IMPA module for spatiotemporal feature translation, and a Meteorologically-Aware Dynamic Loss (MAD-Loss) with multi-level asymmetric weighting.
Methodological Framework
IMPA-Net processes four heterogeneous input fields: radar reflectivity, surface precipitation rate, static topographic elevation, and climatological along-slope wind at 850 hPa. Rather than naive channel concatenation, the parameter-free Spatial Mixer deterministically reorganizes these fields within non-overlapping 2×2 neighborhoods, mixing per-pixel information across all sources at the mesoscale-γ (≈2 km) level. This acts as a structured, cross-field prior that enables the encoder to directly access local terrain-precipitation relationships and auxiliary modulating influences during early processing stages.
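The summary does not specify the exact rearrangement the Spatial Mixer performs. As a minimal sketch, assuming it behaves like a space-to-depth (pixel-unshuffle) operation over non-overlapping 2×2 blocks, the following NumPy function shows how four stacked fields can be reorganized without any learnable parameters. The function name `spatial_mix` and the channel ordering are hypothetical.

```python
import numpy as np

def spatial_mix(fields: np.ndarray) -> np.ndarray:
    """Parameter-free 2x2 rearrangement (space-to-depth) of stacked fields.

    ASSUMPTION: this is one plausible reading of the Spatial Mixer,
    not the authors' confirmed implementation.

    fields:  (C, H, W) with even H and W; C = 4 in the paper's setup
             (reflectivity, rain rate, elevation, 850 hPa along-slope wind).
    returns: (4 * C, H // 2, W // 2); each output pixel holds every field
             value from one non-overlapping 2x2 block.
    """
    c, h, w = fields.shape
    if h % 2 or w % 2:
        raise ValueError("H and W must be even")
    # Split each spatial axis into (block index, offset inside the block).
    blocks = fields.reshape(c, h // 2, 2, w // 2, 2)
    # Move the two in-block offsets next to the channel axis.
    blocks = blocks.transpose(0, 2, 4, 1, 3)  # (C, 2, 2, H/2, W/2)
    # Output channel index = field * 4 + row_offset * 2 + col_offset.
    return blocks.reshape(c * 4, h // 2, w // 2)
```

Under this reading, no values are created or discarded: the operation only relocates pixels so that a convolution at the first encoder layer sees all four sources for the same 2×2 neighborhood at once.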
IMPA Module: Multi-Scale Predictive Attention
The IMPA module serves as the model's spatiotemporal translator. Temporally encoded latent features from the encoder are processed via:
- Multi-scale depthwise convolutions (kernels 3×3, 5×5, 7×7) to capture joint spatial-temporal dependencies over a spectrum of meteorologically relevant scales.
- Global self-attention for long-range spatial relation modeling, allowing the network to learn context-aware nonlocal dependencies.
- Learnable channel-wise intensity calibration, where channel scaling parameters are jointly optimized with the loss function, specifically to counteract the common tendency of deep models to diminish extreme values.
- Residual convolutional detail recovery, ensuring high-resolution structural details persist through to the output.
This design enables effective aggregation of both storm-scale and mesoscale features, supporting improved severe event skill and spatial variability retention.
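The four components listed above can be illustrated with a small NumPy forward pass. This is a sketch under stated assumptions, not the published architecture: the order in which the components are combined, the averaging of the three kernel scales, the projection-free attention (identity query/key/value), and the plain skip connection standing in for the residual convolutional path are all simplifications. The names `depthwise_conv`, `spatial_self_attention`, and `impa_block` are hypothetical.

```python
import numpy as np

def depthwise_conv(x, kernels):
    """Depthwise 2-D cross-correlation with zero 'same' padding.

    x: (C, H, W); kernels: (C, k, k) with odd k, one kernel per channel.
    """
    _, h, w = x.shape
    k = kernels.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros(x.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            # Each channel is filtered only by its own kernel.
            out += kernels[:, dy, dx, None, None] * xp[:, dy:dy + h, dx:dx + w]
    return out

def spatial_self_attention(x):
    """Global self-attention over all H*W positions (no learned projections)."""
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T                 # (N, C), one token per pixel
    scores = tokens @ tokens.T / np.sqrt(c)        # (N, N) pairwise similarity
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return (weights @ tokens).T.reshape(c, h, w)

def impa_block(x, kernel_bank, channel_scale):
    """ASSUMED composition: multi-scale conv -> attention -> calibration -> skip."""
    multi = sum(depthwise_conv(x, k) for k in kernel_bank) / len(kernel_bank)
    attended = spatial_self_attention(multi)
    calibrated = channel_scale[:, None, None] * attended  # per-channel gain
    return x + calibrated                                  # residual path
```

In a trained model, `kernel_bank` would hold learned 3×3, 5×5, and 7×7 depthwise kernels and `channel_scale` would be the learnable calibration vector; here both are supplied by the caller.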
Meteorologically-Aware Dynamic Loss (MAD-Loss)
MAD-Loss embodies a composite formulation:
- A per-frame asymmetric "extreme" loss with higher penalties for underprediction of severe events, dynamically intensified for later forecast frames and high-fraction severe cases.
- Three sequence-level objectives: structural similarity (SSIM), spatial gradient preservation, and temporal consistency, each with epoch-adaptive weighting.
- All component weights evolve by pre-defined sigmoid schedules across training epochs, while additional temporal and storm-aware weightings adaptively emphasize difficult instances.
This multi-level mechanism creates directed optimization pressure to counteract lead-time–dependent skill loss, extreme-event underestimation, and spatial/structural smoothing.
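To make the weighting idea concrete, the sketch below implements a sigmoid epoch schedule and a per-frame asymmetric term with lead-time and severe-fraction weighting. The summary gives no formulas or hyperparameters, so every value here (`under_penalty`, `lead_gain`, `steepness`, the linear lead-time ramp, the `1 + severe_fraction` factor) is an illustrative assumption, and the SSIM, gradient, and temporal-consistency terms are omitted.

```python
import numpy as np

def sigmoid_schedule(epoch, w_start, w_end, midpoint, steepness=0.5):
    """Weight moving smoothly from w_start to w_end around `midpoint` (assumed form)."""
    s = 1.0 / (1.0 + np.exp(-steepness * (epoch - midpoint)))
    return w_start + (w_end - w_start) * s

def asymmetric_extreme_loss(pred, target, severe_dbz=45.0,
                            under_penalty=4.0, lead_gain=1.0):
    """Asymmetric squared error over a forecast sequence.

    pred, target: (T, H, W) reflectivity in dBZ.
    All weights below are hypothetical choices, not values from the paper.
    """
    err = pred - target
    # Underpredicting a severe pixel costs `under_penalty` times more.
    under_severe = (target >= severe_dbz) & (err < 0)
    pixel_w = np.where(under_severe, under_penalty, 1.0)
    per_frame = (pixel_w * err ** 2).mean(axis=(1, 2))       # (T,)

    t = pred.shape[0]
    # Later frames weigh more: linear ramp from 1 to 1 + lead_gain.
    lead_w = 1.0 + lead_gain * np.arange(t) / max(t - 1, 1)
    # Frames with a larger severe-pixel fraction get extra weight.
    storm_w = 1.0 + (target >= severe_dbz).mean(axis=(1, 2))
    return float((lead_w * storm_w * per_frame).sum() / lead_w.sum())
```

A training loop could then scale this term by something like `sigmoid_schedule(epoch, 0.1, 1.0, midpoint=20)`, so the severe-event pressure ramps up after the model has learned the bulk structure.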
Experimental Evaluation
Dataset and Baselines
The evaluation domain is Jiangsu Province, China, with S-band radar composites and auxiliary environmental fields covering the convective seasons of 2019–2021. Each input is a sequence of 20 historical frames (two hours), and predictions extend 20 frames (six minutes to two hours) into the future. Seven baseline models are compared, spanning encoder-decoder, recurrent, convolutional, attention-based, and extrapolation classes. All use identical input and training setups except pySTEPS, which is a non-learned extrapolation method.
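The 20-in, 20-out setup implies a 6-minute frame interval (120 minutes over 20 frames). A minimal sketch of how such input/target pairs might be cut from a continuous radar sequence follows; the function name `make_samples` and the `stride` parameter are hypothetical, and the actual sampling strategy is not described in the summary.

```python
import numpy as np

FRAME_MINUTES = 6      # inferred: 20 frames span two hours
N_IN, N_OUT = 20, 20   # history and forecast lengths from the text

def make_samples(frames: np.ndarray, stride: int = 1):
    """Cut a (T, H, W) sequence into sliding input/target windows.

    returns: inputs (N, N_IN, H, W) and targets (N, N_OUT, H, W).
    """
    total = N_IN + N_OUT
    if frames.shape[0] < total:
        raise ValueError(f"need at least {total} frames")
    starts = range(0, frames.shape[0] - total + 1, stride)
    inputs = np.stack([frames[s:s + N_IN] for s in starts])
    targets = np.stack([frames[s + N_IN:s + total] for s in starts])
    return inputs, targets

# Lead time in minutes of each forecast frame: 6, 12, ..., 120.
lead_times = [FRAME_MINUTES * (i + 1) for i in range(N_OUT)]
```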
Numerical Results
- At the severe-event threshold (≥45 dBZ), IMPA-Net achieves Heidke Skill Score (HSS) of 0.143 (vs. 0.049 for SimVP) and CSI of 0.084, notably higher than any deep learning baseline under identical settings.
- At moderate intensity (≥35 dBZ), IMPA-Net delivers the highest CSI (0.277) and POD (0.379).
- Skill retention with increasing lead time is markedly improved: where other deep learning models exhibit near-zero CSI/POD for severe events after 48–72 min, IMPA-Net maintains nontrivial skill through 120 min, with the slowest rate of decay among all tested models.
- The false alarm ratio at high thresholds is lower than for pySTEPS, yielding a more advantageous trade-off between detection and false alarms.
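For reference, the scores quoted above are all derived from a 2×2 contingency table of thresholded forecasts and observations. The sketch below uses the standard definitions of POD, FAR, CSI, and HSS; the function name `categorical_scores` is hypothetical, and the summary does not state whether the paper pools pixels over all cases or averages per case.

```python
import numpy as np

def categorical_scores(pred_dbz, obs_dbz, threshold_dbz):
    """POD, FAR, CSI and HSS for the event `value >= threshold_dbz`."""
    pred = np.asarray(pred_dbz) >= threshold_dbz
    obs = np.asarray(obs_dbz) >= threshold_dbz
    hits = float(np.sum(pred & obs))
    misses = float(np.sum(~pred & obs))
    false_alarms = float(np.sum(pred & ~obs))
    correct_neg = float(np.sum(~pred & ~obs))

    def ratio(num, den):
        # Undefined scores (empty denominator) are reported as NaN.
        return num / den if den > 0 else float("nan")

    hss_den = ((hits + misses) * (misses + correct_neg)
               + (hits + false_alarms) * (false_alarms + correct_neg))
    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
        "HSS": ratio(2.0 * (hits * correct_neg - misses * false_alarms), hss_den),
    }
```

Because CSI ignores correct negatives while HSS measures skill relative to chance, the two can rank models differently for rare events such as ≥45 dBZ echoes, which is why both are reported.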
Spectral and Spatial Structure
Radially averaged power spectral density analysis shows that IMPA-Net preserves mesoscale (20–200 km) and storm-core (2–20 km) power more reliably than deep learning baselines at long forecast horizons. Ablation studies demonstrate that these gains result from the synergistic effect of input mixing, multi-scale attention, and adaptive loss, with MAD-Loss and the IMPA module contributing the largest HSS/POD improvements in isolation.
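A radially averaged power spectral density can be computed by taking the 2-D FFT power of a field and averaging it over rings of constant wavenumber. The sketch below assumes a square field and a uniform grid spacing `pixel_km` (1 km by default, which is an assumption, not a value from the summary); the function name `radially_averaged_psd` is hypothetical, and the paper's own windowing or normalization choices are unknown.

```python
import numpy as np

def radially_averaged_psd(field, pixel_km=1.0):
    """Mean spectral power per integer wavenumber ring.

    field: square (N, N) array; returns (wavenumber in cycles/km, psd).
    """
    field = np.asarray(field, dtype=float)
    n = field.shape[0]
    if field.shape != (n, n):
        raise ValueError("expects a square 2-D field")
    power = np.abs(np.fft.fft2(field)) ** 2 / (n * n)
    freq = np.fft.fftfreq(n, d=pixel_km)             # cycles per km
    k = np.hypot(freq[:, None], freq[None, :])       # radial wavenumber
    dk = 1.0 / (n * pixel_km)                        # width of one ring
    ring = np.rint(k / dk).astype(int)
    n_bins = n // 2 + 1                              # up to the Nyquist ring
    keep = ring < n_bins
    total = np.bincount(ring[keep], weights=power[keep], minlength=n_bins)
    count = np.bincount(ring[keep], minlength=n_bins)
    return np.arange(n_bins) * dk, total / np.maximum(count, 1)
```

Wavelength in km is the reciprocal of the returned wavenumber, so the mesoscale (20–200 km) and storm-core (2–20 km) bands mentioned above correspond to 0.005–0.05 and 0.05–0.5 cycles/km respectively. A forecast that blurs with lead time shows a drop in the high-wavenumber part of this curve relative to observations.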
Qualitative and Process-Level Assessment
Case studies demonstrate that IMPA-Net replicates the spatial reorganization and lifecycle evolution of mesoscale convective bands, retaining both macroscopic translation and microscopic intensity redistribution. Attention attribution analysis confirms that non-local upstream–downstream dependencies, relevant for severe event redevelopment, are identified and exploited by the model.
Implications and Forward-Looking Discussion
IMPA-Net establishes that structured, meteorology-informed model design can overcome limitations endemic to generic deep learning and extrapolation approaches for radar nowcasting. The coordinated fusion of environmental priors, multi-scale dynamical feature learning, and asymmetric loss shaping directs the model toward improved severe-event detection with controlled smoothing, which is especially critical at extended lead times.
There are operational trade-offs: superior skill at high reflectivity thresholds is accompanied by increased event persistence and some false alarms. This balance suits safety-critical severe weather nowcasts but requires calibration to regional risk tolerance. The choice of static environmental priors is a pragmatic balance between physical realism and real-time feasibility, but it limits explicit mechanistic interpretability. The model's generalizability to domains with more complex terrain and evolving meteorological forcings remains to be assessed. The framework itself readily admits time-dependent auxiliary variables and dynamic environmental reanalysis sources, which is a natural avenue for future extension.
Additionally, the dynamic loss schedule is currently hand-designed rather than learned, which leaves room for research on meta-learning or automated loss parameterization.
Conclusion
The IMPA-Net framework demonstrates empirically that integrating meteorologically aware input reorganization, multi-scale attention-based translation, and adaptive asymmetric loss prioritization advances the state of the art in deterministic convective radar nowcasting. The model improves severe echo detection and spatial structure retention without substantially sacrificing general forecast skill or overamplifying false alarms. The approach offers a blueprint for leveraging structured domain knowledge at all levels of deep learning model design in geophysical sequence prediction, while highlighting the need for further work on cross-domain validation and the integration of dynamically evolving environmental drivers.