
IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Extreme Convective Radar Nowcasting

Published 27 Apr 2026 in cs.LG | (2604.24224v1)

Abstract: Short-range prediction of convective precipitation from weather radar observations is essential for severe weather warnings. However, deep learning models trained with pixel-wise error metrics tend to produce overly smooth forecasts that suppress intense echoes critical for hazard detection. This issue is exacerbated by insufficient multi-scale feature interaction and suboptimal fusion of heterogeneous geophysical inputs. We propose IMPA-Net (Integrated Multi-scale Predictive Attention Network), a deterministic 0-2 hour nowcasting framework that addresses these limitations through meteorologically-informed designs at the input, architecture, and loss function levels. A parameter-free Spatial Mixer reorganizes heterogeneous input channels at the mesoscale-$γ$ neighborhood (~2 km) via deterministic channel permutation, providing a structured cross-field prior. An integrated multi-scale predictive attention module serves as the spatiotemporal translator, capturing dynamics from mesoscale-$β$ to mesoscale-$γ$ scales. A Meteorologically-Aware Dynamic Loss employs three-level asymmetric weighting -- adapting across training epochs, storm intensity, and forecast lead time -- to counteract regression-to-the-mean. Evaluated against seven baselines on a multi-source radar dataset over eastern China, IMPA-Net raises the Heidke Skill Score at $\geq$45 dBZ from 0.049 (SimVP baseline) to 0.143 under matched settings. Relative to pySTEPS, it provides a better trade-off between severe-event detection and false-alarm control. Spectral analysis confirms preserved energy across mesoscale bands where competing methods show progressive smoothing. These improvements are shown within a single domain and convective regime; generalizability to other orographic and climatic regions remains to be tested.

Summary

  • The paper introduces a meteorology-informed deep learning framework that fuses diverse environmental inputs to enhance convective radar nowcasting.
  • It employs multi-scale attention and a dynamic loss function to better capture severe echo intensities and spatial details compared to previous models.
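  • Experiments on a radar dataset over eastern China show higher severe-event skill than the deep learning baselines (HSS at ≥45 dBZ of 0.143 vs. 0.049 for SimVP), nontrivial skill retained out to 120 minutes, and a better trade-off between detection and false alarms than pySTEPS.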

IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Deterministic Radar Nowcasting

Introduction

The IMPA-Net framework targets deterministic radar-based nowcasting of convective precipitation. It addresses a persistent problem: deep learning models underpredict high-intensity echoes because of imbalanced training distributions and pixel-wise error objectives. Existing methods often fail to preserve severe convective cores and spatial structure as lead time increases. IMPA-Net's contribution is a unified, meteorology-informed design spanning the input representation, the architecture, and the objective function for 0–2 hour nowcasting. The framework integrates a parameter-free Spatial Mixer for local input fusion, a multi-scale attention-based IMPA module for spatiotemporal feature translation, and a Meteorologically-Aware Dynamic Loss (MAD-Loss) with multi-level asymmetric weighting.

Methodological Framework

Input Fusion and Spatial Mixer

IMPA-Net processes four heterogeneous input fields: radar reflectivity, surface precipitation rate, static topographic elevation, and climatological along-slope wind at 850 hPa. Rather than naive channel concatenation, the parameter-free Spatial Mixer deterministically reorganizes these fields within non-overlapping 2×2 neighborhoods, mixing per-pixel information across all sources at the mesoscale-γ (≈2 km) level. This acts as a structured, cross-field prior that enables the encoder to directly access local terrain-precipitation relationships and auxiliary modulating influences during early processing stages.
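
The summary does not spell out the exact permutation, so the following is only a minimal sketch of one possible reading: a space-to-depth rearrangement over non-overlapping 2×2 blocks, with output channels ordered so that all four fields at the same in-block offset sit next to each other. The function name `spatial_mixer`, the `(C, H, W)` array layout, and the channel ordering are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def spatial_mixer(fields: np.ndarray, block: int = 2) -> np.ndarray:
    """Parameter-free rearrangement of stacked fields (hypothetical sketch).

    fields: array of shape (C, H, W), e.g. C = 4 for reflectivity,
            precipitation rate, elevation, and 850 hPa wind.
    Returns an array of shape (block*block*C, H//block, W//block) in which
    output channel (dy*block + dx)*C + c holds field c sampled at in-block
    offset (dy, dx). No values are created or discarded.
    """
    c, h, w = fields.shape
    if h % block or w % block:
        raise ValueError("H and W must be divisible by the block size")
    # Split each spatial axis into (coarse index, offset inside the block).
    x = fields.reshape(c, h // block, block, w // block, block)
    # Reorder to (dy, dx, field, coarse_y, coarse_x): a fixed permutation.
    x = x.transpose(2, 4, 0, 1, 3)
    return x.reshape(block * block * c, h // block, w // block)
```

With this ordering, the first C output channels hold all input fields at the top-left pixel of each block, so an encoder's first convolution sees co-located terrain, wind, and precipitation values as adjacent channels.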

IMPA Module: Multi-Scale Predictive Attention

The IMPA module serves as the model's spatiotemporal translator. Temporally encoded latent features from the encoder are processed via:

  • Multi-scale depthwise convolutions (kernels 3×3, 5×5, 7×7) to capture joint spatial-temporal dependencies over a spectrum of meteorologically relevant scales.
  • Global self-attention, providing long-range spatial relation modeling, allowing the network to learn context-aware nonlocal dependencies.
  • Learnable channel-wise intensity calibration, where channel scaling parameters are jointly optimized with the loss function, specifically to counteract the common tendency of deep models to diminish extreme values.
  • Residual convolutional detail recovery, ensuring high-resolution structural details persist through to the output.

This design enables effective aggregation of both storm-scale and mesoscale features, supporting improved severe event skill and spatial variability retention.
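
How the four ingredients are wired together is not given in this summary, so the block below is a rough sketch of one plausible composition rather than the published architecture. It assumes a single latent tensor of shape `(C, H, W)`, uses the features themselves as queries, keys, and values (no learned projections), and averages the multi-scale branches. The names `depthwise_conv`, `global_self_attention`, and `impa_block`, and the parameters `gamma` and `detail_kernel`, are hypothetical.

```python
import numpy as np

def depthwise_conv(x, kernel):
    """Per-channel 2D cross-correlation with zero 'same' padding.

    x: (C, H, W); kernel: (C, k, k) with odd k (e.g. 3, 5, or 7).
    """
    c, h, w = x.shape
    k = kernel.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c, h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += kernel[:, dy, dx, None, None] * xp[:, dy:dy + h, dx:dx + w]
    return out

def global_self_attention(x):
    """Softmax attention over all H*W positions (simplified: Q = K = V = x)."""
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T                 # (N, C), one token per pixel
    scores = tokens @ tokens.T / np.sqrt(c)        # (N, N) pairwise similarity
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights @ tokens).T.reshape(c, h, w)

def impa_block(x, kernels, gamma, detail_kernel):
    """Assumed wiring: multi-scale conv -> attention -> channel scale -> residual."""
    multi = sum(depthwise_conv(x, k) for k in kernels) / len(kernels)
    context = global_self_attention(multi)
    calibrated = gamma[:, None, None] * context    # channel-wise intensity scaling
    detail = depthwise_conv(x, detail_kernel)      # residual detail branch
    return x + calibrated + detail
```

In a trained model the kernels and `gamma` would be learned; here they are plain arrays so the data flow can be inspected in isolation.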

Meteorologically-Aware Dynamic Loss (MAD-Loss)

MAD-Loss combines several components:

  • A per-frame asymmetric "extreme" loss with higher penalties for underprediction of severe events, dynamically intensified for later forecast frames and high-fraction severe cases.
  • Three sequence-level objectives: structural similarity (SSIM), spatial gradient preservation, and temporal consistency, each with epoch-adaptive weighting.
  • All component weights follow predefined sigmoid schedules across training epochs, while additional temporal and storm-aware weighting adaptively emphasizes difficult instances.

This multi-level mechanism creates directed optimization pressure to counteract lead-time–dependent skill loss, extreme-event underestimation, and spatial/structural smoothing.
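
The three weighting levels can be illustrated with a small sketch. The exact functional forms and constants are not given in this summary, so everything below is an assumption: a logistic schedule for epoch-dependent weights, and a weighted squared error whose penalty grows for underpredicted severe pixels, for later lead times, and for frames with a larger severe fraction. The names `sigmoid_schedule` and `asymmetric_frame_loss` and all default values are hypothetical.

```python
import math

def sigmoid_schedule(epoch, start, end, midpoint, steepness=0.5):
    """Move a loss weight smoothly from `start` to `end` around `midpoint`."""
    s = 1.0 / (1.0 + math.exp(-steepness * (epoch - midpoint)))
    return start + (end - start) * s

def asymmetric_frame_loss(pred, target, lead_index, n_frames,
                          severe_dbz=45.0, under_penalty=4.0,
                          lead_gain=1.0, storm_gain=1.0):
    """Weighted squared error for one forecast frame (illustrative only).

    pred, target: flat sequences of reflectivity values in dBZ.
    lead_index:   0-based index of the frame within the forecast sequence.
    """
    # Lead-time level: later frames get a larger weight.
    lead_weight = 1.0 + lead_gain * lead_index / max(n_frames - 1, 1)
    # Storm level: frames with a larger fraction of severe pixels weigh more.
    severe_fraction = sum(t >= severe_dbz for t in target) / len(target)
    storm_weight = 1.0 + storm_gain * severe_fraction
    total = 0.0
    for p, t in zip(pred, target):
        err = p - t
        # Pixel level: underpredicting a severe echo costs more than overpredicting.
        w = under_penalty if (t >= severe_dbz and err < 0) else 1.0
        total += w * err * err
    return lead_weight * storm_weight * total / len(target)
```

Under these assumptions, missing a 50 dBZ echo by 10 dBZ costs four times as much as overshooting it by the same amount, and the same error at the last frame costs twice as much as at the first.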

Experimental Evaluation

Dataset and Baselines

The evaluation domain is Jiangsu Province, China, with S-band radar composites and auxiliary environmental fields covering the convective seasons of 2019–2021. Each sample pairs 20 historical input frames (two hours) with 20 predicted frames, covering lead times from six minutes to two hours. Seven baselines spanning encoder-decoder, recurrent, convolutional, attention-based, and extrapolation approaches are compared. All learned baselines use identical inputs and training setups; pySTEPS, a non-learned extrapolator, is the exception.

Numerical Results

  • At the severe-event threshold (≥45 dBZ), IMPA-Net achieves Heidke Skill Score (HSS) of 0.143 (vs. 0.049 for SimVP) and CSI of 0.084, notably higher than any deep learning baseline under identical settings.
  • At moderate intensity (≥35 dBZ), IMPA-Net delivers the highest CSI (0.277) and POD (0.379).
  • Skill retention with increasing lead time is markedly improved: where other deep learning models exhibit near-zero CSI/POD for severe events after 48–72 min, IMPA-Net maintains nontrivial skill through 120 min, with the slowest rate of decay among all tested models.
  • The false alarm ratio at high thresholds is lower than for pySTEPS, yielding a more advantageous trade-off between detection and false alarms.
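
For readers unfamiliar with the scores quoted above, the sketch below computes them from a 2×2 contingency table using the standard definitions of POD, FAR, CSI, and HSS. The function name `categorical_scores`, the flat-list input, and the convention of returning 0.0 when a denominator is zero are illustrative choices, not taken from the paper.

```python
def categorical_scores(pred, obs, threshold_dbz):
    """Threshold-based verification scores for paired forecast/observation values."""
    hits = false_alarms = misses = correct_negatives = 0
    for p, o in zip(pred, obs):
        forecast_yes = p >= threshold_dbz
        observed_yes = o >= threshold_dbz
        if forecast_yes and observed_yes:
            hits += 1
        elif forecast_yes:
            false_alarms += 1
        elif observed_yes:
            misses += 1
        else:
            correct_negatives += 1

    def ratio(num, den):
        # Convention for this sketch: undefined scores are reported as 0.0.
        return num / den if den else 0.0

    a, b, c, d = hits, false_alarms, misses, correct_negatives
    return {
        "POD": ratio(a, a + c),          # probability of detection
        "FAR": ratio(b, a + b),          # false alarm ratio
        "CSI": ratio(a, a + b + c),      # critical success index
        "HSS": ratio(2 * (a * d - b * c),
                     (a + c) * (c + d) + (a + b) * (b + d)),  # Heidke skill score
    }
```

HSS measures accuracy relative to random chance, so a value of 0 means no skill beyond chance and 1 means a perfect forecast; this is why small absolute values at ≥45 dBZ can still represent a meaningful relative gain for rare events.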

Spectral and Spatial Structure

Radially averaged power spectral density analysis shows that IMPA-Net preserves mesoscale (20–200 km) and storm-core (2–20 km) power more reliably than deep learning baselines at long forecast horizons. Ablation studies demonstrate that these gains result from the synergistic effect of input mixing, multi-scale attention, and adaptive loss, with MAD-Loss and the IMPA module contributing the largest HSS/POD improvements in isolation.
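
A radially averaged power spectrum of this kind can be computed along the following lines. This is a generic sketch of the technique, not the paper's evaluation code: the function name `radial_psd`, the default 1 km pixel spacing, the normalization, and the integer-wavenumber binning are all assumptions.

```python
import numpy as np

def radial_psd(field, pixel_km=1.0):
    """Radially averaged power spectral density of a 2D field.

    Returns (wavelengths_km, psd), ordered from the longest wavelength
    to the shortest resolved one.
    """
    h, w = field.shape
    # 2D power spectrum of the mean-removed field, zero frequency centered.
    spectrum = np.fft.fftshift(np.fft.fft2(field - field.mean()))
    power = np.abs(spectrum) ** 2 / (h * w)
    # Radial wavenumber (cycles per km) of every spectral pixel.
    ky = np.fft.fftshift(np.fft.fftfreq(h, d=pixel_km))
    kx = np.fft.fftshift(np.fft.fftfreq(w, d=pixel_km))
    kr = np.hypot(*np.meshgrid(ky, kx, indexing="ij"))
    # Bin by integer multiples of the fundamental wavenumber.
    dk = 1.0 / (max(h, w) * pixel_km)
    bins = np.rint(kr / dk).astype(int)
    n_bins = min(h, w) // 2
    mask = (bins >= 1) & (bins <= n_bins)
    sums = np.bincount(bins[mask], weights=power[mask], minlength=n_bins + 1)
    counts = np.bincount(bins[mask], minlength=n_bins + 1)
    psd = sums[1:] / np.maximum(counts[1:], 1)
    wavelengths_km = 1.0 / (np.arange(1, n_bins + 1) * dk)
    return wavelengths_km, psd
```

Comparing the forecast spectrum with the observed one at each lead time shows which wavelength bands lose power; progressive smoothing appears as a deficit that spreads from the shortest wavelengths toward longer ones.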

Qualitative and Process-Level Assessment

Case studies demonstrate that IMPA-Net replicates the spatial reorganization and lifecycle evolution of mesoscale convective bands, retaining both macroscopic translation and microscopic intensity redistribution. Attention attribution analysis confirms that non-local upstream–downstream dependencies, relevant for severe event redevelopment, are identified and exploited by the model.

Implications and Forward-Looking Discussion

IMPA-Net shows, within the tested domain, that structured, meteorology-informed model design can overcome limitations common to generic deep learning and extrapolation approaches for radar nowcasting. The coordinated fusion of environmental priors, multi-scale dynamical feature learning, and asymmetric loss shaping directs the model toward improved severe-event detection with controlled smoothing, which is especially critical at extended lead times.

There are operational trade-offs. Superior skill at high reflectivity thresholds is accompanied by increased event persistence and some false alarms, which suits safety-critical severe weather nowcasts but requires calibration to regional risk tolerance. The choice of static environmental priors is a pragmatic balance between physical realism and real-time feasibility, but it limits explicit mechanistic interpretability. The model's generalizability to domains with more complex terrain and evolving meteorological forcings remains to be assessed. The framework itself readily admits time-dependent auxiliary variables and dynamic environmental reanalysis sources, which is a natural avenue for future extension.

Additionally, the dynamic loss schedule is currently hand-designed rather than learned, which leaves room for research on meta-learning or automated loss parameterization.

Conclusion

The IMPA-Net framework provides quantitative evidence that integrating meteorologically aware input reorganization, multi-scale attention-based translation, and adaptive asymmetric loss prioritization improves deterministic convective radar nowcasting over the compared baselines. The model improves severe echo detection and spatial structure retention without substantially sacrificing general forecast skill or overamplifying false alarms. The approach offers a blueprint for using structured domain knowledge at all levels of deep learning model design in geophysical sequence prediction, while highlighting the need for further work on cross-domain validation and the integration of dynamically evolving environmental drivers.
