Intelligent Shanghai Typhoon Model (ISTM)
- ISTM is a unified generative probabilistic forecasting system that employs a two-stage UNet-Diffusion framework to achieve kilometer-scale typhoon downscaling.
- Its architecture combines deterministic UNet regression with a conditional diffusion model to enhance ensemble-based uncertainty quantification and mesoscale feature reconstruction.
- ISTM serves as an efficient AI surrogate for traditional physics-based models, providing over 20× faster simulation times while maintaining high fidelity in track, intensity, and convective precipitation forecasts.
The Intelligent Shanghai Typhoon Model (ISTM) is a unified regional-to-typhoon generative probabilistic forecasting system designed for fast, accurate, and physically plausible kilometer-scale downscaling of typhoon forecasts. It integrates a two-stage UNet-Diffusion framework for super-resolution of coarse meteorological inputs and acts as a plug-in emulator of a hybrid ML-physics modeling core, the Shanghai Typhoon Model (SHTM), within the operational data fusion paradigm that couples AI weather prediction (AIWP) with physics-based models. ISTM runs far faster than computationally intensive high-resolution numerical simulations and reconstructs fine-scale structure more faithfully than baseline AI regression models, while supporting ensemble-based uncertainty quantification and enabling the co-evolution of AI and physics-based approaches in operational typhoon forecasting (Niu et al., 23 Aug 2025; Niu, 1 Mar 2025).
1. Model Architecture and Generative Framework
ISTM is structured as a two-stage super-resolution emulator:
Stage 1: Deterministic UNet Regression
- Inputs are coarse-grained atmospheric fields $x$ (e.g., 0.25° ERA5 reanalysis), upsampled to a 0.1° target grid.
- The architecture is a 6-level symmetric encoder–decoder UNet with ResNet-based blocks, skip connections, and progressive strided downsampling/upsampling.
- The network $f_\theta$ outputs an initial estimate of the high-resolution (9 km-interpolated) target field, $\hat{y}_{\mathrm{UNet}} = f_\theta(x)$, optimized under a mean squared error (MSE) loss:
$$\mathcal{L}_{\mathrm{MSE}} = \mathbb{E}\,\lVert y - f_\theta(x) \rVert_2^2$$
Stage 2: Conditional Diffusion Model (CDM) on the Residual
- The residual $r = y - \hat{y}_{\mathrm{UNet}}$ is modeled via a conditional diffusion process.
- The noising process is defined as $r_t = \alpha_t r + \sigma_t \epsilon$, with a cosine noise schedule for $(\alpha_t, \sigma_t)$.
- The reverse denoising model is a 4-stage UNet incorporating spatial self-attention at the bottleneck and FiLM-modulated temporal embeddings generated by a sinusoidal positional encoder followed by an MLP.
- The loss adopts the "pred_v" parameterization:
$$\mathcal{L}_{v} = \mathbb{E}_{t,\epsilon}\,\lVert v_\phi(r_t, t, x) - v \rVert_2^2$$
where $v = \alpha_t \epsilon - \sigma_t r$, with $\epsilon \sim \mathcal{N}(0, I)$.
- Inference proceeds by sampling $r_T \sim \mathcal{N}(0, I)$ and iteratively applying the denoiser to reconstruct $\hat{r}$, yielding $\hat{y} = \hat{y}_{\mathrm{UNet}} + \hat{r}$.
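The v-parameterization used in Stage 2 can be illustrated on a single scalar residual value. The sketch below assumes the common variance-preserving form with $\alpha_t = \cos(\pi t/2)$ and $\sigma_t = \sin(\pi t/2)$ for $t \in [0, 1]$. That specific cosine schedule, the function names, and the toy numbers are illustrative assumptions rather than details confirmed by the source.

```python
import math
import random


def cosine_schedule(t):
    """Return (alpha_t, sigma_t) for t in [0, 1]; assumed variance-preserving cosine form."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def noise_residual(r0, eps, t):
    """Forward (noising) process: r_t = alpha_t * r0 + sigma_t * eps."""
    alpha, sigma = cosine_schedule(t)
    return alpha * r0 + sigma * eps


def v_target(r0, eps, t):
    """Regression target under v-prediction: v = alpha_t * eps - sigma_t * r0."""
    alpha, sigma = cosine_schedule(t)
    return alpha * eps - sigma * r0


def recover_from_v(r_t, v, t):
    """Invert the parameterization: r0 = alpha*r_t - sigma*v, eps = sigma*r_t + alpha*v."""
    alpha, sigma = cosine_schedule(t)
    return alpha * r_t - sigma * v, sigma * r_t + alpha * v


rng = random.Random(0)
r0 = 1.7                     # hypothetical residual value at one grid point
eps = rng.gauss(0.0, 1.0)    # standard normal noise draw
t = 0.3                      # diffusion time in [0, 1]

r_t = noise_residual(r0, eps, t)
v = v_target(r0, eps, t)
r0_hat, eps_hat = recover_from_v(r_t, v, t)
```

Because $\alpha_t^2 + \sigma_t^2 = 1$ under this schedule, a perfect prediction of $v$ recovers both the clean residual and the noise exactly, which is why a single network output suffices at every denoising step.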
This design allows ISTM to efficiently approximate the conditional distribution $p(y \mid x)$ of the high-resolution field $y$ given the coarse input $x$, providing quantifiable uncertainty through ensemble sampling.
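Ensemble generation from the two stages can be sketched end to end on a scalar toy problem. Everything specific here is an assumption made for illustration: the residual is taken to be Gaussian so that the "trained" denoiser can be replaced by a closed-form stand-in (`toy_v_model`), the schedule is the cosine form $\alpha_t = \cos(\pi t/2)$, $\sigma_t = \sin(\pi t/2)$, and the reverse pass is a deterministic DDIM-style update, which the source does not confirm as ISTM's sampler. Only the overall pattern (Stage 1 estimate plus a sampled Stage 2 residual, repeated for several noise draws) follows the description above.

```python
import math
import random
import statistics

MU, S = 0.8, 0.5   # toy residual distribution N(MU, S^2); made-up numbers


def schedule(t):
    """Assumed cosine schedule: alpha = cos(pi t / 2), sigma = sin(pi t / 2)."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def toy_v_model(z_t, t):
    """Stand-in for the trained denoiser: exact v-prediction for a Gaussian residual."""
    alpha, sigma = schedule(t)
    var_t = alpha**2 * S**2 + sigma**2
    r0_mean = MU + alpha * S**2 * (z_t - alpha * MU) / var_t   # E[r0 | z_t]
    eps_mean = (z_t - alpha * r0_mean) / sigma                 # implied noise estimate
    return alpha * eps_mean - sigma * r0_mean


def sample_residual(rng, n_steps=50):
    """Deterministic DDIM-style reverse pass from t = 1 (pure noise) down to t = 0."""
    z = rng.gauss(0.0, 1.0)
    for i in range(n_steps, 0, -1):
        t, t_next = i / n_steps, (i - 1) / n_steps
        alpha, sigma = schedule(t)
        v = toy_v_model(z, t)
        r0_hat = alpha * z - sigma * v        # current estimate of the clean residual
        eps_hat = sigma * z + alpha * v       # current estimate of the noise
        alpha_next, sigma_next = schedule(t_next)
        z = alpha_next * r0_hat + sigma_next * eps_hat
    return z


def sample_ensemble(unet_estimate, n_members, seed=0):
    """Each member = deterministic Stage 1 estimate + one sampled Stage 2 residual."""
    rng = random.Random(seed)
    return [unet_estimate + sample_residual(rng) for _ in range(n_members)]


members = sample_ensemble(unet_estimate=10.0, n_members=500)
ens_mean = statistics.fmean(members)
ens_spread = statistics.stdev(members)
```

In this toy setting the ensemble mean lands near `10.0 + MU` and the spread near `S`, slightly reduced by the finite number of steps. In a real system each member would be a full multi-channel field rather than a scalar.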
2. Data Mapping, Training Regimen, and Fine-Tuning
ISTM learns a mapping from large-scale, low-resolution meteorological analysis (ERA5 at 25 km) or AIWP model outputs (AIFS) to high-resolution SHTM reanalysis targets on a 9 km grid:
- Input Variables: 2 m temperature, 10 m wind, mean sea level pressure (MSLP), and total column water vapor (all surface fields), plus geopotential, temperature, and winds at 850 hPa and 500 hPa.
- Input Tensor: 13 channels, spatially upsampled via interpolation to 521 × 721 grid points at 0.1° spacing.
- Target Variables: High-resolution 2 m temperature, 10 m wind, MSLP, and maximum radar reflectivity.
- Training Dataset: 6-hourly SHTM hybrid reanalysis over the western North Pacific for 2021–2024, with September 2024 held out as an independent test set.
- Optimization: AdamW, batch size of 1 per GPU, 200 epochs on 8 NVIDIA A100 GPUs, with automatic mixed precision (AMP) and gradient clipping at 0.5.
- Fine-Tuning: Additional training on one month (June 2025) of paired AIFS and SHTM forecasts. No explicit domain adaptation is used; training simply continues to minimize the composite MSE and diffusion objectives.
No explicit physical-constraint losses are applied. Incorporation of physical constraints is noted as a prospective enhancement (Niu et al., 23 Aug 2025).
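The channel count and grid dimensions quoted above can be cross-checked with a few lines of arithmetic. The short variable names, the split of each wind into two components, and the 52° × 72° domain extent are inferences made for this sketch; the source states only the 13-channel total, the 521 × 721 grid, and the 0.1° and 0.25° spacings.

```python
# Assumed channel layout: winds are counted as separate u and v components.
SURFACE_VARS = ["t2m", "u10", "v10", "mslp", "tcwv"]
LEVEL_VARS = ["z", "t", "u", "v"]
LEVELS_HPA = [850, 500]

channels = SURFACE_VARS + [f"{var}{lev}" for lev in LEVELS_HPA for var in LEVEL_VARS]


def n_points(extent_deg, spacing_deg):
    """Number of grid points spanning extent_deg, counting both end points."""
    return round(extent_deg / spacing_deg) + 1


# Domain extent implied by 521 x 721 points at 0.1 degree spacing (an inference).
LAT_EXTENT_DEG, LON_EXTENT_DEG = 52.0, 72.0

target_shape = (n_points(LAT_EXTENT_DEG, 0.1), n_points(LON_EXTENT_DEG, 0.1))
coarse_shape = (n_points(LAT_EXTENT_DEG, 0.25), n_points(LON_EXTENT_DEG, 0.25))
input_tensor_shape = (len(channels),) + target_shape

print(len(channels), target_shape, coarse_shape)
```

Under these assumptions the same domain holds 209 × 289 points at ERA5's 0.25° spacing, so the interpolation step increases the number of grid points by roughly a factor of six before the UNet sees the data.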
3. Probabilistic Forecasting and Downscaling Capability
ISTM functions as a probabilistic generative emulator, producing ensembles of high-resolution fields:
- Ensemble Sampling: Multiple draws of the initial noise $r_T \sim \mathcal{N}(0, I)$ yield ensemble members $\hat{y}^{(k)}$, $k = 1, \dots, K$, capturing forecast uncertainty, particularly in meso- and convective-scale features.
- Downscaling Performance: ISTM accurately reconstructs near-surface winds and extreme reflectivity structures from coarse input, including terrain-induced gusts and eyewall organization, which are systematically underestimated by both ERA5 and baseline deterministic regression models.
- Metrics: For maximum 10 m wind in Typhoon Yagi (6 Sep 2024), ERA5 gives 25.5 m/s, the high-resolution SHTM reference (HiRes) 43.5 m/s, the baseline UNet 38 m/s, and ISTM 42 m/s. For the PDF envelope of radar reflectivity (1–15 Sep 2024), the ISTM ensemble recovers the full 20–45 dBZ range, whereas deterministic regression fails above 20 dBZ. The threat score (TS) at extreme precipitation and reflectivity thresholds is consistently higher for ISTM (Niu et al., 23 Aug 2025).
| System | Max 10 m Wind (m/s, Yagi) | Reflectivity TS (≥30 dBZ) | Compute Time (120 h forecast) |
|---|---|---|---|
| ERA5 | 25.5 | Low | N/A |
| UNet | 38 | Underestimates | 3 min (A100) |
| ISTM (UNet-Diff) | 42 | High, matches HiRes | 3 min (A100) |
| HiRes (SHTM) | 43.5 | Highest | ~66 min (2240 CPU) |
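The threat score referred to above is the standard categorical measure hits / (hits + misses + false alarms) for exceedance of a threshold. The following minimal sketch computes it for two made-up 3 × 4 reflectivity fields, one overly smooth and one that retains sharp maxima. The arrays are illustrative only and are not ISTM or SHTM output.

```python
def threat_score(forecast, observed, threshold):
    """TS = hits / (hits + misses + false alarms) for values >= threshold."""
    hits = misses = false_alarms = 0
    for f_row, o_row in zip(forecast, observed):
        for f, o in zip(f_row, o_row):
            f_yes, o_yes = f >= threshold, o >= threshold
            if f_yes and o_yes:
                hits += 1
            elif o_yes:
                misses += 1
            elif f_yes:
                false_alarms += 1
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")


# Toy reflectivity fields in dBZ (made-up values).
observed = [[12, 28, 35, 41],
            [18, 31, 44, 38],
            [10, 15, 29, 33]]
smooth = [[14, 24, 27, 29],      # damped maxima, as from a regression to the mean
          [17, 26, 31, 28],
          [11, 16, 22, 25]]
sharp = [[13, 27, 36, 39],       # retains intense cores, one false alarm
         [19, 33, 42, 35],
         [12, 18, 31, 30]]

ts_smooth = threat_score(smooth, observed, threshold=30)   # 1 hit, 5 misses
ts_sharp = threat_score(sharp, observed, threshold=30)     # 6 hits, 1 false alarm
```

A forecast that smooths out intense cores is penalized through misses, which is the failure mode attributed to deterministic regression at high reflectivity thresholds.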
4. Operational Integration: AI–Physics Emulation
ISTM is directly integrated as an AI surrogate for high-resolution, physics–ML hybrid forecasting (SHTM or FuXi–SHTM):
- Plug-in Emulator: ISTM maps AIFS (or FuXi) large-scale forecasts to SHTM-like high-resolution fields in minutes, obviating the need to re-run the WRF-based SHTM with spectral nudging for each forecast cycle.
- Preserved Track Accuracy: Track information is inherited from the large-scale AIWP model (AIFS or FuXi), while intensity and mesoscale structure are restored to the physical realism of SHTM.
- Operational Speed: ISTM is more than 20× faster than direct SHTM simulation for a 120 h lead time: about 3 min on one A100 GPU (50 denoising steps) versus 66 min on a 2240-core CPU cluster (Niu et al., 23 Aug 2025).
- Forecast Quality: The ISTM ensemble mean matches or exceeds SHTM in track, intensity, and convective precipitation skill, with a statistically meaningful reduction in median intensity error.
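The quoted speed-up follows directly from the two wall-clock figures. The short calculation below also estimates how many ISTM runs fit into the time of one SHTM run, under the assumption (not stated in the source) that the 3 min figure applies to a single 120 h forecast member on a single GPU.

```python
SHTM_MINUTES = 66.0   # 120 h forecast on a 2240-core CPU cluster
ISTM_MINUTES = 3.0    # 120 h forecast on one A100 with 50 denoising steps

speedup = SHTM_MINUTES / ISTM_MINUTES

# Assumption: 3 min is the cost of one member, run sequentially on one GPU.
members_per_shtm_run = int(SHTM_MINUTES // ISTM_MINUTES)

print(f"speed-up: {speedup:.0f}x, members per SHTM run: {members_per_shtm_run}")
```

This comparison is in wall-clock time only; the two systems run on very different hardware, so it says nothing about relative energy or core-hour cost.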
5. Relationship to ML–Physical Fusion and CNOP Assimilation Paradigms
ISTM extends the ML–physical hybridization strategies developed in FuXi-SHTM and subsequent studies:
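- Dual Physics–Data-Driven Framework: FuXi generates large-scale fields; SHTM is nudged toward them at synoptic scales while retaining explicit simulation of mesoscale phenomena. Spectral nudging is parameterized as
$$\frac{\partial \psi}{\partial t} = \mathcal{M}(\psi) + G\,\mathcal{F}\left(\psi_{\mathrm{LS}} - \psi\right),$$
where $\psi$ is a nudged model variable, $\mathcal{M}(\psi)$ its unforced model tendency, $\psi_{\mathrm{LS}}$ the corresponding large-scale driving field, $G$ is a relaxation parameter, and $\mathcal{F}$ denotes a large-scale spectral filter (Niu, 1 Mar 2025).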
- CNOP-Guided Targeted Data Assimilation: Conditional Nonlinear Optimal Perturbation (CNOP) identifies sensitive regions for dense satellite assimilation, further improving track and intensity predictions.
- Evaluation Results: For Yagi (2024), the 72 h track error drops from ≈130 km (SHTM) to ≈90 km (FuXi-SHTM), and the 72 h intensity error from ≈10 m/s to ≈8.5 m/s. For Krathon (2024), the 66 h track error drops from ≈150 km (SHTM) to ≈105 km (FuXi-SHTM) (Niu, 1 Mar 2025).
- ISTM’s Role: As an emulator, ISTM provides a unified ML–physics surrogate, enabling efficient, ensemble-based, physically informed real-time typhoon forecasting that adapts to evolving advances in both AIWP and physics-based NWP domains.
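The spectral nudging idea described above, relaxing only the large scales of the regional model toward the driving fields through a term of the form $G\,\mathcal{F}(\psi_{\mathrm{LS}} - \psi)$, can be sketched on a one-dimensional periodic toy field. This is a dependency-free illustration using a naive discrete Fourier transform; the function names, the cutoff wavenumber, and the value of `G` are hypothetical, and the code is not the WRF or SHTM implementation.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform, O(n^2); adequate for a small toy field."""
    n = len(values)
    return [sum(values[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def inverse_dft_real(coeffs):
    """Inverse transform, returning the real part of each grid-point value."""
    n = len(coeffs)
    return [(sum(coeffs[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n).real
            for m in range(n)]


def low_pass(values, max_wavenumber):
    """Spectral filter F: keep only modes with |wavenumber| <= max_wavenumber."""
    n = len(values)
    coeffs = dft(values)
    kept = [c if min(k, n - k) <= max_wavenumber else 0.0 for k, c in enumerate(coeffs)]
    return inverse_dft_real(kept)


def nudging_tendency(model, driving, relaxation, max_wavenumber):
    """G * F(driving - model): relax only the large scales toward the driving field."""
    diff = [d - m for d, m in zip(driving, model)]
    return [relaxation * value for value in low_pass(diff, max_wavenumber)]


n = 32
x = [2.0 * math.pi * m / n for m in range(n)]
driving = [math.cos(xm) for xm in x]                               # large-scale wave only
model = [0.5 * math.cos(xm) + 0.3 * math.sin(8.0 * xm) for xm in x]  # weak wave + fine detail
G = 0.1                                                            # hypothetical relaxation coefficient
tendency = nudging_tendency(model, driving, relaxation=G, max_wavenumber=3)
```

In this example the tendency equals `0.05 * cos(x)`: the wavenumber-1 amplitude is pulled toward the driving field, while the wavenumber-8 detail generated by the model is left untouched, which is the property that lets the regional model keep its own mesoscale structure.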
6. Future Prospects and Methodological Extensions
Key avenues for future ISTM development include:
- Incorporation of explicit physical constraint losses (e.g., mass/energy conservation) within the generative architecture.
- Tighter end-to-end coupling with differentiable ML–physics hybrids, potentially allowing gradients to flow between AI and NWP submodels.
- Expansion of spectral nudging to include humidity and surface fields with adaptive scale weights.
- Higher vertical and horizontal resolution in the fusion models, including boundary-layer representation.
- Advanced ensemble assimilation, such as hybrid 4DEnVar, and dynamically re-targeted observation leveraging real-time CNOP analysis.
- Real-time operational pipeline automation, reducing full forecast-assimilation turnaround to below 2 hours (Niu, 1 Mar 2025).
A plausible implication is that ISTM may serve as a blueprint for generative emulation in other regional extremes, by unifying AIWP inference speed with physically grounded ensemble fidelity.
References
- "Intelligent Shanghai Typhoon Model (ISTM): A generative probabilistic emulator for typhoon hybrid modeling" (Niu et al., 23 Aug 2025)
- "ML-Physical Fusion Models Are Accelerating the Paradigm Shift in Operational Typhoon Forecasting" (Niu, 1 Mar 2025)