Omni-Weather: Unified Weather Systems
- Omni-Weather is a unified research program that handles diverse weather phenomena by integrating visual recognition, controllable synthesis, forecasting, and language reasoning.
- It employs multimodal data fusion and modular architectures to enhance performance in applications like autonomous driving, urban digital twins, and meteorological reporting.
- Unified training approaches coupling generation and understanding improve results while addressing challenges in realistic, all-weather scene simulation and forecasting.
Omni-Weather denotes an emerging research program aimed at handling weather comprehensively rather than as a single narrowly defined task. In current usage, the term spans at least four related ambitions: visual recognition of weather and its scene-level effects; controllable synthesis of weather in images, videos, and reconstructed 3D scenes; direct or hybrid weather forecasting across spatial and temporal scales; and multimodal reasoning systems that explain, report, or interact with meteorological data in natural language. In autonomous driving and surveillance, it refers to recognizing common weather conditions together with their impacts on visibility, lighting, ground state, glare, spray, and sensor occlusion (Ouattara et al., 15 Apr 2026). In graphics and simulation, it refers to unified engines for fog, haze, smog, rain, snow, and accumulation over consistent 3D scenes (Sang et al., 26 May 2025). In AI weather science, it points to systems that assimilate heterogeneous observations, combine diverse forecast experts, or generate long-horizon probabilistic trajectories (Sun et al., 2024, Nguyen et al., 20 Oct 2025). The explicit model titled “Omni-Weather” extends the term further by unifying radar generation and radar understanding within a single multimodal foundation model (Zhou et al., 25 Dec 2025).
1. Conceptual scope and representative formulations
Recent literature does not use Omni-Weather as a single standardized benchmark or architecture. Instead, it functions as an umbrella expression for unified weather capability. The shared idea is breadth: one system, or one tightly coupled family of systems, should handle many weather conditions, many modalities, or many downstream objectives without being confined to a single phenomenon such as rain classification, radar nowcasting, or fog rendering.
| Research strand | Representative systems | Typical objective |
|---|---|---|
| Perception | RTM/PMG, WeatherPrompt | Recognize weather and scene effects from images |
| Synthesis | Weather-Magician, geometry-grounded weather video synthesis | Render controllable weather in scenes or videos |
| Forecasting | OMG-HD, FuXi Weather, MoWE, UniExtreme, OmniCast | Predict weather fields from observations or expert models |
| Reasoning | WeatherQA, Zephyrus, WeatherSyn, Omni-Weather | Explain, query, or report weather in language |
| Sensing infrastructure | 6G ISAC weather estimation | Reuse communication systems for weather sensing |
A recurrent theme is multimodality. Some works fuse RGB images with text, some combine radar and satellite, some ingest observations from stations, radar, and satellites, and some combine numerical experts or executable tools. Another recurrent theme is modularity. RTM/PMG expose independent heads for 12 weather-related tasks (Ouattara et al., 15 Apr 2026), MoWE learns spatially varying weights over multiple forecast experts (Chakraborty et al., 10 Sep 2025), and Zephyrus wraps forecasting, climate simulation, geoquerying, and data access into an agentic environment (Varambally et al., 5 Oct 2025). This suggests that Omni-Weather is less a single architecture class than a systems-level design principle.
2. All-weather perception and scene understanding
In vision, Omni-Weather typically means robust recognition across day, night, rain, snow, fog, glare, wet ground, and related scene effects. “Heuristic Style Transfer for Real-Time, Efficient Weather Attribute Detection” formulates this as a multi-task, effectively multi-label problem on a dataset of 503,875 RGB images annotated for 12 weather-related attributes, yielding 53 weather classes across tasks (Ouattara et al., 15 Apr 2026). Its RTM and PMG families treat weather as variation in visual style, using truncated ResNet-50 or PatchGAN-style trunks, attention mechanisms, and Gram statistics. The central style descriptor is the Gram matrix of feature-channel correlations, with local Gram variants used to preserve coarse spatial structure.
On the internal 25k-image test set, mean F1 reaches 0.9898 for RTM with attention and 0.9845 for PMG; zero-shot weather-type F1 on external datasets remains above 0.77 and reaches 0.9800 on the Kaggle dataset, while PM and PMG run at 30.3 and 25.1 fps on Raspberry Pi 5, respectively (Ouattara et al., 15 Apr 2026).
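The Gram-matrix style descriptor can be illustrated with a minimal sketch. It uses the standard style-transfer definition, in which entry (i, j) is the inner product of feature channels i and j. The division by the number of positions, the grid-based "local" variant, and the function names `gram_matrix` and `local_gram_matrices` are assumptions made for illustration, not the cited paper's exact formulation.

```python
def gram_matrix(features):
    """Gram matrix of a feature map given as C rows of N flattened activations.

    Entry (i, j) is the inner product of channels i and j, divided by N.
    The 1/N normalization is an assumption of this sketch.
    """
    n_positions = len(features[0])
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) / n_positions
         for row_j in features]
        for row_i in features
    ]


def local_gram_matrices(feature_map, grid=2):
    """Compute one Gram matrix per cell of a grid x grid partition.

    feature_map is a C x H x W nested list. Rows and columns that do not
    fit evenly into the grid are ignored in this sketch.
    """
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    cell_h, cell_w = height // grid, width // grid
    grams = []
    for gy in range(grid):
        for gx in range(grid):
            # Flatten the spatial positions of this cell, channel by channel.
            cell = [
                [feature_map[c][y][x]
                 for y in range(gy * cell_h, (gy + 1) * cell_h)
                 for x in range(gx * cell_w, (gx + 1) * cell_w)]
                for c in range(channels)
            ]
            grams.append(gram_matrix(cell))
    return grams
```

A global Gram matrix discards where in the image a texture occurs; computing one matrix per grid cell keeps a coarse notion of location, which is the motivation the text gives for local variants.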
This perception line emphasizes that all-weather recognition is not exhausted by coarse labels such as sunny, rainy, or snowy. The full taxonomy in (Ouattara et al., 15 Apr 2026) includes weather type, weather intensity, visibility, sky condition, precipitation presence and intensity, ground condition, glare/reflections, light conditions, road spray, water on windshield, and snow on windshield. A plausible implication is that Omni-Weather perception is best understood as structured scene diagnosis rather than flat scene tagging.
A related but distinct strand addresses weather-invariant representation learning. “WeatherPrompt” targets drone visual geo-localization under synthetic and composite weather by generating open-set textual weather descriptions with Qwen2.5-VL and fusing text and image embeddings through a dynamic gating mechanism (Wen et al., 13 Aug 2025). Rather than relying on a closed set of weather labels, it uses descriptions such as light fog, dense fog, moderate rain, or night with light fog, then aligns same-scene images across weather conditions with image-text contrastive learning, image-text matching, localized alignment, and text-driven feature gating. On University-1652, mean Drone-to-Satellite Recall@1 reaches 77.14%, with particularly large gains under dark, fog, and snow; under real Dark+Rain+Fog videos not seen in training, Drone-to-Satellite Recall@1 rises to 44.44% and Satellite-to-Drone Recall@1 to 66.66% (Wen et al., 13 Aug 2025).
These systems correct two common misconceptions. First, all-weather perception is not simply ordinary recognition plus heavier augmentation; several papers argue that weather alters style statistics, local texture, visibility, and semantics in structured ways (Ouattara et al., 15 Apr 2026, Wen et al., 13 Aug 2025). Second, generic multimodal models remain weak at severe-weather reasoning. WeatherQA shows a substantial gap between state-of-the-art VLMs and human meteorologists on multimodal severe-weather QA based on radar, ingredients fields, and forecast text (Ma et al., 2024).
3. Weather synthesis and controllable scene generation
In synthesis, Omni-Weather refers to engines that can add diverse, controllable weather to scenes while preserving geometry, identity, and motion. “Weather-Magician” is framed as a first concrete step toward an Omni-Weather engine built on 3D Gaussian Splatting (Sang et al., 26 May 2025). It reconstructs real scenes in clear weather and then applies modular weather layers: depth-aware exponential attenuation for fog, haze, and smog; Gaussian particle fields for rain and snowfall; and normal-guided Gaussian placement for snow accumulation. The static fog model attenuates scene radiance exponentially with depth, and the system supports continuous control over intensity, color, density, particle size, opacity, speed, direction, and accumulation. On an RTX 4090 at 1200×800, it reports 83.30 fps for clear rendering, 31.27 fps for fog/haze/smog, 10.42 fps for rain/snowfall, and 58.24 fps for snow cover (Sang et al., 26 May 2025).
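Depth-aware exponential attenuation is commonly written as a blend between scene color and fog color, with transmittance decaying as exp(-density × depth). The sketch below uses that textbook (Koschmieder-style) form. Whether Weather-Magician uses exactly this expression is an assumption, and `apply_fog`, `density`, and `fog_rgb` are illustrative names.

```python
import math


def apply_fog(pixel_rgb, depth_m, density=0.05, fog_rgb=(0.8, 0.8, 0.8)):
    """Blend a pixel toward the fog color according to its depth.

    transmittance = exp(-density * depth) is the fraction of scene
    radiance that survives; the remainder is replaced by fog color.
    """
    transmittance = math.exp(-density * depth_m)
    return tuple(
        transmittance * scene + (1.0 - transmittance) * fog
        for scene, fog in zip(pixel_rgb, fog_rgb)
    )


# The same red pixel at increasing depth fades toward the grey fog color.
near = apply_fog((1.0, 0.0, 0.0), depth_m=5.0)
far = apply_fog((1.0, 0.0, 0.0), depth_m=100.0)
```

In this form, `density` and `fog_rgb` act as continuous control parameters: raising `density` thickens the fog, while changing `fog_rgb` shifts the result between fog, haze, and smog tints.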
A second synthesis line uses general-purpose video editors but supplies stronger control signals than text alone. “Semantic-Aware, Physics-Informed, Geometry-Grounded Weather Video Synthesis” factorizes synthesis into semantic appearance anchoring, physics-informed particle dynamics, and geometry-grounded projection and occlusion (Qian et al., 27 Jun 2026). Particle motion follows a physics-informed dynamics model, with turbulence derived from a divergence-free curl-noise field. Geometry from Depth Anything V3 is used to align gravity direction, project particles, and enforce depth-consistent compositing before steering Wan2.1-VACE. Human evaluation reports photo-realism 4.16 and physical realism 4.12 on a 5-point scale, above CogVideoX, WeatherEdit, LTX-Video, and VACE, and the synthesized data improves adverse-weather semantic segmentation on ACDC and MUSES by 5 to 15 mIoU points, depending on model and condition (Qian et al., 27 Jun 2026).
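The divergence-free property of curl noise can be checked with a small two-dimensional sketch. Taking the velocity as the curl of a scalar potential, (∂ψ/∂y, −∂ψ/∂x), guarantees zero divergence. The potential used here is a hypothetical smooth function standing in for a noise field, and `step_particle` with its fall velocity and gain is an illustrative explicit-Euler update, not the paper's equation of motion.

```python
import math


def potential(x, y):
    # Hypothetical smooth scalar potential standing in for a noise field.
    return math.sin(1.3 * x) * math.cos(0.7 * y) + 0.5 * math.sin(2.1 * x + 1.7 * y)


def curl_velocity(x, y, h=1e-3):
    """2D curl of the potential: (dpsi/dy, -dpsi/dx), via central differences."""
    dpsi_dx = (potential(x + h, y) - potential(x - h, y)) / (2 * h)
    dpsi_dy = (potential(x, y + h) - potential(x, y - h)) / (2 * h)
    return dpsi_dy, -dpsi_dx


def divergence(x, y, h=1e-3):
    """Numerical divergence of the curl velocity field at (x, y)."""
    du_dx = (curl_velocity(x + h, y)[0] - curl_velocity(x - h, y)[0]) / (2 * h)
    dv_dy = (curl_velocity(x, y + h)[1] - curl_velocity(x, y - h)[1]) / (2 * h)
    return du_dx + dv_dy


def step_particle(pos, fall_velocity=(0.0, -2.0), turbulence_gain=0.3, dt=0.05):
    """Advance one particle by a constant fall velocity plus scaled turbulence."""
    u, v = curl_velocity(*pos)
    return (
        pos[0] + dt * (fall_velocity[0] + turbulence_gain * u),
        pos[1] + dt * (fall_velocity[1] + turbulence_gain * v),
    )
```

A divergence-free turbulence field neither bunches particles together nor thins them out on its own, which helps synthetic rain or snow keep a plausible density as it swirls.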
The explicit “Omni-Weather” model in radar generation also belongs in this synthesis tradition, but with meteorological fields rather than RGB video. It jointly handles radar nowcasting and radar inversion from satellite IR while also producing textual understanding outputs in a single architecture (Zhou et al., 25 Dec 2025). That model demonstrates that generation and understanding can be coupled rather than treated as separate pipelines.
4. Forecasting, data assimilation, and multi-horizon modeling
In atmospheric forecasting, Omni-Weather usually means breadth across modalities, lead times, or hazard regimes. One route is direct forecasting from observations. OMG-HD is an end-to-end AI weather prediction system trained on observational inputs alone—surface stations, radar, satellite, time encodings, and topography—over CONUS at roughly 0.05° resolution (Zhao et al., 2024). Its architecture separates a Swin Transformer V2 “Assimilating Block” from an AFNO-based “Forecasting Block,” producing hourly forecasts to 12 hours. Against HRRR, it reports up to 13% RMSE improvement for 2-meter temperature, 17% for 10-meter wind speed, 48% for 2-meter specific humidity, and 32% for surface pressure, while also outperforming IFS-HRES and GFS over the same short-range horizon (Zhao et al., 2024).
A global variant of the same ambition appears in FuXi Weather, which integrates learned data assimilation from microwave radiances and GNSS-RO with a learned forecast model on a 0.25° global grid (Sun et al., 2024). It is described as the first system to achieve all-grid, all-surface, all-channel, and all-sky data assimilation and forecasting, producing 10-day forecasts in a 6-hourly cycle and extending skillful lead time beyond ECMWF HRES for several upper-air variables, especially in observation-sparse regions (Sun et al., 2024). A plausible implication is that Omni-Weather forecasting increasingly denotes end-to-end pipelines from raw observations to forecast products, rather than reanalysis-only emulation.
Another route is expert orchestration rather than monolithic replacement. MoWE, the “Mixture of Weather Experts,” learns a Vision Transformer-based gating network that combines pre-trained expert models such as Pangu, Aurora, and FCN3 through a learned weighted combination of their forecasts. Weights are spatially varying and conditioned on lead time; at a 2-day horizon MoWE achieves up to 10% lower RMSE than the best individual expert and consistently outperforms a simple average across experts (Chakraborty et al., 10 Sep 2025). This formulation treats Omni-Weather as a coordination problem over heterogeneous models.
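Spatially varying expert weighting can be sketched as a per-grid-cell convex combination of expert forecasts. In the sketch below, the gating logits are passed in directly; in MoWE they would come from the gating network for a given lead time. The softmax normalization and the names `softmax` and `combine_experts` are assumptions of this illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_experts(expert_forecasts, gate_logits):
    """Blend K expert forecasts cell by cell.

    expert_forecasts: K lists, each holding one value per grid cell.
    gate_logits: one list of K logits per grid cell (hypothetical gating
    output for a single lead time).
    """
    n_cells = len(expert_forecasts[0])
    combined = []
    for cell in range(n_cells):
        weights = softmax(gate_logits[cell])
        combined.append(
            sum(w * forecast[cell] for w, forecast in zip(weights, expert_forecasts))
        )
    return combined


# Two experts, two grid cells: equal trust in cell 0, expert 0 dominates cell 1.
blended = combine_experts(
    expert_forecasts=[[1.0, 4.0], [3.0, 8.0]],
    gate_logits=[[0.0, 0.0], [100.0, -100.0]],
)
```

With equal logits the blend reduces to the simple average that the text uses as a baseline; the learned gate can depart from it wherever one expert is locally more reliable.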
The same broadening is visible in longer-range probabilistic modeling. OmniCast uses a VAE plus a masked latent diffusion transformer to model the joint distribution of future latent states, rather than rolling out a short-step predictor autoregressively (Nguyen et al., 20 Oct 2025). It trains on full 44-day future sequences for subseasonal forecasting, masks random future tokens, and iteratively unmasks them during inference. OmniCast is competitive in medium range while being 10× to 20× faster than leading probabilistic methods, reaches state-of-the-art performance at the subseasonal-to-seasonal scale across accuracy, physics-based, and probabilistic metrics, and produces stable rollouts up to 100 years ahead (Nguyen et al., 20 Oct 2025).
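The iterative unmasking used at inference can be sketched as a loop that starts from a fully masked future sequence and reveals a share of the remaining positions at each step. Everything specific here is an assumption made for illustration: `predict_fn` is a hypothetical stand-in for the masked generative model, tokens are scalars rather than latent vectors, and the random reveal order and even per-step schedule are not taken from the paper.

```python
import math
import random

MASK = None  # Sentinel marking a future position that is still masked.


def iterative_unmask(num_tokens, predict_fn, num_steps=4, seed=0):
    """Fill a fully masked sequence over a fixed number of reveal steps.

    predict_fn(tokens) must return one proposal per position, conditioned
    on whatever has already been revealed.
    """
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    for step in range(num_steps):
        masked = [i for i, token in enumerate(tokens) if token is MASK]
        if not masked:
            break
        # Spread the remaining masked positions evenly over the steps left,
        # so the final step always reveals everything that is still masked.
        n_reveal = math.ceil(len(masked) / (num_steps - step))
        proposals = predict_fn(tokens)
        for i in rng.sample(masked, n_reveal):
            tokens[i] = proposals[i]
    return tokens
```

Because every step conditions on all tokens revealed so far, the sequence is generated jointly rather than by feeding each short-step prediction back in as the next input, which is the contrast with autoregressive rollout drawn in the text.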
A different emphasis appears in UniExtreme, which argues that extreme events are marked by spectral disparity relative to normal weather and by hierarchical, geographically blended event structure (Ni et al., 2 Aug 2025). Its Adaptive Frequency Modulation and Event Prior Augmentation modules improve both general and extreme forecasting over 18 event types, reducing normalized extreme-region MAE relative to GraphCast and narrowing the gap between average and extreme performance (Ni et al., 2 Aug 2025). This directly addresses a common misconception: strong average weather skill does not imply strong extreme-weather skill.
5. Language-grounded reasoning, reporting, and unified generation-understanding
Omni-Weather also refers to the ability to explain, query, and communicate weather in language while remaining grounded in numerical or visual meteorological data. WeatherQA formalizes this need for severe convective weather by pairing mesoscale analysis imagery, radar reflectivity, and expert text from Mesoscale Discussions, then evaluating VLMs on affected-area QA and convection-potential classification (Ma et al., 2024). The reported gap between GPT-4o and meteorologists indicates that multimodal language competence in weather remains substantially below expert reasoning.
Zephyrus extends this line by introducing an agentic framework in which an LLM writes and executes Python code against WeatherBench 2 data, a geoquerying tool, a forecasting model, and a climate simulator (Varambally et al., 5 Oct 2025). Zephyrus agents outperform text-only baselines by up to 35 percentage points in correctness on tasks ranging from lookups to forecasting, extreme-event detection, and counterfactual reasoning, but remain near text-only performance on the hardest tasks, indicating that tool access alone does not solve weather reasoning (Varambally et al., 5 Oct 2025).
WeatherSyn focuses on forecast communication. It defines the Weather Forecasting Report task and the WSInstruct dataset over 31 U.S. cities and 8 weather aspects, then instruction-tunes a Qwen3-VL-8B derivative to generate structured multi-day forecast synopses from ERA5-derived heatmaps and aspect-conditioned prompts (Zheng et al., 8 May 2026). WeatherSyn-DPO reaches BLEU-1 0.44, ROUGE-L 0.32, METEOR 0.25, and weighted claim F1 0.59, outperforming leading closed-source MLLMs, particularly on structurally complex aspects such as pressure systems, wave patterns, and wind-flow systems (Zheng et al., 8 May 2026). This shows that weather-specific multimodal tuning can surpass much larger general-purpose models on meteorological report generation.
The model explicitly titled “Omni-Weather” goes further by unifying weather generation and understanding in a single multimodal foundation model (Zhou et al., 25 Dec 2025). It uses a shared transformer backbone with modality-specific encoders and decoders to perform radar nowcasting, radar inversion, radar image understanding, and radar sequence understanding, and it supplements generation with a Chain-of-Thought dataset for causal weather reasoning. Joint training improves both generation and understanding relative to generation-only or understanding-only tuning, and CoT fine-tuning plus thinking inference increases perceptual quality and explanation quality, even when some strict pixel metrics trade off slightly (Zhou et al., 25 Dec 2025). In this narrower, explicit sense, Omni-Weather names a concrete architecture that fuses physical-field generation with natural-language interpretation.
6. Infrastructure, deployment, and operational uses
Several works ground Omni-Weather in deployment constraints rather than only benchmark performance. The weather-attribute detector of (Ouattara et al., 15 Apr 2026) is designed for embedded use; PM and PMG run at real-time or near-real-time rates on Raspberry Pi 5, and the task heads can be enabled or disabled at inference to trade coverage for throughput. Weather-Magician emphasizes real-time rendering on commodity GPU hardware and direct control interfaces suitable for urban digital twins, VR/AR, games, synthetic films, and weather-aware testing (Sang et al., 26 May 2025). Zephyrus is explicitly built around a code-execution server and tool pools, pointing toward interactive scientific workflows rather than standalone model deployment (Varambally et al., 5 Oct 2025).
A distinct infrastructural interpretation appears in “Weather Estimation for Integrated Sensing and Communication” (Palhares et al., 21 Jan 2026). There, Omni-Weather is approached through sensing coverage: dense 6G ISAC base stations are repurposed as local weather sensors. The system derives CSI, clutter-filtered CSI, and range–Doppler periodograms from 27.6 GHz OFDM transmissions, then uses CNNs to classify and regress precipitation rate and wind speed. In a multi-week proof-of-concept, it reports 99.38% and 98.99% classification accuracy for precipitation rate and wind speed, with regression errors of 1.2 mm/h and 1.5 km/h, respectively (Palhares et al., 21 Jan 2026). This suggests that Omni-Weather may also denote a pervasive sensing fabric, not only a foundation model.
Operationally, the application space is broad but consistent across papers: ADAS and autonomous driving (Ouattara et al., 15 Apr 2026, Qian et al., 27 Jun 2026), urban digital twins and scene editing (Sang et al., 26 May 2025), short-range regional forecasting from observations (Zhao et al., 2024), global data-to-forecast pipelines (Sun et al., 2024), expert-model orchestration (Chakraborty et al., 10 Sep 2025), and natural-language weather assistance or report generation (Varambally et al., 5 Oct 2025, Zheng et al., 8 May 2026). The common requirement is robust performance across weather regimes without rebuilding the entire stack for each condition.
7. Limitations, misconceptions, and open directions
A central limitation is terminological. Omni-Weather is not yet a single agreed-upon task or benchmark. In the current literature it can mean weather-robust visual perception, 3D-consistent scene synthesis, global forecasting from raw observations, expert-model orchestration, weather-aware language reasoning, or unified generation-understanding. This suggests conceptual richness, but it also means reported gains are often not directly comparable across subfields.
Another recurring limitation is incomplete observability. Visual detectors such as RTM and PMG rely on single-frame RGB and explicitly do not use temporal modeling or non-visual sensors (Ouattara et al., 15 Apr 2026). WeatherPrompt relies on synthetic weather augmentations and inherits biases from the LVLM that produces captions (Wen et al., 13 Aug 2025). WeatherQA shows that even strong VLMs remain far from expert-level severe-weather reasoning (Ma et al., 2024). WeatherSyn’s reports are grounded in ERA5-derived heatmaps rather than full operational workflows, and uncertainty communication is not yet explicit (Zheng et al., 8 May 2026).
Synthesis systems face their own constraints. Weather-Magician depends heavily on reconstruction quality and does not yet model wet surfaces, puddles, splashes, or dense multiple scattering (Sang et al., 26 May 2025). The geometry-grounded video editor depends on upstream depth and gravity estimation, and its performance is bounded by the prior of the off-the-shelf video model (Qian et al., 27 Jun 2026). These limitations correct another misconception: realistic all-weather generation is not obtained by text prompting alone, because text prompts are underspecified and generic video editors often suppress heavy weather phenomena (Qian et al., 27 Jun 2026).
Forecasting systems reveal a different set of trade-offs. Observation-driven models such as OMG-HD achieve strong short-range regional skill but remain geographically limited and surface-focused (Zhao et al., 2024). FuXi Weather demonstrates end-to-end all-sky forecasting, yet still uses a restricted observing system compared with full operational NWP (Sun et al., 2024). MoWE shows that learned expert fusion can exceed any individual expert at 2 days, but robustness to missing experts or strong distribution shift remains open (Chakraborty et al., 10 Sep 2025). OmniCast reduces autoregressive error accumulation and scales to subseasonal horizons, but its VAE bottleneck and lack of explicit physical constraints remain acknowledged limits (Nguyen et al., 20 Oct 2025). UniExtreme shows that extreme-weather capability requires dedicated modeling choices rather than being a free byproduct of good average forecast skill (Ni et al., 2 Aug 2025).
The most plausible research direction is therefore not a single universal model trained once for every weather task, but a layered Omni-Weather stack: unified sensing, modular perception, controllable synthesis, multi-horizon forecasting, and language-grounded explanation, with shared representations wherever transfer is beneficial. The literature already contains partial realizations of each layer, and the strongest recent claim is that generation and understanding can improve one another when trained together in the weather domain (Zhou et al., 25 Dec 2025).