
ClimaX: A foundation model for weather and climate (2301.10343v5)

Published 24 Jan 2023 in cs.LG and cs.AI

Abstract: Most state-of-the-art approaches for weather and climate modeling are based on physics-informed numerical models of the atmosphere. These approaches aim to model the non-linear dynamics and complex interactions between multiple variables, which are challenging to approximate. Additionally, many such numerical models are computationally intensive, especially when modeling the atmospheric phenomenon at a fine-grained spatial and temporal resolution. Recent data-driven approaches based on machine learning instead aim to directly solve a downstream forecasting or projection task by learning a data-driven functional mapping using deep neural networks. However, these networks are trained using curated and homogeneous climate datasets for specific spatiotemporal tasks, and thus lack the generality of numerical models. We develop and demonstrate ClimaX, a flexible and generalizable deep learning model for weather and climate science that can be trained using heterogeneous datasets spanning different variables, spatio-temporal coverage, and physical groundings. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute while maintaining general utility. ClimaX is pre-trained with a self-supervised learning objective on climate datasets derived from CMIP6. The pre-trained ClimaX can then be fine-tuned to address a breadth of climate and weather tasks, including those that involve atmospheric variables and spatio-temporal scales unseen during pretraining. Compared to existing data-driven baselines, we show that this generality in ClimaX results in superior performance on benchmarks for weather forecasting and climate projections, even when pretrained at lower resolutions and compute budgets. The source code is available at https://github.com/microsoft/ClimaX.

Authors (5)
  1. Tung Nguyen (58 papers)
  2. Johannes Brandstetter (46 papers)
  3. Ashish Kapoor (64 papers)
  4. Jayesh K. Gupta (25 papers)
  5. Aditya Grover (82 papers)
Citations (207)

Summary

ClimaX: A Foundation Model for Weather and Climate

The paper introduces ClimaX, a deep learning model designed to generalize across tasks in weather and climate science. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute while maintaining general utility. The model is pre-trained with a self-supervised objective on climate datasets derived from CMIP6, and it can be trained on heterogeneous datasets spanning different variables, spatio-temporal coverage, and physical groundings.
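To make the idea of per-variable encoding followed by aggregation more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the single learned query vector, and the omission of key/value projections are simplifying assumptions made for illustration only. Each variable's grid is cut into patches and embedded with its own linear map, then the variable axis is collapsed by attention so the sequence length no longer grows with the number of input variables.

```python
import numpy as np

def tokenize_per_variable(x, patch, weights):
    """x: (V, H, W) grid per variable; weights: (V, patch*patch, D).

    Returns tokens of shape (V, N, D), with N = (H // patch) * (W // patch).
    Assumes H and W are divisible by patch (hypothetical helper).
    """
    V, H, W = x.shape
    gh, gw = H // patch, W // patch
    # Cut each variable's grid into non-overlapping patches and flatten them.
    patches = x.reshape(V, gh, patch, gw, patch).transpose(0, 1, 3, 2, 4)
    patches = patches.reshape(V, gh * gw, patch * patch)
    # Each variable gets its own linear embedding.
    return np.einsum("vnp,vpd->vnd", patches, weights)

def aggregate_variables(tokens, query):
    """tokens: (V, N, D); query: (D,). Returns (N, D).

    Simplified attention over the variable axis at each spatial position.
    Using one shared query and no key/value projections is an assumption.
    """
    D = tokens.shape[-1]
    scores = np.einsum("vnd,d->vn", tokens, query) / np.sqrt(D)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=0, keepdims=True)        # softmax over variables
    return np.einsum("vn,vnd->nd", attn, tokens)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 16))          # 3 variables on an 8 x 16 grid
weights = rng.standard_normal((3, 16, 5))    # 4 x 4 patches, embedding size 5
tokens = tokenize_per_variable(x, 4, weights)
aggregated = aggregate_variables(tokens, rng.standard_normal(5))
print(tokens.shape, aggregated.shape)
```

In this sketch, adding a new input variable only adds one more embedding matrix, while the sequence passed to the Transformer stays at N tokens.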

Key Contributions

ClimaX's architectural changes allow it to handle datasets with varying spatial resolutions and input variables. Pretraining uses a randomized forecasting objective, which gives the model a foundation that can be fine-tuned for a variety of tasks. These include conventional weather forecasting, climate model downscaling, and climate projection, including tasks that involve atmospheric variables and spatio-temporal scales unseen during pretraining.
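One way to read "randomized forecasting objective" is that each training example pairs a state with the state a randomly chosen number of steps later, and the model is conditioned on that lead time. The sketch below shows only the sampling step, under that assumption. The function name, the uniform sampling, and the `max_lead_steps` parameter are hypothetical and not taken from the paper.

```python
import numpy as np

def sample_pretraining_pair(series, rng, max_lead_steps):
    """series: (T, V, H, W) time-ordered snapshots, with T > max_lead_steps.

    Returns (t, lead, x_t, x_{t+lead}) with the lead time drawn uniformly
    from 1..max_lead_steps (uniform sampling is an assumption).
    """
    T = series.shape[0]
    lead = int(rng.integers(1, max_lead_steps + 1))  # upper bound is exclusive
    t = int(rng.integers(0, T - lead))               # ensures t + lead <= T - 1
    return t, lead, series[t], series[t + lead]

rng = np.random.default_rng(0)
series = rng.standard_normal((40, 3, 8, 16))  # synthetic stand-in for climate data
t, lead, x, y = sample_pretraining_pair(series, rng, max_lead_steps=28)
print(t, lead, x.shape, y.shape)
```

A model trained on such pairs would take `x` and `lead` as input and be penalized on its distance to `y`, so a single set of weights sees many forecast horizons.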

Empirical Validation

ClimaX is evaluated against existing data-driven baselines on established benchmarks, including WeatherBench and ClimateBench. It achieves state-of-the-art results on ClimateBench and performance competitive with the IFS (the Integrated Forecasting System, ECMWF's operational numerical weather prediction model) on WeatherBench, even when pretrained at lower resolutions and compute budgets. This suggests that ClimaX is a viable data-driven alternative to traditional numerical methods on these benchmarks, at lower computational cost.
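The summary does not say which metric these comparisons use. Forecast skill on WeatherBench is commonly reported as latitude-weighted RMSE, so the following sketch assumes that metric to show how such a score can be computed on a regular latitude-longitude grid. The function name and array layout are illustrative assumptions.

```python
import numpy as np

def lat_weighted_rmse(pred, target, lats_deg):
    """pred, target: (N, H, W) forecasts and ground truth; lats_deg: (H,).

    Grid cells are weighted by cos(latitude), normalized to mean 1, so that
    the shrinking cell area toward the poles is accounted for.
    """
    w = np.cos(np.deg2rad(lats_deg))
    w = w / w.mean()
    sq_err = (pred - target) ** 2
    # Weighted spatial mean per forecast, then square root, then mean over forecasts.
    per_sample = np.sqrt((sq_err * w[None, :, None]).mean(axis=(1, 2)))
    return float(per_sample.mean())

lats = np.linspace(-87.5, 87.5, 8)
target = np.zeros((4, 8, 16))
pred = target + 2.0            # constant error of 2 everywhere
print(lat_weighted_rmse(pred, target, lats))
```

Because the weights are normalized to mean 1, a spatially constant error gives the same value as an unweighted RMSE; the weighting only matters when errors vary with latitude.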

Implications and Future Directions

The paper suggests that ClimaX sets a precedent for integrating heterogeneous climate datasets within a single architectural framework. One possible application is improving extreme weather prediction and long-term climate impact assessment. In addition, ClimaX's scalability with data volume and model capacity makes it a candidate template for general-purpose models in the Earth system sciences.

For future research, one substantive direction is to extend ClimaX to additional datasets, such as those simulating different projected climate scenarios, and to explore multi-scale architectures. Given the practical benefits observed, refinements that increase model resolution and adaptability could broaden its range of applications further.

Conclusion

ClimaX is a notable advance in data-driven weather and climate modeling: its design and training strategy address the main limitation of task-specific models, namely their lack of generality. By drawing on heterogeneous data and making effective use of available compute, it serves as a versatile starting point for a wide range of atmospheric science tasks. These strengths suggest that ClimaX could lead to more general, scalable modeling approaches in the field, with gains in both accuracy and efficiency.
