
ClimateLearn: Benchmarking Machine Learning for Weather and Climate Modeling (2307.01909v1)

Published 4 Jul 2023 in cs.LG and cs.AI

Abstract: Modeling weather and climate is an essential endeavor to understand the near- and long-term impacts of climate change, as well as inform technology and policymaking for adaptation and mitigation efforts. In recent years, there has been a surging interest in applying data-driven methods based on machine learning for solving core problems such as weather forecasting and climate downscaling. Despite promising results, much of this progress has been impaired due to the lack of large-scale, open-source efforts for reproducibility, resulting in the use of inconsistent or underspecified datasets, training setups, and evaluations by both domain scientists and artificial intelligence researchers. We introduce ClimateLearn, an open-source PyTorch library that vastly simplifies the training and evaluation of machine learning models for data-driven climate science. ClimateLearn consists of holistic pipelines for dataset processing (e.g., ERA5, CMIP6, PRISM), implementation of state-of-the-art deep learning models (e.g., Transformers, ResNets), and quantitative and qualitative evaluation for standard weather and climate modeling tasks. We supplement these functionalities with extensive documentation, contribution guides, and quickstart tutorials to expand access and promote community growth. We have also performed comprehensive forecasting and downscaling experiments to showcase the capabilities and key features of our library. To our knowledge, ClimateLearn is the first large-scale, open-source effort for bridging research in weather and climate modeling with modern machine learning systems. Our library is available publicly at https://github.com/aditya-grover/climate-learn.

Authors (5)
  1. Tung Nguyen
  2. Jason Jewik
  3. Hritik Bansal
  4. Prakhar Sharma
  5. Aditya Grover
Citations (19)

Summary

This paper introduces ClimateLearn, an open-source PyTorch library designed to standardize and simplify the application of ML to weather and climate modeling tasks (Nguyen et al., 2023). The motivation stems from the lack of consistent benchmarks, open-source model implementations, and transparent evaluation protocols in the field, which hinders reproducible research and fair comparison of different ML approaches. ClimateLearn aims to bridge the gap between climate science domain expertise and ML implementation best practices.

Key Components of ClimateLearn:

The library is structured around four main components:

  1. Tasks: Defines standard problem formulations for common climate modeling challenges:
    • Weather Forecasting: Predicting future weather states (e.g., temperature, geopotential height) at a specified lead time, given current and potentially past states. Inputs and outputs are typically gridded data of shape C × H × W.
    • Downscaling: Mapping low-resolution climate model outputs or reanalysis data (C × H × W) to higher resolutions (C' × H' × W', where H' > H and W' > W). This is crucial for regional impact studies.
    • Climate Projection: Predicting long-term climate variable distributions (e.g., annual mean temperature) based on different forcing scenarios (e.g., greenhouse gas emissions). Uses data like ClimateBench.
  2. Datasets: Provides interfaces and pre-processing tools for commonly used climate datasets:
    • ERA5: A widely used global atmospheric reanalysis dataset from ECMWF. ClimateLearn handles downloading, subsetting variables, regridding to lower resolutions (e.g., 5.625°), and normalization.
    • Extreme-ERA5: A subset of ERA5 constructed by the authors to specifically evaluate model performance on extreme temperature events (heat waves, cold spells), defined using percentile thresholds on localized 7-day means.
    • CMIP6: Data from climate model simulations (e.g., MPI-ESM1.2-HR, Norwegian Earth System Model via ClimateBench). Used for forecasting benchmarks and climate projection.
    • PRISM: High-resolution observational data for temperature and precipitation over the conterminous US, used for downscaling benchmarks.
    • ClimateLearn provides utilities for fast data loading into PyTorch-compatible formats, handling common file types like NetCDF.
  3. Models: Implements both traditional baselines and state-of-the-art deep learning architectures:
    • Traditional Baselines: Climatology (historical average), Persistence (last observation), Linear Regression (for forecasting); Nearest Neighbor and Bilinear Interpolation (for downscaling).
    • Deep Learning Models: Standard implementations of architectures suitable for gridded data, including Residual Networks (ResNet), U-Nets, and Vision Transformers (ViT). The library allows easy loading of benchmark models from literature and customization.
  4. Evaluations: Offers standardized metrics and visualization tools:
    • Metrics:
      • Forecasting (Deterministic): Latitude-weighted Root Mean Square Error (RMSE), Latitude-weighted Anomaly Correlation Coefficient (ACC).
      • Forecasting (Probabilistic): Spread-skill ratio, Continuous Ranked Probability Score (CRPS).
      • Downscaling: RMSE, Mean Bias, Pearson's correlation coefficient.
      • Climate Projection: Normalized spatial RMSE (NRMSE_s), normalized global RMSE (NRMSE_g), and a combined total metric.
    • Visualizations: Tools to generate plots like per-pixel mean bias maps for forecasts, rank histograms for probabilistic forecasts, and comparisons between predictions and ground truth.
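As a concrete illustration of the forecasting metrics above, the following is a minimal NumPy sketch of latitude-weighted RMSE following its standard definition (each latitude band weighted proportionally to the cosine of its latitude, with weights normalized to mean 1). This is an independent re-implementation for illustration, not ClimateLearn's own code.

```python
import numpy as np

def lat_weighted_rmse(pred, target, lats):
    """Latitude-weighted RMSE for gridded fields.

    pred, target: arrays of shape (N, H, W) -- samples, latitude, longitude
    lats: array of shape (H,), latitudes in degrees
    """
    # Weight each latitude band by cos(lat), normalized so weights average to 1.
    w = np.cos(np.deg2rad(lats))
    w = w / w.mean()                          # shape (H,)
    sq_err = (pred - target) ** 2 * w[None, :, None]
    # RMSE per sample over the grid, then averaged over samples.
    return float(np.sqrt(sq_err.mean(axis=(1, 2))).mean())

# Example on a 5.625-degree grid (32 x 64 cells).
lats = np.linspace(-87.1875, 87.1875, 32)
x = np.random.default_rng(0).standard_normal((4, 32, 64))
print(lat_weighted_rmse(x, x, lats))          # 0.0 for a perfect forecast
```

The normalization to mean-1 weights keeps the metric on the same scale as unweighted RMSE while discounting the over-represented polar grid cells.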

Benchmark Evaluations:

The paper presents benchmark results using ClimateLearn to demonstrate its capabilities:

  • Weather Forecasting:
    • Compared ResNet, U-Net, ViT against baselines and the physics-based IFS model on ERA5 5.625° data for forecasting Z500, T850, and T2m at lead times up to 10 days.
    • Findings: DL models significantly outperform persistence and climatology but lag behind IFS. ResNet generally performed best among DL models.
    • Compared direct (separate model per lead time), continuous (single model conditioned on lead time), and iterative (rolling out short-term forecasts) strategies. Direct forecasting yielded the best overall performance, while iterative forecasting suffered from error accumulation.
    • Evaluated on Extreme-ERA5: DL models showed robust performance, sometimes even slightly better RMSE than on the normal test set for T2m, suggesting the models handle these specific extremes well.
    • Tested data robustness by training on ERA5 and testing on CMIP6, and vice-versa. Models showed reasonable cross-dataset generalization, with performance slightly degrading compared to in-distribution testing. Training on the larger CMIP6 dataset (using more historical data) improved performance on ERA5.
  • Downscaling:
    • Evaluated ResNet, U-Net, and ViT against interpolation methods for ERA5-to-ERA5 (5.625° to 2.8125°) and ERA5-to-PRISM (2.8125° to 0.75°) downscaling.
    • Findings: DL models substantially outperformed interpolation methods in terms of RMSE and Pearson correlation (for ERA5-to-PRISM), although they exhibited some mean bias (tendency to overestimate). U-Net and ResNet were strong performers.
  • Climate Projection (Appendix):
    • Evaluated models on the ClimateBench dataset. U-Net and the original CNN-LSTM baseline from ClimateBench performed best across different variables and metrics.
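The direct-vs-iterative gap reported above comes down to error accumulation: each rollout step feeds the model its own imperfect output. A minimal toy sketch makes the effect visible, using an assumed slightly mis-estimated one-step linear model rather than any real weather dynamics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: true one-step dynamics A (orthogonal, so states neither blow
# up nor decay) and a slightly mis-estimated "learned" model A_hat.
n = 8
A = np.linalg.qr(rng.standard_normal((n, n)))[0]
A_hat = A + 0.02 * rng.standard_normal((n, n))

def rollout_error(t):
    # Frobenius-norm discrepancy between t iterated model steps
    # and the true t-step map: errors compound with each step.
    return np.linalg.norm(np.linalg.matrix_power(A_hat, t)
                          - np.linalg.matrix_power(A, t))

errs = [rollout_error(t) for t in (1, 5, 10)]
print(errs)  # discrepancy grows with lead time
```

A direct model trained for each lead time avoids this compounding, at the cost of training one model per horizon.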

Practical Implications and Usage:

ClimateLearn provides a practical toolkit for researchers and practitioners:

  • Standardization: Enables consistent comparison of models and techniques.
  • Accessibility: Lowers the barrier to entry for both climate scientists wanting to use ML and ML researchers entering climate science.
  • Reproducibility: Provides open-source code, pre-processing pipelines, and defined evaluation setups.
  • Extensibility: Designed to be modular, allowing users to easily add new datasets, models, tasks, or metrics.
  • Code Examples: The paper appendix includes code snippets demonstrating how to perform tasks like data downloading, setting up data loaders, defining and training models, and generating visualizations within the ClimateLearn framework.
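To give a flavor of the kind of component a user might plug in, here are illustrative standalone versions of the downscaling metrics listed earlier (RMSE, mean bias, Pearson correlation). These are simple NumPy sketches; ClimateLearn's actual metric interface is not reproduced here.

```python
import numpy as np

def rmse(pred, target):
    # Root mean square error over the whole grid.
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def mean_bias(pred, target):
    # Average signed error; positive values indicate systematic overestimation.
    return float(np.mean(pred - target))

def pearson(pred, target):
    # Pearson correlation between flattened prediction and target fields.
    return float(np.corrcoef(pred.ravel(), target.ravel())[0, 1])

pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.0, 2.0], [3.0, 5.0]])
print(rmse(pred, target), mean_bias(pred, target), pearson(pred, target))
```

Reporting bias alongside RMSE matters here because, as the downscaling results above note, DL models can achieve low RMSE while still systematically overestimating.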

The library is available on GitHub and installable via pip, promoting community adoption and contribution. Future work includes adding more datasets (especially regional), implementing model ensembles for uncertainty quantification, integrating large pretrained models, and supporting physics-informed ML approaches.
