Data-driven medium-range weather prediction with a Resnet pretrained on climate simulations: A new model for WeatherBench (2008.08626v2)

Published 19 Aug 2020 in physics.ao-ph

Abstract: Numerical weather prediction has traditionally been based on physical models of the atmosphere. Recently, however, the rise of deep learning has created increased interest in purely data-driven medium-range weather forecasting with first studies exploring the feasibility of such an approach. To accelerate progress in this area, the WeatherBench benchmark challenge was defined. Here, we train a deep residual convolutional neural network (Resnet) to predict geopotential, temperature and precipitation at 5.625 degree resolution up to 5 days ahead. To avoid overfitting and improve forecast skill, we pretrain the model using historical climate model output before fine-tuning on reanalysis data. The resulting forecasts outperform previous submissions to WeatherBench and are comparable in skill to a physical baseline at similar resolution. We also analyze how the neural network creates its predictions and find that, with some exceptions, it is compatible with physical reasoning. Finally, we perform scaling experiments to estimate the potential skill of data-driven approaches at higher resolutions.

Summary

The paper introduces a Resnet model for data-driven medium-range weather prediction, leveraging pretraining on climate simulation data and achieving state-of-the-art results on the WeatherBench benchmark.
The model achieves performance comparable to physical models for upper-level variables like Z500 and T850, although precipitation forecasting remains challenging.
Interpretability analysis shows that the model mostly focuses on physically plausible regions, which supports the use of deep learning to represent atmospheric dynamics and points toward future integration of AI and physical models.

Data-Driven Medium-Range Weather Prediction with a Resnet

The paper approaches medium-range weather forecasting with a large convolutional neural network (CNN) that is pretrained on climate simulation data and then fine-tuned on reanalysis data. The work is a submission to the WeatherBench benchmark challenge, which evaluates purely data-driven forecasts of key atmospheric variables at lead times of up to 5 days.

Model Construction and Training Insights

The authors use a Resnet, a fully convolutional neural network composed of 19 residual blocks. Each block stacks convolution, activation, normalization, and dropout layers and adds a skip connection around them, which lets a deep network capture complex spatial dependencies in atmospheric data while remaining trainable.
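To make the block structure concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it works on a one-dimensional ring of grid points instead of a 2D field with many channels, and the kernel values, the LeakyReLU slope of 0.3, the wrap-around (periodic) padding, and the simple per-sample normalization are all illustrative assumptions. Dropout is left out because it acts as the identity at inference time.

```python
import math

def conv1d_periodic(x, kernel):
    """1D convolution with wrap-around padding (assumed: longitude is periodic)."""
    n, half = len(x), len(kernel) // 2
    return [
        sum(kernel[j] * x[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def leaky_relu(x, slope=0.3):
    """Leaky ReLU; the slope value is a hypothetical choice."""
    return [v if v > 0 else slope * v for v in x]

def normalize(x, eps=1e-5):
    """Zero-mean, unit-variance scaling; a stand-in for a normalization layer."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def conv_block(x, kernel):
    """Convolution -> activation -> normalization (dropout omitted at inference)."""
    return normalize(leaky_relu(conv1d_periodic(x, kernel)))

def residual_block(x, kernel_a, kernel_b):
    """Two stacked conv blocks plus the skip connection: output = x + F(x)."""
    y = conv_block(conv_block(x, kernel_a), kernel_b)
    return [xi + yi for xi, yi in zip(x, y)]

if __name__ == "__main__":
    field = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0]
    print(residual_block(field, [0.25, 0.5, 0.25], [-1.0, 0.0, 1.0]))
```

The skip connection means each block only has to learn a correction to its input, which is the property that makes stacks of many such blocks practical to train.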

A significant methodological contribution of the paper is the use of historical climate simulation data for pretraining. The climate data, taken from the CMIP6 archive (MPI-ESM-HR model), lets the network learn general atmospheric patterns before it is fine-tuned on ERA5 reanalysis data. This two-step training protocol reduces overfitting, and the benefit is most visible at longer lead times, where the chaotic growth of forecast errors makes the task harder.

The paper also compares two training setups, direct and continuous. In the direct setup, a separate model is trained for each forecast lead time. In the continuous setup, a single model is trained for all lead times, and the lead time is supplied as part of the input. According to the summary, the continuous approach mitigates overfitting here because it uses the data more efficiently and imposes temporal smoothness across lead times.
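The difference between the direct and continuous setups can be sketched in terms of how training samples are built from a time series of atmospheric states. This is a simplified illustration, not the paper's data pipeline: the function names are hypothetical, states are stand-in integers, and lead times are counted in time steps.

```python
def direct_samples(series, lead):
    """(input, target) pairs for one fixed lead time; one model per lead time."""
    return [(series[t], series[t + lead]) for t in range(len(series) - lead)]

def continuous_samples(series, leads):
    """((input, lead), target) samples; one model sees every lead time.

    The lead time travels with the input (in a CNN it could be, for example,
    an extra constant input channel; that detail is an assumption here).
    """
    return [
        ((series[t], lead), series[t + lead])
        for lead in leads
        for t in range(len(series) - lead)
    ]

if __name__ == "__main__":
    states = list(range(10))  # stand-in for 10 consecutive atmospheric states
    print(len(direct_samples(states, 3)))              # samples for the 3-step model
    print(len(continuous_samples(states, [1, 3, 5])))  # samples for the single model
```

In this toy setting each direct model trains only on the pairs for its own lead time, while the single continuous model trains on the pairs for all lead times together, which is one way to read the data-efficiency argument.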

Performance Metrics and Comparative Analysis

The model's forecasts of geopotential, temperature, and precipitation improve substantially on previous submissions to the WeatherBench challenge, setting a new state of the art among data-driven methods. Forecast skill is quantified with the area-weighted root-mean-square error (RMSE) and the anomaly correlation coefficient (ACC), evaluated for upper-level and surface variables.

RMSE outcomes: The pretrained Resnet performs comparably to a physical model at T63 resolution for Z500 (geopotential at 500 hPa) and T850 (temperature at 850 hPa), indicating that it can predict the large-scale atmospheric state at these pressure levels.
Precipitation forecasts: Unlike the upper-level variables, precipitation remains difficult to predict. Errors against reanalysis data persist, which the summary attributes to the non-linear and chaotic behavior of precipitation systems.
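The area-weighted RMSE and ACC mentioned above can be sketched as follows for a single forecast on a regular latitude-longitude grid. The weighting by the normalized cosine of latitude follows the common convention for such grids; the function names and the nested-list data layout are illustrative assumptions, and the ACC shown here uses plain anomalies relative to a supplied climatology without any further mean removal.

```python
import math

def lat_weights(lats_deg):
    """cos(latitude) weights, normalized so that they average to 1."""
    cosines = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_cos = sum(cosines) / len(cosines)
    return [c / mean_cos for c in cosines]

def weighted_rmse(forecast, truth, lats_deg):
    """Latitude-weighted RMSE; forecast and truth are indexed [lat][lon]."""
    w = lat_weights(lats_deg)
    n_lon = len(forecast[0])
    total = sum(
        w[j] * (forecast[j][k] - truth[j][k]) ** 2
        for j in range(len(lats_deg))
        for k in range(n_lon)
    )
    return math.sqrt(total / (len(lats_deg) * n_lon))

def weighted_acc(forecast, truth, climatology, lats_deg):
    """Latitude-weighted correlation of forecast and true anomalies."""
    w = lat_weights(lats_deg)
    num = den_f = den_t = 0.0
    for j in range(len(lats_deg)):
        for k in range(len(forecast[j])):
            fa = forecast[j][k] - climatology[j][k]  # forecast anomaly
            ta = truth[j][k] - climatology[j][k]     # true anomaly
            num += w[j] * fa * ta
            den_f += w[j] * fa ** 2
            den_t += w[j] * ta ** 2
    return num / math.sqrt(den_f * den_t)

if __name__ == "__main__":
    lats = [-45.0, 0.0, 45.0]
    truth = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    forecast = [[v + 2.0 for v in row] for row in truth]
    clim = [[0.0, 0.0] for _ in lats]
    print(weighted_rmse(forecast, truth, lats))
    print(weighted_acc(forecast, truth, clim, lats))
```

The weighting matters because grid cells on a regular latitude-longitude grid shrink toward the poles; without it, polar errors would be overcounted relative to the area they represent.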

The authors also run sensitivity experiments that vary the data resolution and the network size. The results indicate that finer-resolution data improves forecast skill, and the authors use these scaling experiments to estimate the potential skill of data-driven approaches at higher resolutions.

Interpretability and Implications

The paper also examines how the network arrives at its predictions, using saliency maps. These show that the network tends to focus on physically plausible regions, with patterns consistent with known atmospheric dynamics such as Rossby wave propagation. There are exceptions: in some cases the prediction depends on geopotential in regions too distant to be physically relevant at a 3-day lead time, which hints at overfitting or spurious correlations learned by the network.
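A saliency map is commonly defined as the magnitude of the gradient of one output value with respect to every input value. The sketch below illustrates that idea with central finite differences on a hypothetical toy model; it is not the paper's procedure, and a real network would use the automatic differentiation of its deep learning framework instead. The names `saliency` and `smooth` are invented for this example.

```python
def saliency(model, x, out_index, eps=1e-4):
    """|d model(x)[out_index] / d x[i]| for every input i, by central differences."""
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        diff = model(up)[out_index] - model(down)[out_index]
        grads.append(abs(diff) / (2 * eps))
    return grads

def smooth(x):
    """Toy 'forecast model': periodic three-point average of the input ring."""
    n = len(x)
    return [(x[i - 1] + x[i] + x[(i + 1) % n]) / 3 for i in range(n)]

if __name__ == "__main__":
    field = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    # Only the target point and its two neighbours should show up as salient.
    print(saliency(smooth, field, out_index=0))
```

For the toy model the salient inputs are exactly the neighbours that the output is computed from. Applied to a trained network, sensitivity to far-away inputs is the kind of signal that would flag the physically implausible dependencies described above.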

Theoretical and Practical Implications

The results have implications for both research and operational forecasting. On the theoretical side, the paper supports the view that deep learning models can represent complex atmospheric dynamics, with caveats about data sufficiency and model robustness. On the practical side, it suggests that pretraining on climate model data is a promising way to overcome the data limitations commonly faced in high-resolution forecasting tasks.

Future Directions in AI-enhanced Meteorology

Looking ahead, AI may come to augment, and perhaps rival, traditional numerical weather prediction (NWP), particularly through closer integration with physical models. Such integration could address current limitations such as the limited availability of training data and the handling of forecast uncertainty. Combining model-free AI approaches with physics-informed models could improve both the reliability and the accuracy of medium- to long-range weather forecasting. Progress of this kind will depend heavily on computational capacity, the availability of high-resolution datasets, and collaboration between atmospheric science and machine learning.

GitHub

GitHub - pangeo-data/WeatherBench: A benchmark dataset for data-driven weather forecasting (767 stars)
