STC-ViT: Spatio Temporal Continuous Vision Transformer for Weather Forecasting (2402.17966v3)
Abstract: Operational weather forecasting systems rely on computationally expensive physics-based models. Recently, transformer-based models have shown remarkable potential in weather forecasting, achieving state-of-the-art results. However, transformers are discrete and physics-agnostic models, which limits their ability to learn the continuous spatio-temporal features of the dynamical weather system. We address this issue with STC-ViT, a Spatio-Temporal Continuous Vision Transformer for weather forecasting. STC-ViT incorporates continuous-time Neural ODE layers with a multi-head attention mechanism to learn the continuous evolution of weather over time. The attention mechanism is encoded as a differentiable function in the transformer architecture to model the complex weather dynamics. Further, we define a customised physics-informed loss for STC-ViT, which penalizes predictions that deviate from physical laws. We evaluate STC-ViT against an operational Numerical Weather Prediction (NWP) model and several deep-learning-based weather forecasting models. STC-ViT, trained on 1.5-degree 6-hourly data, demonstrates computational efficiency and competitive performance in global forecasting compared to state-of-the-art data-driven models trained on higher-resolution data.
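The two ideas the abstract names can be sketched together: attention used as the right-hand side of an ODE that is integrated over time, and a loss with a soft physics penalty. The sketch below is a minimal NumPy illustration, not the paper's implementation: a fixed-step Euler integrator stands in for the Neural ODE solver, single-head attention stands in for the multi-head mechanism, and the global-mean conservation term is a hypothetical placeholder for the paper's physics-informed loss.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(z, Wq, Wk, Wv):
    # Single-head self-attention, used here as the ODE dynamics dz/dt = f(z).
    q, k, v = z @ Wq, z @ Wk, z @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

def ode_attention_step(z, params, t0=0.0, t1=1.0, steps=4):
    # Fixed-step Euler integration of dz/dt = Attention(z) from t0 to t1.
    # A Neural ODE layer would use an adaptive solver (e.g. torchdiffeq.odeint)
    # and backpropagate through it; Euler keeps the sketch dependency-free.
    dt = (t1 - t0) / steps
    for _ in range(steps):
        z = z + dt * attention(z, *params)
    return z

def physics_informed_loss(pred, target, lam=0.1):
    # MSE plus a hypothetical soft conservation penalty: deviations of the
    # predicted global mean from the target's penalize non-physical drift.
    mse = np.mean((pred - target) ** 2)
    conservation = (pred.mean() - target.mean()) ** 2
    return mse + lam * conservation
```

In this reading, `ode_attention_step` is the continuous analogue of a stacked transformer block: depth becomes integration time, so the hidden state evolves smoothly rather than in discrete layer jumps.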
Authors: Hira Saleem, Flora Salim, Cormac Purcell