Balancing Computational Efficiency and Forecast Error in Machine Learning-based Time-Series Forecasting: Insights from Live Experiments on Meteorological Nowcasting (2309.15207v1)
Abstract: Machine learning for time-series forecasting remains a key area of research. Despite the successful application of many machine learning techniques, the relationship between computational efficiency and forecast error remains under-explored. This paper addresses this topic through a series of real-time experiments that quantify the relationship between computational cost and forecast error, using meteorological nowcasting as an example use-case. We employ a variety of popular regression techniques (XGBoost, FC-MLP, Transformer, and LSTM) for multi-horizon, short-term forecasting of three variables (temperature, wind speed, and cloud cover) at multiple locations. During a 5-day live experiment, 4000 data sources were streamed to train and run inference on 144 models per hour. These models were parameterized to explore forecast error for two computational cost minimization methods: a novel auto-adaptive data reduction technique (Variance Horizon) and a performance-based concept drift-detection mechanism. The forecast error of all model variations was benchmarked in real time against a state-of-the-art numerical weather prediction model. Performance was assessed using classical and novel evaluation metrics. Results indicate that using the Variance Horizon reduced computational usage by more than 50%, while increasing error by 0-15%. Meanwhile, performance-based retraining reduced computational usage by up to 90% while also improving forecast error by up to 10%. Finally, the combination of the Variance Horizon and performance-based retraining outperformed other model configurations by up to 99.7% when considering error normalized to computational usage.
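The abstract does not describe how the performance-based drift detection works, so the following is only a minimal sketch of the general idea: retrain when recent forecast error rises well above the error seen just after training, rather than on a fixed schedule. The class name `PerformanceDriftDetector`, the `window` and `tolerance` parameters, and the use of mean absolute error are all assumptions made for illustration, not the authors' method.

```python
from collections import deque
from statistics import mean


class PerformanceDriftDetector:
    """Hypothetical detector: suggest retraining when recent error
    exceeds a baseline error by a chosen ratio (not the paper's design)."""

    def __init__(self, window: int = 6, tolerance: float = 1.5):
        self.window = window          # number of recent errors to average
        self.tolerance = tolerance    # allowed ratio of recent to baseline error
        self.errors = deque(maxlen=window)
        self.baseline = None          # mean error of the first full window

    def update(self, y_true: float, y_pred: float) -> bool:
        """Record one absolute error; return True if a retrain is suggested."""
        self.errors.append(abs(y_true - y_pred))
        if len(self.errors) < self.window:
            return False              # not enough evidence yet
        recent = mean(self.errors)
        if self.baseline is None:
            self.baseline = recent    # first full window defines the baseline
            return False
        return recent > self.tolerance * self.baseline

    def reset(self) -> None:
        """Call after retraining so a new baseline is learned."""
        self.errors.clear()
        self.baseline = None


# Usage: observed values paired with forecasts that gradually degrade.
detector = PerformanceDriftDetector(window=3, tolerance=1.5)
pairs = [(10, 10.5), (11, 10.5), (12, 12.5), (13, 15), (14, 17), (15, 19)]
flags = [detector.update(obs, pred) for obs, pred in pairs]
```

In this toy run the detector stays quiet while the error holds at 0.5 and starts suggesting a retrain once the rolling mean error exceeds 1.5 times the baseline. Under such a scheme, a model whose error remains stable is never retrained, which is one plausible way retraining compute could fall sharply without hurting forecast error.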