A Scalable and Transferable Time Series Prediction Framework for Demand Forecasting (2402.19402v1)

Published 29 Feb 2024 in cs.LG and cs.AI

Abstract: Time series forecasting is one of the most essential and ubiquitous tasks in many business problems, including demand forecasting and logistics optimization. Traditional time series forecasting methods, however, have yielded small models with limited expressive power because they have difficulty scaling up in size while maintaining high accuracy. In this paper, we propose Forecasting orchestra (Forchestra), a simple but powerful framework capable of accurately predicting future demand for a diverse range of items. We empirically demonstrate that the model scales up to 0.8 billion parameters. The proposed method not only outperforms existing forecasting models by a significant margin, but also generalizes well to unseen data points when evaluated in a zero-shot fashion on downstream datasets. Finally, we present extensive qualitative and quantitative studies analyzing how the proposed model outperforms baseline models and differs from conventional approaches. The paper was presented as a full paper at ICDM 2022 and is available at https://ieeexplore.ieee.org/document/10027662.
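
The abstract does not spell out Forchestra's architecture, so the following is only a minimal sketch of the general idea suggested by the name: an "orchestra" of base forecasters whose per-item outputs are combined by a learned weighting network. All class names, layer choices, and tensor shapes below (MLPPredictor, Conductor, the softmax combination) are illustrative assumptions, not the paper's implementation or API.

```python
import torch
import torch.nn as nn


class MLPPredictor(nn.Module):
    """Hypothetical stand-in base predictor: maps a demand history to a forecast."""

    def __init__(self, input_len: int, horizon: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_len, hidden),
            nn.ReLU(),
            nn.Linear(hidden, horizon),
        )

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, input_len) -> forecast: (batch, horizon)
        return self.net(history)


class Conductor(nn.Module):
    """Hypothetical meta-learner: assigns per-item weights over the base predictors."""

    def __init__(self, input_len: int, num_predictors: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_len, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_predictors),
        )

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # One softmax weight per base predictor, conditioned on the item's history.
        return torch.softmax(self.net(history), dim=-1)  # (batch, K)


class WeightedEnsembleForecaster(nn.Module):
    """Combines K base forecasts using the conductor's per-item weights."""

    def __init__(self, predictors: nn.ModuleList, conductor: Conductor):
        super().__init__()
        self.predictors = predictors
        self.conductor = conductor

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # Stack base forecasts: (batch, K, horizon).
        forecasts = torch.stack([p(history) for p in self.predictors], dim=1)
        # Broadcast weights over the horizon: (batch, K, 1).
        weights = self.conductor(history).unsqueeze(-1)
        return (weights * forecasts).sum(dim=1)  # (batch, horizon)


if __name__ == "__main__":
    # Usage sketch: 4 base predictors, 28-step history, 7-step horizon.
    K, input_len, horizon = 4, 28, 7
    model = WeightedEnsembleForecaster(
        nn.ModuleList([MLPPredictor(input_len, horizon) for _ in range(K)]),
        Conductor(input_len, K),
    )
    demand = torch.randn(32, input_len)  # batch of 32 item histories
    print(model(demand).shape)           # torch.Size([32, 7])
```

One property of this design worth noting: because the conductor conditions its weights on each item's history, a single ensemble can specialize per item, which is consistent with the abstract's claim of accurate prediction across "a diverse range of items" and of zero-shot transfer to unseen series.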
