
Loss Shaping Constraints for Long-Term Time Series Forecasting (2402.09373v2)

Published 14 Feb 2024 in cs.LG and stat.ML

Abstract: Several applications in time series forecasting require predicting multiple steps ahead. Despite the vast literature on the topic, both classical and recent deep learning approaches have mostly focused on minimising performance averaged over the predicted window. We observe that this can lead to disparate distributions of errors across forecasting steps, especially for recent transformer architectures trained on popular forecasting benchmarks. That is, optimising performance on average can lead to undesirably large errors at specific time steps. In this work, we present a Constrained Learning approach for long-term time series forecasting that aims to find the best model in terms of average performance while respecting a user-defined upper bound on the loss at each time step. We call our approach loss shaping constraints because it imposes constraints on the loss at each time step, and we leverage recent duality results to show that, despite its non-convexity, the resulting problem has a bounded duality gap. We propose a practical primal-dual algorithm to tackle it, and demonstrate that the proposed approach exhibits competitive average performance on time series forecasting benchmarks while shaping the distribution of errors across the predicted window.
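The abstract describes constraining the loss at each forecast step and solving the resulting problem with a primal-dual method: the primal step minimises a Lagrangian weighted by dual variables, and the dual step increases each multiplier wherever its per-step constraint is violated. The sketch below is a minimal, hypothetical illustration of that dual update (function name, step sizes, and bound values are assumptions, not the paper's implementation):

```python
import numpy as np

def primal_dual_step(per_step_losses, lam, epsilon, eta_dual):
    """One iteration of the dual update for loss shaping constraints.

    Hypothetical sketch: minimise mean loss subject to loss_t <= epsilon_t
    for each forecast step t, with dual variables lam >= 0.
    """
    T = len(per_step_losses)
    # Effective weight on each step's loss in the Lagrangian:
    # 1/T from the average objective plus the dual variable lam_t.
    weights = 1.0 / T + lam
    lagrangian = np.sum(weights * per_step_losses) - np.sum(lam * epsilon)
    # Dual ascent: raise lam_t where loss_t exceeds its bound epsilon_t,
    # projecting back onto lam >= 0.
    lam = np.maximum(0.0, lam + eta_dual * (per_step_losses - epsilon))
    return lagrangian, lam
```

In practice the primal model parameters would be updated by gradient descent on this Lagrangian between dual steps; steps whose constraints are violated accumulate larger multipliers, which reweights training toward reducing those steps' errors.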


