TILDE-Q: A Transformation Invariant Loss Function for Time-Series Forecasting (2210.15050v2)

Published 26 Oct 2022 in cs.LG

Abstract: Time-series forecasting has gained increasing attention in the field of artificial intelligence due to its potential to address real-world problems across various domains, including energy, weather, traffic, and economics. While time-series forecasting is a well-researched field, predicting complex temporal patterns such as sudden changes in sequential data still poses a challenge for current models. This difficulty stems from minimizing Lp-norm distances as loss functions, such as mean absolute error (MAE) or mean squared error (MSE), which are poorly suited to both modeling intricate temporal dynamics and capturing signal shape. Furthermore, these functions often cause models to behave aberrantly and to generate results uncorrelated with the original time-series. Consequently, developing a shape-aware loss function that goes beyond mere point-wise comparison is essential. In this paper, we examine the definitions of shape and distortion, which are crucial for shape-awareness in time-series forecasting, and provide a design rationale for a shape-aware loss function. Based on this rationale, we propose a novel, compact loss function called TILDE-Q (Transformation Invariant Loss function with Distance EQuilibrium) that considers not only amplitude and phase distortions but also allows models to capture the shape of time-series sequences. Furthermore, TILDE-Q supports the simultaneous modeling of periodic and nonperiodic temporal dynamics. We evaluate the efficacy of TILDE-Q by conducting extensive experiments under both periodic and nonperiodic conditions with models ranging from naive baselines to the state of the art. The experimental results show that models trained with TILDE-Q surpass those trained with other metrics, such as MSE and DILATE, in various real-world applications, including electricity, traffic, illness, economics, weather, and electricity transformer temperature (ETT).
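The contrast the abstract draws between point-wise Lp losses and shape-aware training is easy to see in code. Below is a minimal PyTorch sketch, assuming batched sequences of shape (batch, horizon): `pointwise_mse` penalizes each time step independently, while `shape_aware_loss` combines two transformation-tolerant terms, one invariant to constant amplitude shifts and one comparing Fourier magnitudes as a rough proxy for phase. The function names, the mean-centering and spectral-magnitude terms, and the `alpha` weight are illustrative assumptions for exposition, not TILDE-Q's actual formulation, which the paper defines precisely.

```python
import torch
import torch.nn.functional as F

def pointwise_mse(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Standard Lp-norm loss: each time step is compared independently,
    # so the overall shape of the sequence is ignored.
    return F.mse_loss(pred, target)

def shape_aware_loss(pred: torch.Tensor, target: torch.Tensor,
                     alpha: float = 0.5) -> torch.Tensor:
    """Illustrative sketch of a transformation-invariant loss in the
    spirit of TILDE-Q; NOT the paper's exact formulation.

    pred, target: tensors of shape (batch, horizon).
    alpha: hypothetical weight balancing the two terms.
    """
    # Amplitude-shift tolerance: compare mean-centered sequences, so a
    # constant vertical offset in the prediction is not penalized.
    amp = F.mse_loss(pred - pred.mean(dim=-1, keepdim=True),
                     target - target.mean(dim=-1, keepdim=True))
    # Phase awareness: compare spectral magnitudes, so the dominant
    # periodic structure must match even if peaks are slightly offset.
    phase = F.mse_loss(torch.abs(torch.fft.rfft(pred, dim=-1)),
                       torch.abs(torch.fft.rfft(target, dim=-1)))
    return alpha * amp + (1.0 - alpha) * phase
```

Under this sketch, a prediction that reproduces the target's waveform but sits a constant offset above it incurs little penalty from the amplitude term, whereas point-wise MSE would penalize every step, which is exactly the failure mode the abstract attributes to Lp-norm training.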

References (52)
  1. Learning a warping distance from unlabeled time series using sequence autoencoders. In Advances in Neural Information Processing Systems, volume 31, pp.  10568–10578, 2018.
  2. Representation of process trends—IV. Induction of real-time patterns from operating data for diagnosis and supervisory control. Computers & Chemical Engineering, 18(4):303–332, 1994.
  3. CID: an efficient complexity-invariant distance for time series. Data Mining and Knowledge Discovery, 28(3):634–669, 2014. doi: 10.1007/s10618-013-0312-3.
  4. On adaptive control processes. IRE Transactions on Automatic Control, 4(2):1–9, 1959.
  5. Berkhin, P. A survey of clustering data mining techniques. In Grouping Multidimensional Data - Recent Advances in Clustering, pp.  25–71. Springer, 2006.
  6. Using dynamic time warping to find patterns in time series. In Proceedings of the International Conference on Knowledge Discovery and Data Mining, AAAIWS’94, pp.  359–370. AAAI Press, 1994.
  7. Estimating counterfactual treatment outcomes over time through adversarially balanced representations. In International Conference on Learning Representations, 2020.
  8. Time series analysis: forecasting and control. John Wiley, 2015.
  9. Fast and accurate deep network learning by exponential linear units (ELUs). In Proceedings of the International Conference on Learning Representations, 2016.
  10. Soft-DTW: A differentiable loss function for time-series. In Proceedings of the 34th International Conference on Machine Learning, ICML’17, pp.  894–903, 2017.
  11. Finding similar time series. In Principles of Data Mining and Knowledge Discovery, pp.  88–100, 1997.
  12. The UCR time series archive. IEEE/CAA Journal of Automatica Sinica, 6(6):1293–1305, 2019.
  13. Querying and mining of time series data: Experimental comparison of representations and distance measures. Proceedings of the VLDB Endowment, 1(2):1542–1552, 2008.
  14. Time-series data mining. ACM Computing Surveys, 45(1), 2012.
  15. Dynamic state warping. CoRR, abs/1703.01141, 2017.
  16. Learning to remember rare events. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings, 2017.
  17. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  18. Keogh, E. J. Efficiently finding arbitrarily scaled patterns in massive time series databases. In Knowledge Discovery in Databases: PKDD 2003, volume 2838 of Lecture Notes in Computer Science, pp.  253–265, 2003.
  19. Exact indexing of dynamic time warping. Knowledge and Information Systems, 7(3):358–386, 2005.
  20. Clustering of time series subsequences is meaningless: Implications for previous and future research. In Proceedings of the IEEE International Conference on Data Mining, pp.  115–122. IEEE Computer Society, 2003.
  21. Indexing large human-motion databases. In Proceedings of the International Conference on Very Large Data Bases, pp.  780–791, 2004.
  22. Techniques for clustering gene expression data. Computers in Biology and Medicine, 38(3):283–293, 2008.
  23. Shape and time distortion loss for training deep time series forecasting models. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
  24. A visual analytics system for exploring, monitoring, and forecasting road traffic congestion. IEEE Transactions on Visualization and Computer Graphics, 26(11):3133–3146, 2020. doi: 10.1109/TVCG.2019.2922597.
  25. Learning to remember patterns: Pattern matching memory networks for traffic forecasting. In International Conference on Learning Representations, 2022.
  26. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. In Proceedings of the International Conference on Learning Representations. OpenReview.net, 2018.
  27. Non-stationary transformers: Exploring the stationarity in time series forecasting. In Advances in Neural Information Processing Systems, 2022.
  28. iTransformer: Inverted transformers are effective for time series forecasting. arXiv preprint arXiv:2310.06625, 2023.
  29. Mem2Seq: Effectively incorporating knowledge bases into end-to-end task-oriented dialog systems. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, pp.  1468–1478, 2018.
  30. FUNNEL: automatic mining of spatially coevolving epidemics. In The ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.  105–114. ACM, 2014.
  31. Differentiable dynamic programming for structured prediction and attention. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp.  3462–3471, 2018.
  32. Understanding and interpreting dominant frequency analysis of af electrograms. Journal of Cardiovascular Electrophysiology, 18(6):680–685, 2007.
  33. A time series is worth 64 words: Long-term forecasting with transformers. In International Conference on Learning Representations, 2023.
  34. k-Shape: Efficient and accurate clustering of time series. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD ’15, pp.  1855–1870, 2015. doi: 10.1145/2723372.2737793.
  35. Forecasting time series subject to multiple structural breaks. Cambridge Working Papers in Economics 0433, Faculty of Economics, University of Cambridge, 2004.
  36. Efficient time series classification under template matching using time warping alignment. In Proceedings of the International Conference on Computer Sciences and Convergence Information Technology, pp.  685–690, 2009.
  37. Seq-U-Net: A one-dimensional causal U-Net for efficient sequence modelling. In Bessiere, C. (ed.), Proceedings of the International Joint Conference on Artificial Intelligence, pp.  2893–2900. ijcai.org, 2020.
  38. End-to-end memory networks. In Advances in Neural Information Processing Systems, volume 28, 2015.
  39. Attention is all you need. In Guyon, I., von Luxburg, U., Bengio, S., Wallach, H. M., Fergus, R., Vishwanathan, S. V. N., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, pp.  5998–6008, 2017.
  40. Indexing multi-dimensional time-series with support for multiple distance measures. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’03, pp.  216–225, 2003.
  41. On periodicity detection and structural periodic similarity. In Proceedings of the SIAM International Conference on Data Mining, pp.  449–460, 2005.
  42. Warren Liao, T. Clustering of time series data—a survey. Pattern Recognition, 38(11):1857–1874, 2005.
  43. Time Series Prediction: Forecasting the Future and Understanding the Past. Addison-Wesley, 1994. ISBN 0-201-62601-2.
  44. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. In Advances in Neural Information Processing Systems, volume 34, pp.  22419–22430, 2021.
  45. TimesNet: Temporal 2D-variation modeling for general time series analysis. In International Conference on Learning Representations, 2023.
  46. Fast time series classification using numerosity reduction. In Proceedings of the International Conference on Machine Learning, ICML ’06, pp.  1033–1040. Association for Computing Machinery, 2006.
  47. Dilated residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.  636–644. IEEE Computer Society, 2017.
  48. Are transformers effective for time series forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, 2023.
  49. Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting. In The Eleventh International Conference on Learning Representations, 2023.
  50. Informer: Beyond efficient transformer for long sequence time-series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12):11106–11115, 2021.
  51. FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting. In Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, pp.  27268–27286, 2022.
  52. StatStream: Statistical monitoring of thousands of data streams in real time. In Proceedings of the International Conference on Very Large Databases, pp.  358–369. Morgan Kaufmann, 2002.
