IN-Flow: Instance Normalization Flow for Non-stationary Time Series Forecasting (2401.16777v2)

Published 30 Jan 2024 in cs.LG

Abstract: Due to the non-stationarity of time series, the distribution shift problem largely hinders the performance of time series forecasting. Existing solutions either rely on certain statistics to specify the shift or develop specific mechanisms for particular network architectures. However, the former fails for unknown shifts beyond simple statistics, while the latter has limited compatibility with different forecasting models. To overcome these problems, we first propose a decoupled formulation for time series forecasting, with no reliance on fixed statistics and no restriction on forecasting architectures. This formulation regards the shift-removal procedure as a special transformation between a raw distribution and a desired target distribution, and separates it from forecasting. The formulation is then formalized as a bi-level optimization problem, enabling joint learning of the transformation (outer loop) and the forecasting (inner loop). Moreover, the transformation's special requirements of expressiveness and bi-directionality motivate us to propose instance normalization flow (IN-Flow), a novel invertible network for time series transformation. Unlike classic "normalizing flow" models, IN-Flow does not aim to normalize the input to a prior distribution (e.g., a Gaussian) for generation; instead, it transforms the time series distribution by stacking normalization layers and flow-based invertible networks, hence the name "normalization" flow. Finally, extensive experiments on both synthetic and real-world data demonstrate the superiority of our method.
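
To make the bi-level formulation concrete, one schematic way to write it is shown below (a sketch inferred from the abstract; the symbols $T_\phi$ for the invertible transformation, $f_\theta$ for the forecasting backbone, and the loss names are assumptions, not the paper's notation):

```latex
% Outer loop: learn the transformation T_phi that removes the distribution shift.
% Inner loop: fit the forecaster f_theta on the transformed series.
\begin{aligned}
\min_{\phi}\; & \mathcal{L}_{\mathrm{outer}}\!\Big( T_{\phi}^{-1}\big( f_{\theta^{*}(\phi)}(T_{\phi}(x)) \big),\, y \Big) \\
\text{s.t.}\; & \theta^{*}(\phi) \in \arg\min_{\theta}\; \mathcal{L}_{\mathrm{inner}}\!\Big( f_{\theta}(T_{\phi}(x)),\, T_{\phi}(y) \Big),
\end{aligned}
```

where $x$ is the input window and $y$ the forecast target. The outer objective evaluates forecasts mapped back to the raw distribution, which is why the transformation must be bi-directional (invertible) as well as expressive.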
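
The abstract describes IN-Flow as "stacking normalization layers and flow-based invertible networks". Below is a minimal PyTorch sketch of what such an invertible stack could look like, combining a reversible per-instance normalization with a RealNVP-style affine coupling layer; the layer design, shapes, and class names are illustrative assumptions (the sketch also assumes the input window and forecast horizon share one length), not the authors' released implementation:

```python
import torch
import torch.nn as nn


class ReversibleInstanceNorm(nn.Module):
    """Per-instance normalization whose statistics are cached so the
    transform can be undone exactly (an assumption in the spirit of
    reversible instance normalization, not the paper's exact layer)."""

    def __init__(self, eps: float = 1e-5):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, length)
        self.mean = x.mean(dim=-1, keepdim=True)
        self.std = x.std(dim=-1, keepdim=True) + self.eps
        return (x - self.mean) / self.std

    def inverse(self, z: torch.Tensor) -> torch.Tensor:
        # De-normalize with the statistics cached during forward.
        return z * self.std + self.mean


class AffineCoupling(nn.Module):
    """RealNVP-style coupling: the first half of the window conditions an
    affine transform of the second half, so the block inverts in closed form."""

    def __init__(self, length: int, hidden: int = 64):
        super().__init__()
        self.half = length // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * (length - self.half)),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = x[..., : self.half], x[..., self.half :]
        log_s, t = self.net(x1).chunk(2, dim=-1)
        return torch.cat([x1, x2 * torch.exp(log_s) + t], dim=-1)

    def inverse(self, z: torch.Tensor) -> torch.Tensor:
        z1, z2 = z[..., : self.half], z[..., self.half :]
        log_s, t = self.net(z1).chunk(2, dim=-1)
        return torch.cat([z1, (z2 - t) * torch.exp(-log_s)], dim=-1)


class INFlowSketch(nn.Module):
    """A stack alternating normalization and coupling blocks; the inverse
    applies each block's inverse in reverse order."""

    def __init__(self, length: int, depth: int = 2):
        super().__init__()
        self.blocks = nn.ModuleList(
            m for _ in range(depth)
            for m in (ReversibleInstanceNorm(), AffineCoupling(length))
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = block(x)
        return x

    def inverse(self, z: torch.Tensor) -> torch.Tensor:
        for block in reversed(self.blocks):
            z = block.inverse(z)
        return z
```

At forecast time, the input window is pushed through `forward`, an arbitrary backbone forecasts in the transformed space, and the prediction is mapped back through `inverse` (reusing the statistics cached from the input window); this back-mapping is what lets the transformation pair with any forecasting architecture.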
