Extended Deep Adaptive Input Normalization for Preprocessing Time Series Data for Neural Networks (2310.14720v2)
Abstract: Data preprocessing is a crucial part of any machine learning pipeline, and it can have a significant impact on both performance and training efficiency. This is especially evident when using deep neural networks for time series prediction and classification: real-world time series data often exhibit irregularities such as multi-modality, skewness, and outliers, and model performance can degrade rapidly if these characteristics are not adequately addressed. In this work, we propose the EDAIN (Extended Deep Adaptive Input Normalization) layer, a novel adaptive neural layer that learns how to appropriately normalize irregular time series data for a given task in an end-to-end fashion, instead of using a fixed normalization scheme. This is achieved by optimizing its unknown parameters simultaneously with the deep neural network using back-propagation. Our experiments, conducted using synthetic data, a credit default prediction dataset, and a large-scale limit order book benchmark dataset, demonstrate the superior performance of the EDAIN layer when compared to conventional normalization methods and existing adaptive time series preprocessing layers.
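To make the end-to-end idea concrete, below is a minimal PyTorch sketch of an adaptive input-normalization layer in this spirit: per-feature shift, scale, and a smooth outlier-mitigation parameter are registered as learnable parameters, so they receive gradients from the task loss and are fitted jointly with the downstream network, rather than being set by a fixed preprocessing scheme. The class name `AdaptiveInputNorm`, the tanh-based winsorization, and all hyperparameters are illustrative assumptions, not the authors' exact EDAIN architecture.

```python
import torch
import torch.nn as nn

class AdaptiveInputNorm(nn.Module):
    """Simplified adaptive normalization layer (illustrative sketch).

    Learns a per-feature shift, scale, and a smooth winsorization
    severity jointly with the downstream model via back-propagation.
    """

    def __init__(self, num_features: int):
        super().__init__()
        # Per-feature learnable shift and scale, updated by the same
        # optimizer steps that train the downstream predictor.
        self.shift = nn.Parameter(torch.zeros(num_features))
        self.scale = nn.Parameter(torch.ones(num_features))
        # Learnable winsorization severity: large beta ~ identity map,
        # small beta compresses extreme values (tanh saturates them).
        self.beta = nn.Parameter(torch.full((num_features,), 5.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, time, num_features); parameters broadcast
        # over the batch and time dimensions.
        z = (x - self.shift) / (self.scale.abs() + 1e-8)
        return self.beta * torch.tanh(z / self.beta)  # smooth clipping

# Usage: prepend the layer to any sequence model and train end-to-end.
model = nn.Sequential(
    AdaptiveInputNorm(num_features=8),
    nn.Flatten(),
    nn.LazyLinear(1),  # stand-in for a real classifier/forecaster
)
x = torch.randn(32, 20, 8)  # synthetic batch: 32 series, 20 steps, 8 features
y = torch.randint(0, 2, (32, 1)).float()
loss = nn.functional.binary_cross_entropy_with_logits(model(x), y)
loss.backward()  # gradients flow into shift, scale, and beta as well
```

The key design point the sketch illustrates is that the normalization parameters sit inside the computational graph, so the same task loss that trains the predictor also adapts the preprocessing.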