FocusLearn: Fully-Interpretable, High-Performance Modular Neural Networks for Time Series (2311.16834v4)
Abstract: Multivariate time series arise in many applications, from healthcare and meteorology to the life sciences. Although deep learning models have shown excellent predictive performance on time series, they have been criticised as "black boxes" that are not interpretable. This paper proposes a novel modular neural network model for multivariate time series prediction that is interpretable by construction. A recurrent neural network learns the temporal dependencies in the data, while an attention-based feature selection component selects the most relevant features and suppresses redundant ones in the learning of the temporal dependencies. A modular deep network is then trained on the selected features independently, showing users how features influence outcomes and making the model interpretable. Experimental results show that this approach can outperform state-of-the-art interpretable Neural Additive Models (NAM) and variations thereof in both time series regression and classification tasks, achieving predictive performance comparable to the top non-interpretable methods for time series, LSTM and XGBoost.
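The modular, additive structure the abstract describes can be illustrated with a minimal NumPy sketch (an illustrative reconstruction under assumed shapes and names, not the authors' FocusLearn implementation): softmax attention weights over features gate the outputs of independent per-feature subnetworks, so each feature's contribution to the prediction can be read off directly.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax; turns attention scores into feature weights.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def mlp_forward(x, W1, b1, W2, b2):
    # Tiny per-feature subnetwork: one hidden ReLU layer, scalar output.
    h = np.maximum(0.0, x @ W1 + b1)
    return h @ W2 + b2

n_features, hidden = 4, 8  # hypothetical sizes for illustration

# One independent subnetwork per feature -- the "modular" part.
params = [
    (rng.normal(size=(1, hidden)), np.zeros(hidden),
     rng.normal(size=(hidden, 1)), np.zeros(1))
    for _ in range(n_features)
]
# Learned attention scores; the softmax suppresses redundant features.
attn_scores = rng.normal(size=n_features)

def predict(x):
    # x: feature vector for one time step, shape (n_features,).
    w = softmax(attn_scores)
    contributions = np.array([
        mlp_forward(x[i:i + 1][None, :], *params[i]).item()
        for i in range(n_features)
    ])
    # Each weighted contribution is inspectable, so the prediction is
    # an interpretable sum of per-feature terms.
    return float(w @ contributions), w * contributions

x = rng.normal(size=n_features)
y_hat, per_feature = predict(x)
```

In a trained model the `params` and `attn_scores` would be learned; here they are random, which is enough to show that the prediction decomposes exactly into the per-feature terms.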
- R. Agarwal, L. Melnick, N. Frosst, X. Zhang, B. Lengerich, R. Caruana, and G.E. Hinton. Neural additive models: Interpretable machine learning with neural nets. Advances in Neural Information Processing Systems, 34, 2021.
- S.O. Arik and T. Pfister. TabNet: Attentive interpretable tabular learning. Proceedings of the AAAI Conference on Artificial Intelligence, 35:6679–6687, 8 2021.
- J.L. Ba, J.R. Kiros, and G.E. Hinton. Layer normalization. arXiv:1607.06450, 7 2016.
- Y. Bengio, P. Simard, and P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5:157–166, 3 1994.
- Modularity in neural computing. Proceedings of the IEEE, 87:1497–1518, 1999.
- A multiattention-based supervised feature selection method for multivariate time series. Computational Intelligence and Neuroscience, 2021:1–10, 7 2021.
- Detecting the causality influence of individual meteorological factors on local PM2.5 concentration in the Jing-Jin-Ji region. Scientific Reports, 7:40735, 1 2017.
- K. Cho, B. van Merriënboer, D. Bahdanau, and Y. Bengio. On the properties of neural machine translation: Encoder-decoder approaches. arXiv:1409.1259, 9 2014.
- Prediction of IDH genotype in gliomas with dynamic susceptibility contrast perfusion MR imaging using an explainable recurrent neural network. Neuro-Oncology, 21:1197–1209, 9 2019.
- The everyday acoustic environment and its association with human heart rate: evidence from real-world data logging with hearing aids and wearables. Royal Society Open Science, 8:rsos.201345, 2 2021.
- J. Chung, C. Gulcehre, K. Cho, and Y. Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv:1412.3555, 12 2014.
- A. Dubey, F. Radenovic, and D. Mahajan. Scalable interpretability via polynomials. arXiv:2205.14108, 5 2022.
- J.L. Elman. Finding structure in time. Cognitive science, 14:179–211, 1990.
- Efficient learning interpretable shapelets for accurate time series classification. In 2018 IEEE 34th International Conference on Data Engineering (ICDE), pages 497–508. IEEE, 4 2018.
- Forecasting crude oil price using kalman filter based on the reconstruction of modes of decomposition ensemble model. IEEE Access, 7:149908–149925, 2019.
- An interpretable ICU mortality prediction model based on logistic regression and recurrent neural networks with LSTM units. AMIA … Annual Symposium Proceedings. AMIA Symposium, 2018:460–469, 2018.
- X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Yee Whye Teh and Mike Titterington, editors, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pages 249–256. PMLR, 5 2010.
- N. Gui, D. Ge, and Z. Hu. AFS: An attention-based mechanism for supervised feature selection. Proceedings of the AAAI Conference on Artificial Intelligence, 33:3705–3713, 7 2019.
- T. Hastie and R. Tibshirani. Generalized additive models. Statistical Science, 1, 8 1986.
- S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural computation, 9:1735–1780, 1997.
- Deep learning with long short-term memory for time series prediction. IEEE Communications Magazine, 57:114–119, 2019.
- Motor fault detection and feature extraction using RNN-based variational autoencoder. IEEE Access, 7, 2019.
- L. Li, K. Jamieson, A. Rostamizadeh, E. Gonina, M. Hardt, B. Recht, and A. Talwalkar. A system for massively parallel hyperparameter tuning. arXiv:1810.05934, 10 2018.
- B. Lim, S.O. Arik, N. Loeff, and T. Pfister. Temporal fusion transformers for interpretable multi-horizon time series forecasting. International Journal of Forecasting, 37:1748–1764, 10 2021.
- L. Liu, H. Jiang, P. He, W. Chen, X. Liu, J. Gao, and J. Han. On the variance of the adaptive learning rate and beyond. arXiv:1908.03265v4, 8 2019.
- I. Loshchilov and F. Hutter. Sgdr: Stochastic gradient descent with warm restarts. arXiv:1608.03983, 8 2016.
- Y. Lou, R. Caruana, J. Gehrke, and G. Hooker. Accurate intelligible models with pairwise interactions. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 623–631. ACM, 8 2013.
- R.P. Paiva and A. Dourado. Interpretability and learning in neuro-fuzzy systems. Fuzzy Sets and Systems, 147:17–38, 10 2004.
- Interpretable multivariate time series forecasting with temporal attention convolutional neural networks. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pages 1687–1694. IEEE, 12 2020.
- F. Radenovic, A. Dubey, and D. Mahajan. Neural basis models for interpretability. arXiv:2205.14120, 5 2022.
- M.T. Ribeiro, S. Singh, and C. Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144. Association for Computing Machinery, 2016.
- O. Roesler. EEG Eye State, 2013.
- Explainable artificial intelligence (XAI) on time series data: A survey. arXiv:2104.00950, 4 2021.
- H. Sak, A. Senior, and F. Beaufays. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), 2014.
- M. Schuster and K.K. Paliwal. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45:2673–2681, 1997.
- P. Senin and S. Malinchik. SAX-VSM: Interpretable time series classification using SAX and vector space model. In 2013 IEEE 13th International Conference on Data Mining, pages 1175–1180. IEEE, 12 2013.
- Deep learning-based stock price prediction using LSTM and bi-directional LSTM model. In 2nd Novel Intelligent and Leading Emerging Sciences Conference (NILES), pages 87–92, 2020.
- M. Tsang, H. Liu, S. Purushotham, P. Murali, and Y. Liu. Neural interaction transparency (NIT): Disentangling learned interactions for improved interpretability. Advances in Neural Information Processing Systems, 31, 2018.
- S. Varela-Santos and P. Melin. A new modular neural network approach with fuzzy response integration for lung disease classification based on multiple objective feature optimization in chest x-ray images. Expert Systems with Applications, 168:114361, 4 2021.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. arXiv:1706.03762, 6 2017.
- Z. Wang, W. Yan, and T. Oates. Time series classification from scratch with deep neural networks: A strong baseline. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 1578–1585. IEEE, 5 2017.
- L. Ye and E. Keogh. Time series shapelets. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’09, page 947. ACM Press, 2009.
- A temporal fusion transformer for short-term freeway traffic speed multistep prediction. Neurocomputing, 500:329–340, 8 2022.
- G.P. Zhang. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50:159–175, 1 2003.
- Cautionary tales on air-quality improvement in Beijing. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 473:20170457, 9 2017.
- Interpretable temporal attention network for COVID-19 forecasting. Applied Soft Computing, 120:108691, 5 2022.