Warpformer: A Multi-scale Modeling Approach for Irregular Clinical Time Series (2306.09368v1)
Abstract: Irregularly sampled multivariate time series are ubiquitous in various fields, particularly in healthcare, and exhibit two key characteristics: intra-series irregularity and inter-series discrepancy. Intra-series irregularity refers to the fact that time-series signals are often recorded at irregular intervals, while inter-series discrepancy refers to the significant variability in sampling rates among diverse series. However, recent advances in modeling irregular time series have primarily focused on addressing intra-series irregularity, overlooking the issue of inter-series discrepancy. To bridge this gap, we present Warpformer, a novel approach that fully accounts for both characteristics. In a nutshell, Warpformer incorporates several key designs: an input representation that explicitly characterizes both intra-series irregularity and inter-series discrepancy, a warping module that adaptively unifies irregular time series at a given scale, and a customized attention module for representation learning. Additionally, we stack multiple warping and attention modules to learn at different scales, producing multi-scale representations that balance coarse-grained and fine-grained signals for downstream tasks. We conduct extensive experiments on widely used datasets and a new large-scale benchmark built from clinical databases. The results demonstrate the superiority of Warpformer over existing state-of-the-art approaches.
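The abstract describes a stacked warp-then-attend architecture. Below is a minimal PyTorch sketch of that stacking pattern, not the paper's implementation: `SoftWarp` is a hypothetical stand-in for the warping module (approximated here as a learned row-stochastic alignment onto `k_slots` unified time steps, since the module's internals are not given in the abstract), and the attention stage uses standard multi-head self-attention in place of the paper's customized variant. The scale schedule `(48, 16, 4)` and model width are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SoftWarp(nn.Module):
    """Hypothetical stand-in for Warpformer's warping module: maps a
    variable-length irregular sequence onto k unified time slots via a
    learned soft alignment (row-stochastic warping matrix)."""
    def __init__(self, d_model: int, k_slots: int):
        super().__init__()
        self.slot_queries = nn.Parameter(torch.randn(k_slots, d_model))
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, L, d_model) -> (batch, k_slots, d_model)
        scores = torch.einsum("kd,bld->bkl", self.slot_queries, x) * self.scale
        warp = scores.softmax(dim=-1)   # soft alignment over input steps
        return torch.einsum("bkl,bld->bkd", warp, x)

class WarpformerBlock(nn.Module):
    """One warping + attention stage operating at a single temporal scale."""
    def __init__(self, d_model: int, k_slots: int, n_heads: int = 4):
        super().__init__()
        self.warp = SoftWarp(d_model, k_slots)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.warp(x)           # unify the irregular series at this scale
        h, _ = self.attn(z, z, z)  # representation learning on unified slots
        return self.norm(z + h)

# Stack blocks at progressively coarser scales and concatenate pooled
# outputs, mirroring the multi-scale representation the abstract describes.
blocks = nn.ModuleList([WarpformerBlock(64, k) for k in (48, 16, 4)])
x = torch.randn(2, 100, 64)        # toy batch of embedded irregular series
feats = []
for blk in blocks:
    x = blk(x)
    feats.append(x.mean(dim=1))    # pool each scale's representation
multi_scale = torch.cat(feats, dim=-1)  # (2, 192): fine- to coarse-grained
```

The design intent the sketch captures: each stage first resolves irregularity by warping onto a fixed slot grid, so the subsequent attention operates on an aligned sequence; stacking with shrinking `k_slots` yields both fine-grained and coarse-grained views for the downstream task head.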