UniTS: A Unified Multi-Task Time Series Model (2403.00131v3)
Abstract: Although pre-trained transformers and reprogrammed text-based LLMs have shown strong performance on time series tasks, the best-performing architectures vary widely across tasks, with most models narrowly focused on specific areas, such as time series forecasting. Unifying predictive and generative time series tasks within a single model remains challenging. We introduce UniTS, a unified multi-task time series model that uses task tokenization to integrate predictive and generative tasks into a single framework. UniTS employs a modified transformer block to capture universal time series representations, enabling transfer from a heterogeneous, multi-domain pre-training dataset (characterized by diverse dynamic patterns, sampling rates, and temporal scales) to a wide range of downstream datasets with varied task specifications and data domains. Tested on 38 datasets spanning human activity sensors, healthcare, engineering, and finance, UniTS achieves superior performance compared to 12 forecasting models, 20 classification models, 18 anomaly detection models, and 16 imputation models, including adapted text-based LLMs. UniTS also demonstrates strong few-shot and prompting capabilities when applied to new domains and tasks. In single-task settings, UniTS outperforms competitive task-specialized time series models. Code and datasets are available at https://github.com/mims-harvard/UniTS.
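To make the task-tokenization idea concrete, below is a minimal PyTorch-style sketch of how a single shared backbone can serve both predictive and generative tasks by appending learnable task tokens to the patch-embedded series and reading predictions from the task-token positions. All names (`UnifiedTSModel`, `cls_token`, `gen_token`, the heads) and the plain `nn.TransformerEncoder` backbone are illustrative assumptions for this sketch, not the authors' implementation in the linked repository.

```python
# Illustrative sketch only (not the UniTS code): one shared transformer consumes
# patch embeddings plus learnable task tokens; classification reads the CLS-token
# position, forecasting reads the generative-token positions.
import torch
import torch.nn as nn

class UnifiedTSModel(nn.Module):
    def __init__(self, patch_len=16, d_model=128, n_layers=3, n_heads=8,
                 n_classes=5, horizon_patches=4):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(patch_len, d_model)  # patch -> token embedding
        # Learnable task tokens (assumed names): one CLS token for predictive
        # tasks, several generative tokens standing in for the forecast horizon.
        self.cls_token = nn.Parameter(torch.randn(1, 1, d_model))
        self.gen_token = nn.Parameter(torch.randn(1, horizon_patches, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)  # shared across tasks
        self.cls_head = nn.Linear(d_model, n_classes)   # predictive readout
        self.gen_head = nn.Linear(d_model, patch_len)   # generative readout

    def forward(self, x, task):
        # x: (batch, length) univariate series, split into non-overlapping patches
        b, _ = x.shape
        patches = x.unfold(1, self.patch_len, self.patch_len)  # (b, n_patches, patch_len)
        tokens = self.embed(patches)                           # (b, n_patches, d_model)
        if task == "classify":
            seq = torch.cat([self.cls_token.expand(b, -1, -1), tokens], dim=1)
            h = self.backbone(seq)
            return self.cls_head(h[:, 0])                      # logits from CLS position
        if task == "forecast":
            seq = torch.cat([tokens, self.gen_token.expand(b, -1, -1)], dim=1)
            h = self.backbone(seq)
            future = self.gen_head(h[:, -self.gen_token.shape[1]:])  # (b, horizon, patch_len)
            return future.reshape(b, -1)                       # flattened forecast
        raise ValueError(f"unknown task: {task}")

model = UnifiedTSModel()
x = torch.randn(2, 96)                    # two series of length 96
logits = model(x, task="classify")        # (2, n_classes)
forecast = model(x, task="forecast")      # (2, horizon_patches * patch_len)
```

In this sketch the same backbone weights are reused for every task; only the task tokens and lightweight readout heads differ, which is the property that allows one model to cover forecasting, classification, imputation, and anomaly detection.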