TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis
Abstract: This work studies the problem of time series analysis with generalist (or foundation) models, i.e., models trained across many data domains. Drawing inspiration from the widespread success of LLMs, we consider the simple strategy of discretely tokenizing time series data drawn from a myriad of datasets via self-supervision, then using the fixed tokenization to solve a variety of tasks across many data domains. Canonically, time series models are either trained on a single dataset or built in a task-specific manner (e.g., a forecasting-only model), and many use patches of continuous time values as model inputs. Performant generalist models that operate on discrete representations and are evaluated across many tasks are therefore of value. Our method, TOkenized Time Series EMbeddings (TOTEM), produces such generalist time series models with minimal or no fine-tuning while exhibiting strong zero-shot performance. We evaluate TOTEM extensively over nearly 500 experiments on three commonly studied time series tasks with real-world data: imputation (17 baselines, 12 datasets), anomaly detection (19 baselines, 25 datasets), and forecasting (14 baselines, 12 datasets). We conclude that TOTEM matches or outperforms existing state-of-the-art models in both the canonical specialist setting (i.e., training one model on one domain) and the generalist setting (i.e., training a single model on many domains), which demonstrates the efficacy of tokenization for general time series analysis. The open-source implementation is available here: https://github.com/SaberaTalukder/TOTEM; a video summary is available here: https://www.youtube.com/watch?v=OqrCpdb6MJk.
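The abstract does not spell out the tokenizer itself; it only says time series are "discretely tokenized" via self-supervision using a fixed (frozen) tokenization. As an illustration only, the discrete-lookup step of such a tokenizer can be sketched as nearest-codeword quantization over a learned codebook (in the style of VQ-VAE). The function name `tokenize`, the patch length, and the toy codebook below are all assumptions for the sketch, not details taken from the paper:

```python
import numpy as np

def tokenize(series: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each fixed-length patch of a 1-D series to the index of its
    nearest codeword (squared L2 distance) -- a VQ-style discretization.

    series:   1-D array whose length is a multiple of the patch length
    codebook: (K, patch_len) array of learned codewords
    Returns one integer token id per patch.
    """
    patch_len = codebook.shape[1]
    patches = series.reshape(-1, patch_len)            # (num_patches, patch_len)
    # Squared distance from every patch to every codeword.
    dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)                        # nearest codeword per patch

# Toy usage: a 2-codeword codebook and patches of length 4.
codebook = np.array([[0.0, 0.0, 0.0, 0.0],
                     [1.0, 1.0, 1.0, 1.0]])
series = np.array([0.1, -0.1, 0.0, 0.2, 0.9, 1.1, 1.0, 0.8])
tokens = tokenize(series, codebook)  # -> array([0, 1])
```

Once frozen, such a codebook yields a shared discrete vocabulary across datasets, which is what lets a single downstream model be trained on many domains at once.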