
TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning (2505.23719v1)

Published 29 May 2025 in cs.LG

Abstract: In-context learning, the ability of LLMs to perform tasks using only examples provided in the prompt, has recently been adapted for time series forecasting. This paradigm enables zero-shot prediction, where past values serve as context for forecasting future values, making powerful forecasting tools accessible to non-experts and increasing the performance when training data are scarce. Most existing zero-shot forecasting approaches rely on transformer architectures, which, despite their success in language, often fall short of expectations in time series forecasting, where recurrent models like LSTMs frequently have the edge. Conversely, while LSTMs are well-suited for time series modeling due to their state-tracking capabilities, they lack strong in-context learning abilities. We introduce TiRex that closes this gap by leveraging xLSTM, an enhanced LSTM with competitive in-context learning skills. Unlike transformers, state-space models, or parallelizable RNNs such as RWKV, TiRex retains state-tracking, a critical property for long-horizon forecasting. To further facilitate its state-tracking ability, we propose a training-time masking strategy called CPM. TiRex sets a new state of the art in zero-shot time series forecasting on the HuggingFace benchmarks GiftEval and Chronos-ZS, outperforming significantly larger models including TabPFN-TS (Prior Labs), Chronos Bolt (Amazon), TimesFM (Google), and Moirai (Salesforce) across both short- and long-term forecasts.

Summary

  • The paper introduces TiRex, a novel zero-shot forecasting model that leverages xLSTM blocks for effective state-tracking over long horizons.
  • It employs a unique Contiguous Patch Masking strategy to mitigate error propagation in multi-step predictions.
  • TiRex demonstrates superior performance on benchmarks like GiftEval, offering robust predictions across diverse real-world scenarios.

TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning

This essay discusses the implementation and practical application of TiRex, a novel time series forecasting model designed for zero-shot prediction using enhanced in-context learning capabilities.

Introduction to TiRex

TiRex is a forecasting model that leverages the xLSTM architecture to perform zero-shot predictions on time series data. Unlike transformers, which dominate language processing but often fall short in time series forecasting, TiRex combines the state-tracking strengths of LSTMs with strong in-context learning capabilities, aiming to bridge this gap with robust state-tracking suited to long-horizon forecasts.

Key Components of TiRex

  • xLSTM Architecture: TiRex uses xLSTM blocks designed to maintain the state-tracking capabilities critical for time-dependent data, allowing it to outperform larger transformer-based models without sacrificing efficiency (a simplified sketch follows Figure 1's caption below).
  • Contiguous Patch Masking (CPM): A training-time strategy developed to enhance multi-step forecasting by managing error propagation over long horizons, improving the model's uncertainty estimates.
  • Data Augmentation: Novel augmentation techniques tailored to time series data, crucial for enhancing the model's robustness and generalization ability.

    Figure 1: Architecture overview of TiRex, showing its stacked xLSTM blocks with input and output residual layers that preserve state-tracking.
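
To make the architecture concrete, below is a minimal PyTorch sketch of a patch-based forecaster built from residual recurrent blocks. This is an illustration under stated assumptions, not the released implementation: nn.LSTM stands in for the paper's xLSTM (sLSTM/mLSTM) cells, and all hyperparameters (patch length, width, depth, quantile count) are placeholders.

```python
import torch
import torch.nn as nn

class ResidualRecurrentBlock(nn.Module):
    """Pre-norm recurrent block with residual connections.

    NOTE: nn.LSTM is a stand-in for the paper's xLSTM cells,
    which are not reproduced here.
    """
    def __init__(self, d_model: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.rnn = nn.LSTM(d_model, d_model, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(self.norm1(x))   # recurrent pass carries a running state
        x = x + h                        # residual around the recurrent cell
        return x + self.ff(self.norm2(x))


class PatchForecaster(nn.Module):
    """Patch-based quantile forecaster in the spirit of TiRex.

    Hyperparameters are illustrative placeholders, not the paper's values.
    """
    def __init__(self, patch_len: int = 32, d_model: int = 256,
                 n_blocks: int = 4, n_quantiles: int = 9):
        super().__init__()
        self.patch_len = patch_len
        self.n_quantiles = n_quantiles
        self.embed = nn.Linear(patch_len, d_model)   # patch embedding
        self.blocks = nn.ModuleList(
            ResidualRecurrentBlock(d_model) for _ in range(n_blocks)
        )
        self.head = nn.Linear(d_model, patch_len * n_quantiles)

    def forward(self, series: torch.Tensor) -> torch.Tensor:
        # series: (batch, context_len), context_len divisible by patch_len
        b, t = series.shape
        x = self.embed(series.view(b, t // self.patch_len, self.patch_len))
        for block in self.blocks:
            x = block(x)
        out = self.head(x[:, -1])        # forecast the next patch from the last state
        return out.view(b, self.patch_len, self.n_quantiles)


# Example: 8 series of length 256 -> quantile forecasts for the next 32 steps.
preds = PatchForecaster()(torch.randn(8, 256))   # shape (8, 32, 9)
```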

Implementation Details

Implementing TiRex requires understanding both its architectural design and its training strategy. The xLSTM blocks form the backbone, providing efficient state-tracking, while CPM lets the model handle long sequences through controlled masking, keeping predictions coherent over many steps; a sketch of such a masking routine follows.
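
The paper's exact CPM procedure is not reproduced here; the following is a plausible sketch of contiguous patch masking, in which whole runs of input patches are hidden so the model must carry its state across multi-step gaps. The span length and start probability are illustrative assumptions.

```python
import torch

def contiguous_patch_mask(batch: int, n_patches: int, max_span: int = 4,
                          p_start: float = 0.3) -> torch.Tensor:
    """Sample a boolean mask of shape (batch, n_patches); True = patch hidden.

    Contiguous runs of patches are masked (rather than isolated ones) so the
    model must roll its state forward across multi-step gaps.
    """
    mask = torch.zeros(batch, n_patches, dtype=torch.bool)
    for b in range(batch):
        i = 0
        while i < n_patches:
            if torch.rand(()) < p_start:           # start a masked run here
                span = int(torch.randint(1, max_span + 1, ()))
                mask[b, i:i + span] = True
                i += span
            else:
                i += 1
    return mask

# During training, masked patch embeddings would be replaced by a learned
# [mask] token and the loss computed on the hidden patches.
mask = contiguous_patch_mask(batch=4, n_patches=16)
```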

Training and Evaluation

TiRex is trained on a diverse corpus of synthetic and real data from various domains, with augmentations increasing data variability. Training uses the AdamW optimizer together with a learning-rate schedule, which keeps optimization stable across the large, heterogeneous corpus; a minimal loop is sketched below.
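
A minimal training-loop sketch tying these pieces together, assuming the PatchForecaster sketch above. The learning rate, weight decay, step count, schedule, and toy augmentations are illustrative placeholders, and random tensors stand in for real sampled windows.

```python
import torch
import torch.nn.functional as F

model = PatchForecaster()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=10_000)

def augment(batch: torch.Tensor) -> torch.Tensor:
    """Toy time-series augmentations: random amplitude scaling plus jitter."""
    scale = torch.empty(batch.size(0), 1).uniform_(0.5, 2.0)
    return batch * scale + 0.01 * torch.randn_like(batch)

for step in range(10_000):
    context = torch.randn(64, 256)    # stand-in for sampled context windows
    target = torch.randn(64, 32)      # stand-in for the next ground-truth patch
    pred = model(augment(context))    # (batch, patch_len, n_quantiles)
    # MSE on the median quantile for brevity; the paper optimizes quantile
    # losses (see the pinball-loss sketch under Performance Analysis).
    loss = F.mse_loss(pred[..., pred.size(-1) // 2], target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()
```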

Performance Analysis

TiRex has demonstrated state-of-the-art performance on benchmarks like GiftEval and Chronos-ZS. Its zero-shot capabilities are especially notable on datasets with minimal overlap between pre-training and benchmark data.

Figure 2: Results on the GiftEval-ZS benchmark, illustrating TiRex's superior aggregated scores relative to other zero-shot models.
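
Benchmarks of this kind typically score probabilistic forecasts with quantile-based metrics (e.g., CRPS approximated by averaged quantile losses). Below is a minimal pinball-loss sketch; the nine quantile levels are chosen here purely for illustration.

```python
import torch

QUANTILES = torch.tensor([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

def quantile_loss(pred: torch.Tensor, target: torch.Tensor,
                  q: torch.Tensor = QUANTILES) -> torch.Tensor:
    """Mean pinball (quantile) loss.

    pred:   (batch, horizon, n_quantiles) predicted quantiles
    target: (batch, horizon) realized values
    """
    err = target.unsqueeze(-1) - pred              # broadcast over quantiles
    return torch.maximum(q * err, (q - 1.0) * err).mean()

# Example: score a probabilistic forecast against ground truth.
score = quantile_loss(torch.randn(8, 32, 9), torch.randn(8, 32))
```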

Practical Considerations

Scalability and Efficiency

TiRex's architecture is optimized for scalability, ensuring efficient memory and computational use without compromising prediction accuracy. Its flexibility allows it to be deployed in various environments, from local systems to cloud-based architectures.

Application Domains

TiRex is applicable across multiple domains, including energy, healthcare, and retail, where accurate long-term predictions are essential. Its ability to provide coherent uncertainty estimates makes it valuable for decision-making processes that rely on forecasting under uncertainty.

Trade-offs and Limitations

While TiRex excels in generalization and zero-shot learning, its finite pre-training corpus may limit performance on highly domain-specific tasks. Future work could expand the diversity of training data and integrate multivariate series.

Conclusion

TiRex represents a significant advance in zero-shot time series forecasting by addressing the deficiencies of transformer models in this domain. By leveraging the xLSTM architecture and innovative training strategies such as CPM, TiRex sets a new performance standard across various evaluation benchmarks.

Future iterations of TiRex could see enhancements in handling multivariate time series and potentially adopting more sophisticated augmentation strategies to further boost its generalization and adaptability across even broader application areas.
