Improving Position Encoding of Transformers for Multivariate Time Series Classification
The paper under discussion presents a novel approach to enhancing the performance of transformers applied to multivariate time series classification (MTSC). Transformers, which rose to prominence in natural language processing, struggle with time series because self-attention is permutation-invariant and carries no inherent notion of order. Position encoding mechanisms designed specifically for time series data remain underexplored, and the proposals in this paper aim to address that gap.
The authors introduce two new position encoding methods tailored for time series data: time Absolute Position Encoding (tAPE) and efficient Relative Position Encoding (eRPE). Both techniques are designed to strengthen the transformer's ability to capture the sequential relationships inherent in time series.
Key Contributions
- Time Absolute Position Encoding (tAPE):
- The tAPE method rescales the frequency terms of the sine and cosine functions used in sinusoidal position encoding to account for both the series length and the input embedding dimension.
- This adjustment preserves distance awareness and isotropy in the encoding space, both of which are crucial for representing positions in time series effectively; a minimal sketch of the idea appears after this list.
- Efficient Relative Position Encoding (eRPE):
- Unlike typical relative position encodings, which learn a full embedding vector per relative position and fold it into the attention computation, eRPE encodes each relative distance as a single learnable scalar.
- This design reduces memory overhead and computational complexity, which also helps mitigate overfitting on smaller datasets; see the attention sketch after this list.
- ConvTran Architecture:
- The paper proposes ConvTran, a novel architecture that combines these position encoding methods with convolutional layers to capture both local temporal patterns and long-range dependencies; a skeleton combining the pieces appears after this list.
- Extensive experiments on 32 benchmark datasets show that ConvTran consistently outperforms state-of-the-art MTSC models, with the largest accuracy gains on datasets that have ample training samples per class.
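To make the frequency adjustment concrete, here is a minimal PyTorch sketch of a tAPE-style encoding. It scales the standard sinusoidal frequencies by d_model / L, following the paper's description; the module name, interface, and dropout placement are illustrative choices, not the authors' reference implementation.

```python
import math

import torch
import torch.nn as nn


class TimeAbsolutePositionEncoding(nn.Module):
    """Sinusoidal encoding with frequencies rescaled by d_model / L (tAPE-style sketch)."""

    def __init__(self, d_model: int, max_len: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        position = torch.arange(max_len).float().unsqueeze(1)        # (L, 1)
        # Vanilla transformer frequencies: omega_i = 10000^(-2i / d_model)
        omega = torch.exp(torch.arange(0, d_model, 2).float()
                          * (-math.log(10000.0) / d_model))          # (d_model/2,)
        # tAPE adjustment: scale the frequencies by d_model / L so dot products
        # between encodings stay distance-aware and near-isotropic.
        omega = omega * d_model / max_len
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * omega)
        pe[:, 1::2] = torch.cos(position * omega)
        self.register_buffer("pe", pe.unsqueeze(0))                  # (1, L, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); add the encoding elementwise.
        return self.dropout(x + self.pe[:, : x.size(1)])
```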
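The scalar relative-position idea can be sketched in the same style. A learnable table of 2L − 1 scalars, one per possible relative distance, is indexed and added to the attention weights; adding the bias after the softmax, rather than to the pre-softmax logits, follows the paper's formulation as we read it, and the single-head layout and names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EfficientRelativeAttention(nn.Module):
    """Single-head self-attention with one learnable scalar bias per relative
    distance (eRPE-style sketch), added after the softmax."""

    def __init__(self, d_model: int, max_len: int):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # One scalar per relative distance in [-(L-1), L-1].
        self.rel_bias = nn.Parameter(torch.zeros(2 * max_len - 1))
        self.max_len = max_len

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        B, L, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = F.softmax(q @ k.transpose(-2, -1) / D**0.5, dim=-1)   # (B, L, L)

        # Look up the scalar bias for each (i, j) pair by relative distance i - j.
        idx = torch.arange(L, device=x.device)
        rel = idx.unsqueeze(1) - idx.unsqueeze(0) + self.max_len - 1  # (L, L) indices
        attn = attn + self.rel_bias[rel]                              # post-softmax scalar bias

        return self.out(attn @ v)
```

Because the bias is a single scalar per distance, the lookup adds only O(L) parameters, in contrast to the O(L · d) tables of vector-valued relative encodings.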
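Putting the pieces together, a hypothetical ConvTran-style skeleton could look like the following, reusing the two modules above: a convolutional embedding for local patterns, tAPE on the embeddings, and the scalar-bias attention for long-range dependencies, followed by global average pooling and a linear head. Kernel sizes, depth, and layer choices here are placeholders rather than the paper's configuration.

```python
class ConvTranSketch(nn.Module):
    """Hypothetical ConvTran-style classifier: conv embedding + tAPE + eRPE attention."""

    def __init__(self, in_channels: int, d_model: int, max_len: int, n_classes: int):
        super().__init__()
        # Convolutional embedding of the multivariate series (local temporal patterns).
        self.embed = nn.Sequential(
            nn.Conv1d(in_channels, d_model, kernel_size=8, padding="same"),
            nn.BatchNorm1d(d_model),
            nn.GELU(),
        )
        self.pos = TimeAbsolutePositionEncoding(d_model, max_len)
        self.attn = EfficientRelativeAttention(d_model, max_len)  # long-range dependencies
        self.norm = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, seq_len)
        z = self.embed(x).transpose(1, 2)     # (batch, seq_len, d_model)
        z = self.pos(z)
        z = self.norm(z + self.attn(z))       # residual attention block
        return self.head(z.mean(dim=1))       # global average pooling over time
```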
Experimental Validation
The paper validates the proposed methods extensively across a wide range of datasets from the UEA archive as well as larger datasets such as the Ford Challenge and Actitracker human activity recognition datasets. ConvTran's performance was particularly strong where abundant training data allowed the model to realize its full capacity.
The paper includes rigorous comparisons with existing deep learning models such as Fully Convolutional Networks, ResNet, InceptionTime, and other transformer-based models. ConvTran's superior ranking in these experiments underscores the effectiveness of the proposed position encoding methods.
Implications and Future Directions
The implications of this research are twofold. Practically, the ConvTran model provides a powerful tool for time series classification in various domains, offering improved accuracy and efficiency. Theoretically, the paper is among the first to systematically investigate position encoding for time series, paving the way for further exploration into advanced encoding techniques tailored to different types of sequential data.
Future work could involve extending these encoding strategies to other applications of transformers in time series, such as anomaly detection or forecasting. Investigating the proposed position encoding methods' adaptability and performance across different data distributions and scales could also yield further insights.
In summary, this paper makes a significant contribution to the field of time series analysis with transformers, providing robust encoding methods that address critical challenges in capturing sequential dependencies. The ConvTran architecture stands out as a leading model in MTSC, poised to influence future developments in time series-focused deep learning architectures.