TSLANet: Lightweight Adaptive Time Series Network
- TSLANet is a lightweight adaptive network that leverages convolutional and spectral processing blocks to analyze multivariate time series data efficiently.
- The Adaptive Spectral Block uses FFT-based transformation with learnable thresholds to denoise and extract both global and local features for robust feature representation.
- Self-supervised pretraining with masked autoencoding mitigates overfitting, while the convolutional design keeps TSLANet's computational cost well below that of Transformer models.
A Time Series Lightweight Adaptive Network (TSLANet) is a universal convolutional framework for multivariate time series analysis designed to efficiently capture both long- and short-range dependencies, while providing resilience to noise and achieving strong performance across classification, forecasting, and anomaly detection tasks. Directly addressing the inefficiencies and overfitting tendencies of Transformer-based models, TSLANet leverages spectral and convolutional processing blocks in tandem with self-supervised learning to create a scalable, robust, and lightweight alternative for time series representation learning (Eldele et al., 2024).
1. Architectural Overview
TSLANet processes an input multivariate time series in three main stages: patch embedding, a series of stacked TSLANet layers, and a task-specific linear head. Each TSLANet layer sequentially applies two principal modules—the Adaptive Spectral Block (ASB) and the Interactive Convolution Block (ICB)—forming the core of its processing pipeline:
- Patch Embedding: The input is segmented into patches, each embedded and summed with a learnable positional encoding.
- Layer Sequence: Each layer receives the output of its predecessor and applies the ASB followed by the ICB: $x^{(\ell)} = \mathrm{ICB}(\mathrm{ASB}(x^{(\ell-1)}))$.
- Head: After the final layer, a linear head produces class logits (classification), forecasts (regression), or reconstructions for anomaly scoring.
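As a minimal sketch of the first stage, the following NumPy snippet patchifies a series, embeds the patches, and adds positional encodings. The shapes, random initialization, and the use of non-overlapping patches here are illustrative simplifications, not the paper's exact configuration:

```python
import numpy as np

def patch_embed(x, patch_len, d_model, rng):
    """Split a series into patches, linearly embed, add positional encodings.

    x: (seq_len, n_channels). Illustrative sketch; shapes and init are assumptions.
    """
    seq_len, n_ch = x.shape
    num_patches = seq_len // patch_len
    # Flatten each patch across time and channels before embedding.
    patches = x[: num_patches * patch_len].reshape(num_patches, patch_len * n_ch)
    W = rng.standard_normal((patch_len * n_ch, d_model)) * 0.02   # embedding weights
    pos = rng.standard_normal((num_patches, d_model)) * 0.02      # learnable positions
    return patches @ W + pos

rng = np.random.default_rng(0)
x = rng.standard_normal((96, 3))                 # toy 3-channel series of length 96
tokens = patch_embed(x, patch_len=16, d_model=32, rng=rng)
print(tokens.shape)                              # (6, 32)
```

The resulting token sequence is what the stacked TSLANet layers consume.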
2. Adaptive Spectral Block (ASB)
The Adaptive Spectral Block constitutes the spectral processing unit of TSLANet, targeting both denoising and efficient feature extraction:
- Fourier Transform: Embedded patches are transformed to the frequency domain via FFT along the token dimension: $X_{\mathcal{F}} = \mathrm{FFT}(x)$.
- Adaptive Thresholding: Compute the power spectrum $P = |X_{\mathcal{F}}|^2$. Apply a binary mask $M = \mathbb{1}[P \ge \theta]$, where $\theta$ is a learnable (potentially channelwise) threshold optimized by backpropagation. Frequencies below this power are zeroed: $X_{\mathcal{F}}^{\text{masked}} = X_{\mathcal{F}} \odot M$.
- Global/Local Spectral Filtering: Two learnable filters operate concurrently: $\tilde{X}_{\mathcal{F}} = X_{\mathcal{F}} \odot W_g + X_{\mathcal{F}}^{\text{masked}} \odot W_l$, where $W_g$ and $W_l$ are learnable complex-valued filter weights. Their sum aggregates global periodic and local denoised patterns.
- Inverse FFT: Return to the time domain via IFFT, producing denoised and adaptively filtered representations: $x' = \mathrm{IFFT}(\tilde{X}_{\mathcal{F}})$.
The combination of learnable spectral masking and filtering distinguishes ASB, enhancing both global context capture and noise robustness.
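A toy forward pass of the ASB can be sketched with NumPy's real FFT. The threshold and the complex filter weights would be learned parameters in the real model; the shapes and values below are assumptions for illustration only:

```python
import numpy as np

def adaptive_spectral_block(tokens, theta, w_global, w_local):
    """Illustrative ASB forward pass; parameter names and shapes are assumptions."""
    X = np.fft.rfft(tokens, axis=0)               # to the frequency domain
    power = np.abs(X) ** 2                        # power spectrum
    X_masked = X * (power >= theta)               # zero frequencies below the threshold
    X_filt = X * w_global + X_masked * w_local    # global + local learnable filters
    return np.fft.irfft(X_filt, n=tokens.shape[0], axis=0)  # back to the time domain

rng = np.random.default_rng(1)
tokens = rng.standard_normal((6, 32))             # (num_patches, d_model)
n_freq = 6 // 2 + 1                               # rfft bin count
wg = rng.standard_normal((n_freq, 32)) + 1j * rng.standard_normal((n_freq, 32))
wl = rng.standard_normal((n_freq, 32)) + 1j * rng.standard_normal((n_freq, 32))
out = adaptive_spectral_block(tokens, theta=0.5, w_global=wg, w_local=wl)
print(out.shape)                                  # (6, 32), real-valued
```

Because the mask is applied in the frequency domain, denoising and filtering cost only one FFT/IFFT pair per block.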
3. Interactive Convolution Block (ICB)
Following ASB, the Interactive Convolution Block captures multi-scale temporal interactions via parallel convolutional pathways:
- Parallel Convolutions: Two 1D convolutions—Conv1 (with a smaller kernel) and Conv2 (with a larger kernel)—extract fine and coarse features, respectively. Outputs are modulated cross-scale: $A_1 = \phi(\mathrm{Conv}_1(x)) \odot \mathrm{Conv}_2(x)$ and $A_2 = \phi(\mathrm{Conv}_2(x)) \odot \mathrm{Conv}_1(x)$, where $\phi$ is a nonlinear activation (GELU).
- Aggregation and Output: The summed activations pass through a pointwise Conv3 to produce the block output: $x_{\text{out}} = \mathrm{Conv}_3(A_1 + A_2)$.
By promoting rich feature interactions across temporal scales, ICB supports robust pattern recognition in diverse time series contexts.
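The two-pathway modulation can be illustrated as follows. The depthwise convolution, kernel values, and tanh-approximated GELU are illustrative stand-ins for the actual learned layers, and the pointwise Conv3 is modeled as a dense $(d, d)$ projection:

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def conv1d_same(x, kernel):
    # depthwise 'same' convolution along the token axis; x: (T, d), kernel: (k,)
    return np.stack([np.convolve(x[:, c], kernel, mode="same")
                     for c in range(x.shape[1])], axis=1)

def interactive_conv_block(x, k_fine, k_coarse, w_point):
    a = conv1d_same(x, k_fine)            # fine-scale pathway (smaller kernel)
    b = conv1d_same(x, k_coarse)          # coarse-scale pathway (larger kernel)
    cross = gelu(a) * b + gelu(b) * a     # cross-scale modulation
    return cross @ w_point                # pointwise Conv3 as a (d, d) projection

rng = np.random.default_rng(2)
x = rng.standard_normal((6, 32))
out = interactive_conv_block(x,
                             rng.standard_normal(3),           # fine kernel
                             rng.standard_normal(5),           # coarse kernel
                             rng.standard_normal((32, 32)) * 0.1)
print(out.shape)                          # (6, 32)
```

Each pathway gates the other, so fine-scale activations decide how much coarse-scale signal passes through and vice versa.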
4. Self-Supervised Pretraining and Training Objectives
TSLANet employs dataset-specific self-supervised learning to enhance feature quality:
- Masked Autoencoding: Random subsets of patches are masked, and the network reconstructs their raw signals. The mean-squared reconstruction loss over the masked set $\mathcal{M}$, $\mathcal{L}_{\text{rec}} = \frac{1}{|\mathcal{M}|} \sum_{i \in \mathcal{M}} \lVert \hat{x}_i - x_i \rVert_2^2$, compels the model to attend to both global and local dependencies.
- Fine-tuning Losses: For classification, label-smoothed cross-entropy is used: $\mathcal{L}_{\text{cls}} = -\sum_{k=1}^{K} \tilde{y}_k \log p_k$, with smoothed targets $\tilde{y}_k = (1 - \varepsilon)\, y_k + \varepsilon / K$.
This dual-stage training harnesses unlabeled data and stabilizes learning, particularly beneficial for small-data regimes.
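Both objectives reduce to a few lines. The masking strategy, smoothing factor, and toy data below are illustrative, not the paper's exact hyperparameters:

```python
import numpy as np

def masked_mse(patches, recon, masked_idx):
    # reconstruction loss computed only over the masked patch positions
    diff = recon[masked_idx] - patches[masked_idx]
    return float(np.mean(diff ** 2))

def label_smoothed_ce(logits, label, eps=0.1):
    # cross-entropy against targets (1 - eps) * one_hot + eps / K
    p = np.exp(logits - logits.max())
    p /= p.sum()
    k = logits.size
    target = np.full(k, eps / k)
    target[label] += 1.0 - eps
    return float(-(target * np.log(p + 1e-12)).sum())

rng = np.random.default_rng(3)
patches = rng.standard_normal((6, 32))
recon = patches + 0.1 * rng.standard_normal((6, 32))   # imperfect reconstruction
idx = rng.choice(6, size=3, replace=False)             # randomly masked patches
rec_loss = masked_mse(patches, recon, idx)
cls_loss = label_smoothed_ce(np.array([2.0, 0.5, -1.0]), label=0)
print(rec_loss, cls_loss)                              # both small positive values
```

In the two-stage protocol, `masked_mse` drives pretraining and `label_smoothed_ce` (or MSE for regression tasks) drives fine-tuning.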
5. Empirical Performance and Robustness
Extensive benchmarking demonstrates TSLANet's effectiveness across canonical time series tasks:
| Task | Benchmark Datasets | Metric | TSLANet Performance | Notable Comparison |
|---|---|---|---|---|
| Classification | UCR, UEA, Biomedical, HAR | Accuracy (%) | UCR: 83.18; UEA: 72.73; Bio: 90.24; HAR: 97.46 | Outperforms ROCKET and TS-TCC; >2% above the next-best baseline |
| Forecasting | ECL, ETTh1/2, ETTm1/2, Exchange, Traffic, Weather | MSE, MAE | Second-lowest MSE on 7 of 8 tasks; 3% MSE reduction on ETT, 3.8% on Weather | Beats PatchTST on select tasks |
| Anomaly Detection | SMD, MSL, SMAP, SWaT, PSM | F1-score (%) | 87.54 (avg, best), +0.82% over GPT4TS | Highest F1, resilience to noise |
TSLANet maintains accuracy within 5% of clean performance under Gaussian noise perturbations, outperforming Transformer-based models in robustness. On small datasets, such as uWaveGestureLibraryAll, it retains over 90% accuracy with just 20% of the training data, where comparison models suffer significant degradation.
6. Complexity, Scalability, and Ablation
TSLANet achieves $O(N \log N)$ complexity in the spectral processing step (via FFT over $N$ tokens), compared to the $O(N^2)$ cost of Transformer self-attention. On UEA Heartbeat, it requires 93% fewer FLOPs and 84% fewer parameters than PatchTST, while achieving 77.56% accuracy (versus PatchTST’s 69.76%).
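A quick back-of-the-envelope comparison makes the gap concrete; constants are ignored, so this only illustrates asymptotic scaling, not measured FLOPs:

```python
import math

# Operation-count growth: FFT-style O(N log N) versus attention-style O(N^2).
for n in (128, 512, 2048):
    print(f"N={n}: N log2 N = {int(n * math.log2(n))}, N^2 = {n * n}")
```

At $N = 2048$ tokens the quadratic term is already nearly 200 times larger than the $N \log N$ term, and the ratio keeps widening with sequence length.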
Ablation studies confirm component necessity:
| Component Removed | FordA Accuracy (full model: 93.1%) | ETTh1 MSE (full model: 0.413) |
|---|---|---|
| ASB | 87.3% | 0.421 |
| ASB-Local | 92.7% | 0.417 |
| ICB | 91.3% | 0.419 |
| Pretraining | 92.5% | 0.415 |
The ASB’s adaptive denoising and global context capture are the most decisive contributors to accuracy and robustness.
7. Implementation Specifications
Key implementation parameters and training protocols:
- Optimizer: AdamW
- Classification: pretrain 50 epochs, fine-tune 100 epochs; learning rate and weight decay per the paper's settings.
- Forecasting/Anomaly: pretrain 10 epochs, fine-tune 20 epochs; learning rate and weight decay per the paper's settings.
- Patching: Overlapping patches, with stride set to half the patch size.
- Hardware: Model trained on NVIDIA RTX A6000.
- Code: Publicly released at https://github.com/emadeldeen24/TSLANet.
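The half-overlap patching rule above can be sketched directly; `overlapping_patches` is a hypothetical helper for illustration, not a function from the released code:

```python
import numpy as np

def overlapping_patches(x, patch_len):
    stride = patch_len // 2                    # stride set to half the patch size
    starts = range(0, x.shape[0] - patch_len + 1, stride)
    return np.stack([x[s:s + patch_len] for s in starts])

x = np.arange(64.0)                            # toy univariate series
p = overlapping_patches(x, patch_len=16)
print(p.shape)                                 # (7, 16): adjacent patches share 8 points
```

The half-patch overlap roughly doubles the token count relative to non-overlapping patching, giving the model smoother coverage of patch boundaries.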
TSLANet demonstrates a practical balance of accuracy, robustness, and efficiency by combining FFT-based adaptive spectral filtering, interactive convolutions, and masked autoencoder pretraining. This combination enables TSLANet to surpass state-of-the-art Transformer and MLP models across diverse time series tasks, validated through comprehensive empirical studies (Eldele et al., 2024).