
FITS: Modeling Time Series with $10k$ Parameters

Published 6 Jul 2023 in cs.LG (arXiv:2307.03756v3)

Abstract: In this paper, we introduce FITS, a lightweight yet powerful model for time series analysis. Unlike existing models that directly process raw time-domain data, FITS operates on the principle that time series can be manipulated through interpolation in the complex frequency domain. By discarding high-frequency components with negligible impact on time series data, FITS achieves performance comparable to state-of-the-art models for time series forecasting and anomaly detection tasks, while having a remarkably compact size of only approximately $10k$ parameters. Such a lightweight model can be easily trained and deployed in edge devices, creating opportunities for various applications. The code is available in: \url{https://github.com/VEWOXIC/FITS}


Summary

  • The paper presents FITS, a model that performs interpolation in the complex frequency domain and uses only approximately 10k parameters while achieving strong forecasting accuracy.
  • It applies rFFT and irFFT with a low-pass filter, interpolating the retained frequency components to reconstruct and extend the time series efficiently.
  • Benchmark comparisons show that FITS matches or outperforms much larger models on datasets such as ETTh1 and Weather, making it well suited to edge deployment.

A Review of "FITS: Modeling Time Series with 10k Parameters"

The paper "FITS: Modeling Time Series with 10k Parameters" introduces an innovative approach to time series analysis via a model named FITS, which relies on complex frequency domain interpolation. This model is notable for its exceptionally modest parameter count, underlining its suitability for deployment on resource-constrained edge devices. The researchers propose a methodology that transcends traditional time-domain processing, leveraging interpolation within the complex frequency domain to achieve state-of-the-art performance in forecasting and anomaly detection tasks.

Overview of FITS

FITS operates by interpreting time series forecasting and reconstruction as tasks of frequency interpolation. This perspective facilitates the extrapolation of an input segment's frequency representation to generate extended time series outputs. Notably, this process relies on a complex-valued linear layer designed to learn amplitude scaling and phase shifts, ensuring the interpolation effectively captures essential frequency domain patterns.
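To make the interpolation step concrete, below is a minimal sketch (not the authors' released implementation) of a complex-valued linear layer expressed with real-valued weights. The names `ComplexLinear`, `in_bins`, and `out_bins` are illustrative; the point is that multiplying each frequency bin by a learned complex weight amounts exactly to amplitude scaling plus phase rotation.

```python
import torch
import torch.nn as nn


class ComplexLinear(nn.Module):
    """Linear map between complex frequency vectors, stored as real/imag parts.

    Multiplying a frequency bin by a complex weight (a + bj) scales its
    amplitude by |a + bj| and shifts its phase by angle(a + bj), which is
    the operation frequency interpolation requires.
    """

    def __init__(self, in_bins: int, out_bins: int):
        super().__init__()
        scale = 1.0 / in_bins
        self.w_real = nn.Parameter(scale * torch.randn(in_bins, out_bins))
        self.w_imag = nn.Parameter(scale * torch.randn(in_bins, out_bins))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: complex-valued tensor of shape (..., in_bins)
        xr, xi = x.real, x.imag
        # (xr + j*xi) @ (w_real + j*w_imag), expanded into real arithmetic
        out_r = xr @ self.w_real - xi @ self.w_imag
        out_i = xr @ self.w_imag + xi @ self.w_real
        return torch.complex(out_r, out_i)
```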

Although the model's learnable component operates in the frequency domain, FITS uses the real Fast Fourier Transform (rFFT) to map the input segment from the time domain into the frequency domain. A low-pass filter then retains the dominant low-frequency components and discards high-frequency components that contribute little beyond noise. After the complex-valued interpolation, the inverse rFFT (irFFT) converts the result back to the time domain, yielding an extended segment on which standard time-domain supervision (e.g., a forecasting loss over the look-back window plus the horizon) is applied. This combination of frequency-domain computation and time-domain supervision is what lets FITS remain compact while staying effective.
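Putting the pieces together, the following is a hedged end-to-end sketch of that pipeline in PyTorch, reusing the `ComplexLinear` layer from the previous snippet. The cutoff handling, zero-padding, and output rescaling are simplifications based on the paper's description, not a verbatim copy of the released code.

```python
import torch
import torch.nn as nn


class FITSSketch(nn.Module):
    """Sketch of the rFFT -> low-pass -> complex interpolation -> irFFT flow."""

    def __init__(self, in_len: int, out_len: int, cutoff: int):
        super().__init__()
        self.in_len, self.out_len, self.cutoff = in_len, out_len, cutoff
        # The number of kept bins grows in proportion to the length extension.
        self.out_bins = int(cutoff * out_len / in_len)
        self.interp = ComplexLinear(cutoff, self.out_bins)  # defined in the sketch above

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_len) real-valued series (assumed normalized per instance).
        spec = torch.fft.rfft(x, dim=-1)             # (batch, in_len // 2 + 1) complex bins
        spec = spec[..., : self.cutoff]              # low-pass filter: drop high frequencies
        spec = self.interp(spec)                     # complex frequency interpolation
        # Zero-pad to the bin count expected by an out_len-point irFFT.
        full = torch.zeros(*spec.shape[:-1], self.out_len // 2 + 1,
                           dtype=spec.dtype, device=spec.device)
        full[..., : self.out_bins] = spec
        y = torch.fft.irfft(full, n=self.out_len, dim=-1)
        # Rescale for the longer output; a simplification of the paper's
        # compensation for the length change between input and output.
        return y * (self.out_len / self.in_len)
```

In a forecasting setup, one would feed a normalized look-back window of length `in_len` and supervise `y` against the concatenated look-back-plus-horizon segment, matching the time-domain supervision described above.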

Quantitative Results and Comparative Analysis

FITS demonstrates exemplary performance on several benchmark datasets such as ETT (ETTh1, ETTh2, ETTm1, ETTm2), Weather, Electricity, Traffic, and short-term forecasting datasets like M4. The model consistently yields competitive results, often outperforming larger models like TimesNet and transformer-based architectures such as FEDformer and Informer. For instance, FITS achieves superior MSE results on datasets such as ETTh1 and Weather, with significant improvements over the runner-up models.

Crucially, FITS accomplishes this with substantially fewer parameters (ranging from 4.5K to 16K) than other models that use hundreds of thousands to millions of parameters. This characteristic underscores FITS as an efficient solution for time series analysis, especially in scenarios with limited computational resources.
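As a rough back-of-the-envelope check (the exact counts depend on the chosen cutoff frequency and horizon, so the numbers below are illustrative rather than the paper's reported figures), the parameter count of the sketch above is dominated by the complex interpolation matrix:

```python
def fits_param_count(in_len: int, out_len: int, cutoff: int) -> int:
    """Approximate parameter count of the frequency-interpolation layer.

    Each complex weight stores a real and an imaginary part, so the matrix
    contributes 2 * cutoff * out_bins real-valued parameters (bias omitted).
    """
    out_bins = int(cutoff * out_len / in_len)
    return 2 * cutoff * out_bins


# Illustrative example: a 50-bin cutoff with a 2x length extension gives
# 2 * 50 * 100 = 10,000 parameters, i.e. the ~10k scale in the title.
print(fits_param_count(in_len=360, out_len=720, cutoff=50))  # -> 10000
```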

Implications and Future Directions

The development of FITS has significant implications for the deployment of time series models in edge computing environments. Its low computational footprint makes it a suitable candidate for tasks like anomaly detection and forecasting in domains ranging from industrial IoT to financial analytics. By effectively minimizing computational and memory requirements through its frequency-domain approach, FITS sets a new standard for model efficacy in low-resource and real-time applications.

Looking forward, the insights provided by this research could pave the way for further exploration into complex-valued neural networks and their broader applicability in AI. Future research may focus on extending this methodology to larger-scale networks, such as complex-valued transformers, to understand the scalability of frequency domain-based approaches in modeling complex temporal dynamics.

In conclusion, the introduction of FITS marks a significant step forward in the field of time series analysis, demonstrating that state-of-the-art performance can be achieved with a fraction of the parameters traditionally required. This work not only highlights the efficacy of frequency domain interpolation but also calls for a reevaluation of prevailing methodologies in lightweight model design.
