
Low-Rank Robust Subspace Tensor Clustering for Metro Passenger Flow Modeling (2404.04403v1)

Published 5 Apr 2024 in stat.ME and cs.AI

Abstract: Tensor clustering has become an important topic, particularly in spatio-temporal modeling, because it can cluster both spatial modes (e.g., stations or road segments) and temporal modes (e.g., time of day or day of the week). Our motivating example is subway passenger flow modeling, where similarities between stations are common. The challenges lie in the inherent high dimensionality of tensors and the potential presence of anomalies. The three tasks involved, i.e., dimension reduction, clustering, and anomaly decomposition, are interrelated, so treating them separately yields suboptimal performance. In this work, we therefore design a tensor-based subspace clustering and anomaly decomposition technique that performs outlier-robust dimension reduction and clustering simultaneously for high-dimensional tensors. To achieve this, a novel low-rank robust subspace clustering decomposition model is proposed by combining Tucker decomposition, sparse anomaly decomposition, and subspace clustering. An effective algorithm based on Block Coordinate Descent is proposed to update the parameters. Experiments in a simulation study demonstrate the effectiveness of the proposed framework, with a gain of +25% in clustering accuracy over benchmark methods in a hard case. Ablation studies also analyze the interrelations among the three tasks and validate the interrelation assumption. Moreover, a case study on station clustering with real passenger flow data is conducted and yields valuable insights.
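
To make the decomposition idea in the abstract more concrete, the following is a minimal sketch, not the authors' model or algorithm. It covers only two of the three components: a low-rank Tucker part and a sparse anomaly part, updated in alternating blocks. The subspace clustering term is omitted. The function names, the use of a truncated HOSVD as an approximate low-rank update, the soft-thresholding step, and the parameters `ranks`, `lam`, and `n_iter` are all assumptions made for illustration.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_dot(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    out = np.tensordot(M, T, axes=(1, mode))  # contracted axis lands in front
    return np.moveaxis(out, 0, mode)

def hosvd(T, ranks):
    """Truncated higher-order SVD: one factor matrix per mode plus a core."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = mode_dot(core, U.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Rebuild the full tensor from a Tucker core and factor matrices."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_dot(T, U, mode)
    return T

def soft_threshold(R, lam):
    """Elementwise shrinkage, the proximal operator of lam * L1 norm."""
    return np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)

def robust_tucker(X, ranks, lam, n_iter=20):
    """Alternate between a low-rank fit L and a sparse anomaly part S.

    Hypothetical simplification: the low-rank block uses a truncated HOSVD
    (an approximation, not an exact minimizer), and no clustering term is
    included.
    """
    X = np.asarray(X, dtype=float)
    S = np.zeros(X.shape)
    for _ in range(n_iter):
        core, factors = hosvd(X - S, ranks)   # update the low-rank block
        L = reconstruct(core, factors)
        S = soft_threshold(X - L, lam)        # update the sparse block
    return L, S, factors

# Synthetic example: a rank-(1, 1, 1) "station x day x time" tensor
# with one injected anomaly.
a, b, c = np.linspace(1, 2, 8), np.linspace(1, 2, 7), np.linspace(1, 2, 6)
X = np.einsum("i,j,k->ijk", a, b, c)
X[0, 0, 0] += 20.0
L, S, factors = robust_tucker(X, ranks=(1, 1, 1), lam=2.0)
print("largest anomaly entry at", np.unravel_index(np.argmax(np.abs(S)), S.shape))
```

In a setting like the one the abstract describes, the station-mode factor `factors[0]` would be the natural input to a clustering step. The paper estimates clustering jointly with the other two parts rather than afterwards, which this sketch does not attempt.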
