
Sketch and shift: a robust decoder for compressive clustering (2312.09940v2)

Published 15 Dec 2023 in cs.LG and stat.ML

Abstract: Compressive learning is an emerging approach to drastically reduce the memory footprint of large-scale learning by first summarizing a large dataset into a low-dimensional sketch vector, and then decoding from this sketch the latent information needed for learning. In light of recent progress on information preservation guarantees for sketches based on random features, a major objective is to design easy-to-tune algorithms (called decoders) to robustly and efficiently extract this information. To address the underlying non-convex optimization problems, various heuristics have been proposed. In the case of compressive clustering, the standard heuristic is CL-OMPR, a variant of sliding Frank-Wolfe. Yet CL-OMPR is hard to tune, and its robustness has not been carefully examined. In this work, we undertake a careful examination of CL-OMPR to circumvent its limitations. In particular, we show how this algorithm can fail to recover the clusters even in advantageous scenarios. To gain insight, we show how the deficiencies of this algorithm can be attributed to optimization difficulties related to the structure of a correlation function appearing at core steps of the algorithm. To address these limitations, we propose an alternative decoder offering substantial improvements over CL-OMPR. Its design is notably inspired by the mean shift algorithm, a classic approach for detecting the local maxima of kernel density estimators. The proposed algorithm can extract clustering information from a sketch of the MNIST dataset that is 10 times smaller than was previously possible.

