Locality Regularized Reconstruction: Structured Sparsity and Delaunay Triangulations (2405.00837v1)

Published 1 May 2024 in cs.LG, eess.SP, math.OC, and stat.ML

Abstract: Linear representation learning is widely studied due to its conceptual simplicity and empirical utility in tasks such as compression, classification, and feature extraction. Given a set of points $[\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n] = \mathbf{X} \in \mathbb{R}^{d \times n}$ and a vector $\mathbf{y} \in \mathbb{R}^d$, the goal is to find coefficients $\mathbf{w} \in \mathbb{R}^n$ so that $\mathbf{X} \mathbf{w} \approx \mathbf{y}$, subject to some desired structure on $\mathbf{w}$. In this work we seek $\mathbf{w}$ that forms a local reconstruction of $\mathbf{y}$ by solving a regularized least squares regression problem. We obtain local solutions through a locality function, used as a regularization term, that promotes the use of columns of $\mathbf{X}$ that are close to $\mathbf{y}$. We prove that, for all levels of regularization and under the mild condition that the columns of $\mathbf{X}$ have a unique Delaunay triangulation, the number of non-zero entries of the optimal coefficients is upper bounded by $d+1$, thereby providing local sparse solutions when $d \ll n$. Under the same condition we also show that for any $\mathbf{y}$ contained in the convex hull of $\mathbf{X}$ there exists a regime of the regularization parameter in which the optimal coefficients are supported on the vertices of the Delaunay simplex containing $\mathbf{y}$. This interprets the sparsity as structure obtained implicitly from the Delaunay triangulation of $\mathbf{X}$. We demonstrate that our locality regularized problem can be solved in comparable time to other methods that identify the containing Delaunay simplex.
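To make the optimization concrete, below is a minimal sketch of one natural instantiation of a locality regularized reconstruction, assuming the simplex-constrained form used in related work on local dictionaries: minimize $\|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2 + \lambda \sum_j w_j \|\mathbf{x}_j - \mathbf{y}\|_2^2$ over nonnegative $\mathbf{w}$ summing to one. The objective, the solver (projected gradient with the standard sort-based simplex projection), and all names and parameter values here are illustrative assumptions, not necessarily the paper's exact formulation or method.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {w : w >= 0, sum(w) = 1}, via the standard sort-based algorithm."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def locality_regularized_fit(X, y, lam=1.0, n_iter=5000):
    """Assumed objective (an illustration, not the paper's verified form):
        min_w ||X w - y||^2 + lam * sum_j w_j * ||x_j - y||^2
        s.t.  w >= 0, sum(w) = 1,
    solved by projected gradient descent."""
    d, n = X.shape
    dist2 = np.sum((X - y[:, None]) ** 2, axis=0)   # locality penalties ||x_j - y||^2
    w = np.full(n, 1.0 / n)                         # start at the simplex center
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)  # 1/L for the smooth quadratic term
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ w - y) + lam * dist2
        w = project_simplex(w - step * grad)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 2, 50
    X = rng.standard_normal((d, n))
    # Pick y inside the convex hull of the columns of X.
    alpha = rng.dirichlet(np.ones(n))
    y = X @ alpha
    w = locality_regularized_fit(X, y, lam=0.1)
    support = np.flatnonzero(w > 1e-6)
    # The paper's theory predicts at most d + 1 = 3 non-zero coefficients,
    # supported on the vertices of the Delaunay simplex containing y.
    print("support size:", support.size,
          "reconstruction error:", np.linalg.norm(X @ w - y))
```

Under the paper's uniqueness condition, reading off the support of $\mathbf{w}$ for a suitable $\lambda$ would identify the vertices of the Delaunay simplex containing $\mathbf{y}$; the sketch above only approximates this with a fixed-iteration first-order solver.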
