A quasi-polynomial time algorithm for Multi-Dimensional Scaling via LP hierarchies (2311.17840v2)

Published 29 Nov 2023 in cs.DS, cs.LG, and stat.ML

Abstract: Multi-dimensional Scaling (MDS) is a family of methods for embedding an $n$-point metric into low-dimensional Euclidean space. We study the Kamada-Kawai formulation of MDS: given a set of non-negative dissimilarities $\{d_{i,j}\}_{i,j \in [n]}$ over $n$ points, the goal is to find an embedding $x_1,\dots,x_n \in \mathbb{R}^k$ that minimizes \[ \text{OPT} = \min_{x} \mathbb{E}_{i,j \in [n]} \left[ \left(1-\frac{\|x_i - x_j\|}{d_{i,j}}\right)^2 \right]. \] Kamada-Kawai provides a more relaxed measure of the quality of a low-dimensional metric embedding than the traditional bi-Lipschitz-ness measure studied in theoretical computer science; this is advantageous because, while strong hardness-of-approximation results are known for the latter, Kamada-Kawai admits nontrivial approximation algorithms. Despite its popularity, our theoretical understanding of MDS is limited. Recently, Demaine, Hesterberg, Koehler, Lynch, and Urschel (arXiv:2109.11505) gave the first approximation algorithm with provable guarantees for Kamada-Kawai in the constant-$k$ regime, with cost $\text{OPT} + \epsilon$ in $n^2 2^{\text{poly}(\Delta/\epsilon)}$ time, where $\Delta$ is the aspect ratio of the input. In this work, we give the first approximation algorithm for MDS with quasi-polynomial dependency on $\Delta$: we achieve a solution with cost $\tilde{O}(\log \Delta)\,\text{OPT}^{\Omega(1)}+\epsilon$ in time $n^{O(1)} 2^{\text{poly}(\log(\Delta)/\epsilon)}$. Our approach is based on a novel analysis of a conditioning-based rounding scheme for the Sherali-Adams LP Hierarchy. Crucially, our analysis exploits the geometry of low-dimensional Euclidean space, allowing us to avoid an exponential dependence on the aspect ratio. We believe our geometry-aware treatment of the Sherali-Adams Hierarchy is an important step towards developing general-purpose techniques for efficient metric optimization algorithms.
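To make the objective above concrete, the following minimal Python sketch evaluates the Kamada-Kawai stress $\mathbb{E}_{i,j}[(1-\|x_i-x_j\|/d_{i,j})^2]$ for a candidate embedding and runs a generic local optimizer (L-BFGS from a random start) as a naive baseline. This is only an illustration of the quantity being optimized, not the Sherali-Adams-based algorithm of the paper; the helper names (kk_stress, kk_embed) and the toy input are assumptions made for the example.

# Illustrative sketch: Kamada-Kawai stress and a plain local-search baseline.
# Not the paper's LP-hierarchy algorithm; kk_stress/kk_embed are made-up names.
import numpy as np
from scipy.optimize import minimize

def kk_stress(flat_x, d, k):
    """Average Kamada-Kawai stress of an embedding passed as a flat vector."""
    n = d.shape[0]
    x = flat_x.reshape(n, k)
    # Pairwise Euclidean distances of the current embedding.
    diff = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.triu_indices(n, k=1)          # distinct unordered pairs only
    ratios = dist[i, j] / d[i, j]
    return np.mean((1.0 - ratios) ** 2)

def kk_embed(d, k=2, seed=0):
    """Heuristic baseline: minimize the stress with L-BFGS from a random start."""
    n = d.shape[0]
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(n * k)
    res = minimize(kk_stress, x0, args=(d, k), method="L-BFGS-B")
    return res.x.reshape(n, k), res.fun

if __name__ == "__main__":
    # Toy input: dissimilarities of 4 points roughly on a unit square.
    s = np.sqrt(2.0)
    d = np.array([[0, 1, 1, s],
                  [1, 0, s, 1],
                  [1, s, 0, 1],
                  [s, 1, 1, 0]], dtype=float)
    x, cost = kk_embed(d, k=2)
    print(f"stress = {cost:.4f}")

Because the objective is non-convex, such a local method has no approximation guarantee; the point of the paper is precisely to obtain provable guarantees, with quasi-polynomial dependence on the aspect ratio, via Sherali-Adams relaxations and conditioning-based rounding.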

References (60)
  1. Minimum-distortion embedding. Foundations and Trends® in Machine Learning, 14(3):211–378, 2021.
  2. On the approximability of numerical taxonomy (fitting distances by tree metrics). SIAM J. Comput., 28(3):1073–1085, 1999. Announced at SODA 1996.
  3. Fitting tree metrics: Hierarchical clustering and phylogeny. SIAM J. Comput., 40(5):1275–1291, 2011. Announced at FOCS 2005.
  4. An analysis of the t-sne algorithm for data visualization. In Sébastien Bubeck, Vianney Perchet, and Philippe Rigollet, editors, Conference On Learning Theory, COLT 2018, Stockholm, Sweden, 6-9 July 2018, volume 75 of Proceedings of Machine Learning Research, pages 1455–1462. PMLR, 2018.
  5. Mihai Badoiu. Approximation algorithm for embedding metrics into a two-dimensional space. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, January 12-14, 2003, Baltimore, Maryland, USA, pages 434–443. ACM/SIAM, 2003.
  6. Yair Bartal. Probabilistic approximations of metric spaces and its algorithmic applications. In FOCS, pages 184–193, 1996.
  7. Low-distortion embeddings of general metrics into the line. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 225–233, 2005.
  8. Embedding ultrametrics into low-dimensional spaces. In Proceedings of the twenty-second annual symposium on Computational geometry, pages 187–196, 2006.
  9. Dimensionality reduction: theoretical perspective on practical measures. Advances in Neural Information Processing Systems, 32, 2019.
  10. Modern multidimensional scaling: Theory and applications. Springer Science & Business Media, 2005.
  11. Rounding semidefinite programming hierarchies via global correlation. In 2011 ieee 52nd annual symposium on foundations of computer science, pages 472–481. IEEE, 2011.
  12. The isomap algorithm and topological stability. Science, 295(5552):7–7, 2002.
  13. Multidimensional scaling. CRC press, 2000.
  14. Improving ultrametrics embeddings through coresets. In ICML, 2021.
  15. Fitting distances by tree metrics minimizing the total error within a constant factor. In 62nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2021, Denver, CO, USA, February 7-10, 2022, pages 468–479. IEEE, 2021.
  16. The spread of obesity in a large social network over 32 years. New England journal of medicine, 357(4):370–379, 2007.
  17. The collective dynamics of smoking in a large social network. New England journal of medicine, 358(21):2249–2258, 2008.
  18. Chaomei Chen. Searching for intellectual turning points: Progressive knowledge domain visualization. Proceedings of the National Academy of Sciences, 101(suppl_1):5303–5310, 2004.
  19. On efficient low distortion ultrametric embedding. In ICML, volume 119, pages 2078–2088, 2020.
  20. Correlation clustering with sherali-adams. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31 - November 3, 2022, pages 651–661. IEEE, 2022.
  21. Kedar Dhamdhere. Approximating additive distortion of embeddings into line metrics. In APPROX-RANDOM, pages 96–104, 2004.
  22. Multidimensional scaling: Approximation and complexity. In International Conference on Machine Learning, pages 2568–2578. PMLR, 2021.
  23. DIMACS. Working Group on Algorithms for Multidimensional Scaling I. http://dimacs.rutgers.edu/archive/SpecialYears/2001_Data/Algorithms/program.html, 2001. Accessed: 2023-11-08.
  24. Scheduling with communication delays via lp hierarchies and clustering. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 822–833. IEEE, 2020.
  25. Theory of multidimensional scaling. Handbook of Statistics, 2:285–316, 1982.
  26. Efficient algorithms for inverting evolution. J. ACM, 46(4):437–449, 1999. Announced at STOC 1996.
  27. A robust model for finding optimal evolutionary trees. Algorithmica, 13(1/2):155–179, 1995. Announced at STOC 1993.
  28. A tight bound on approximating arbitrary metrics by tree metrics. J. Comput. Syst. Sci., 69(3):485–497, 2004. Announced at STOC 2003.
  29. Jan W Gooch. Encyclopedic dictionary of polymers, volume 1. Springer Science & Business Media, 2010.
  30. Faster SDP hierarchy solvers for local rounding algorithms. In 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS 2012, New Brunswick, NJ, USA, October 20-23, 2012, pages 197–206. IEEE Computer Society, 2012.
  31. Mapping the structural core of human cerebral cortex. PLoS biology, 6(7):e159, 2008.
  32. Fitting points on the real line and its application to rh mapping. In Algorithms—ESA’98: 6th Annual European Symposium Venice, Italy, August 24–26, 1998 Proceedings 6, pages 465–476. Springer, 1998.
  33. Approximating the best-fit tree under $\ell_p$ norms. In APPROX-RANDOM, pages 123–133, 2005.
  34. Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology. Algorithmica, 24(1):1–13, 1999. Announced at SODA 1996.
  35. Stochastic neighbor embedding. Advances in neural information processing systems, 15, 2002.
  36. Lars Ivansson. Computational aspects of radiation hybrid mapping. PhD thesis, Numerisk analys och datalogi, 2000.
  37. Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 1226–1236, 2019.
  38. An algorithm for drawing general undirected graphs. Information processing letters, 31(1):7–15, 1989.
  39. Integrality gaps of linear and semi-definite programming relaxations for knapsack. In Integer Programming and Combinatoral Optimization: 15th International Conference, IPCO 2011, New York, NY, USA, June 15-17, 2011. Proceedings 15, pages 301–314. Springer, 2011.
  40. Joseph B Kruskal. Nonmetric multidimensional scaling: a numerical method. Psychometrika, 29(2):115–129, 1964.
  41. Multidimensional scaling. Sage, 1978.
  42. Metric structures in l1: dimension, snowflakes, and average distortion. European Journal of Combinatorics, 26(8):1180–1190, 2005.
  43. Jiří Matoušek. Bi-lipschitz embeddings into low-dimensional euclidean spaces. Commentationes Mathematicae Universitatis Carolinae, 31(3):589–600, 1990.
  44. Jiří Matoušek. On the distortion required for embedding finite metric spaces into normed spaces. Israel Journal of Mathematics, 93(1):333–344, 1996.
  45. Learning nonsingular phylogenies and hidden markov models. In STOC, pages 366–375, 2005.
  46. A birthday repetition theorem and complexity of approximating dense csps. arXiv preprint arXiv:1607.02986, 2016.
  47. Yet another algorithm for dense max cut: go greedy. In SODA, volume 8, pages 176–182, 2008.
  48. Inapproximability for metric embeddings into $\mathbb{R}^d$. In 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008, October 25-28, 2008, Philadelphia, PA, USA, pages 405–413. IEEE Computer Society, 2008.
  49. Sherali-adams relaxations of the matching polytope. In Michael Mitzenmacher, editor, Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 293–302. ACM, 2009.
  50. Circular partitions with applications to visualization and embeddings. In Proceedings of the twenty-fourth annual symposium on Computational geometry, pages 28–37, 2008.
  51. Yuri Rabinovich. On average distortion of embedding metrics into the line. Discrete & Computational Geometry, 39(4):720–733, 2008.
  52. A hierarchy of relaxations between the continuous and convex hull representations for zero-one programming problems. SIAM Journal on Discrete Mathematics, 3(3):411–430, 1990.
  53. Approximation algorithms for low-distortion embeddings into low-dimensional spaces. SIAM Journal on Discrete Mathematics, 33(1):454–473, 2019.
  54. Metric embeddings with outliers. In SODA, pages 670–689, 2017.
  55. Warren S Torgerson. Multidimensional scaling: I. theory and method. Psychometrika, 17(4):401–419, 1952.
  56. Laurens Van der Maaten and Geoffrey Hinton. Visualizing data using t-sne. Journal of machine learning research, 9(11), 2008.
  57. Ulrike Von Luxburg. A tutorial on spectral clustering. Statistics and computing, 17:395–416, 2007.
  58. Eric W Weisstein. Legendre duplication formula. From MathWorld. http://mathworld.wolfram.com/LegendreDuplicationFormula.html, 2019.
  59. Forrest W Young. Multidimensional scaling: History, theory, and applications. Psychology Press, 2013.
  60. Approximation schemes via sherali-adams hierarchy for dense constraint satisfaction problems and assignment problems. In Proceedings of the 5th conference on Innovations in theoretical computer science, pages 423–438, 2014.