Dual VC Dimension Obstructs Sample Compression by Embeddings (2405.17120v1)
Abstract: This work studies embedding of arbitrary VC classes in well-behaved VC classes, focusing particularly on extremal classes. Our main result expresses an impossibility: such embeddings necessarily require a significant increase in dimension. In particular, we prove that for every $d$ there is a class with VC dimension $d$ that cannot be embedded in any extremal class of VC dimension smaller than exponential in $d$. In addition to its independent interest, this result has an important implication in learning theory, as it reveals a fundamental limitation of one of the most extensively studied approaches to tackling the long-standing sample compression conjecture. Concretely, the approach proposed by Floyd and Warmuth entails embedding any given VC class into an extremal class of a comparable dimension, and then applying an optimal sample compression scheme for extremal classes. However, our results imply that this strategy would in some cases result in a sample compression scheme at least exponentially larger than what is predicted by the sample compression conjecture. The above implications follow from a general result we prove: any extremal class with VC dimension $d$ has dual VC dimension at most $2d+1$. This bound is exponentially smaller than the classical bound $2^{d+1}-1$ of Assouad, which applies to general concept classes (and is known to be unimprovable for some classes). We in fact prove a stronger result, establishing that $2d+1$ upper bounds the dual Radon number of extremal classes. This theorem represents an abstraction of the classical Radon theorem for convex sets, extending its applicability to a wider combinatorial framework, without relying on the specifics of Euclidean convexity. The proof utilizes the topological method and is primarily based on variants of the Topological Radon Theorem.
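The quantities compared in the abstract (VC dimension, dual VC dimension, and extremality) can be checked by brute force on tiny classes. The following is a minimal sketch, not the paper's method: it assumes a concept class is given as a list of Python sets over a finite domain, and it assumes the standard definition that a class is extremal when the number of sets it shatters equals the number of its concepts. The helper names (`shattered_sets`, `vc_dim`, `dual_class`, `is_extremal`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def shattered_sets(concepts, domain):
    """Return every subset S of `domain` shattered by `concepts` (brute force)."""
    concepts = [frozenset(c) for c in concepts]
    shattered = []
    for k in range(len(domain) + 1):
        for subset in combinations(domain, k):
            s = frozenset(subset)
            # S is shattered when the traces c & S realize all 2^|S| patterns.
            traces = {c & s for c in concepts}
            if len(traces) == 2 ** k:
                shattered.append(s)
    return shattered


def vc_dim(concepts, domain):
    """Size of the largest shattered set (-1 for an empty class)."""
    return max((len(s) for s in shattered_sets(concepts, domain)), default=-1)


def dual_class(concepts, domain):
    """Dual class: its points are (indices of) concepts, and each original
    point x becomes the dual concept {i : x belongs to concept i}."""
    concepts = [frozenset(c) for c in concepts]
    dual_domain = list(range(len(concepts)))
    dual_concepts = [
        frozenset(i for i, c in enumerate(concepts) if x in c) for x in domain
    ]
    return dual_concepts, dual_domain


def is_extremal(concepts, domain):
    """Assumed definition: the class shatters exactly as many sets as it has concepts."""
    distinct = {frozenset(c) for c in concepts}
    return len(shattered_sets(distinct, domain)) == len(distinct)


# Illustrative class: all subsets of a 3-point domain (the full cube).
domain = [0, 1, 2]
cube = [set(c) for k in range(4) for c in combinations(domain, k)]

d = vc_dim(cube, domain)
dual_concepts, dual_domain = dual_class(cube, domain)
d_dual = vc_dim(dual_concepts, dual_domain)

# A class that is not extremal: it shatters three sets but has two concepts.
non_extremal = [set(), {0, 1}]
```

On the cube, `d` and `d_dual` can be compared against both bounds from the abstract, $2d+1$ for extremal classes and $2^{d+1}-1$ in general. The enumeration is exponential in the domain size, so this is only usable for toy examples.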
- Point selections and weak $\varepsilon$-nets for convex hulls. Combinatorics, Probability and Computing, 1(03):189–200, 1992.
- Sign rank versus Vapnik-Chervonenkis dimension. Sbornik: Mathematics, 208(12):1724, 2017.
- Patrick Assouad. Densité et dimension. Annales de l’Institut Fourier (Grenoble), 33(3):233–282, 1983.
- On a common generalization of Borsuk’s and Radon’s theorem. Acta Mathematica Academiae Scientiarum Hungarica, 34(3):347–350, 1979. doi: 10.1007/BF01896131. URL https://doi.org/10.1007/BF01896131.
- On the number of halving planes. Combinatorica, 10(2):175–183, 1990.
- Combinatorial Variability of Vapnik-Chervonenkis Classes with Applications to Sample Compression Schemes. Discrete Applied Mathematics, 86(1):3–25, 1998. doi: 10.1016/S0166-218X(98)00000-6. URL http://dx.doi.org/10.1016/S0166-218X(98)00000-6.
- Defect Sauer results. Journal of Combinatorial Theory, Series A, 72(2):189–208, 1995.
- Reverse Kleitman Inequalities. Proceedings of the London Mathematical Society, s3-58(1):153–168, 01 1989. ISSN 0024-6115. doi: 10.1112/plms/s3-58.1.153. URL https://doi.org/10.1112/plms/s3-58.1.153.
- Proper learning, Helly number, and an optimal SVM bound. In Proceedings of the 33rd Conference on Learning Theory, 2020.
- Unlabeled sample compression schemes and corner peelings for ample and maximum classes. J. Comput. Syst. Sci., 127:1–28, 2022. doi: 10.1016/J.JCSS.2022.01.003. URL https://doi.org/10.1016/j.jcss.2022.01.003.
- Sample compression schemes for balls in graphs. SIAM J. Discret. Math., 37(4):2585–2616, 2023. doi: 10.1137/22M1527817. URL https://doi.org/10.1137/22m1527817.
- Two-dimensional partial cubes. The Electronic Journal of Combinatorics, pages 3–29, 2020.
- Labeled sample compression schemes for complexes of oriented matroids. CoRR, abs/2110.15168, 2021. URL https://arxiv.org/abs/2110.15168.
- Ample completions of oriented matroids and complexes of uniform oriented matroids. SIAM Journal of Discrete Mathematics, 36(1):509–535, 2022.
- Bogdan Chornomaz. What convex geometries tell about shattering-extremal systems. The Electronic Journal of Combinatorics, pages P3–40, 2022.
- Helly’s Theorem and Its Relatives. Proceedings of symposia in pure mathematics: Convexity. American Mathematical Society, 1963. URL https://books.google.com/books?id=I1l5HAAACAAJ.
- Andreas W. M. Dress. Towards a theory of holistic clustering. In Boris G. Mirkin, Fred R. McMorris, Fred S. Roberts, and Andrey Rzhetsky, editors, Mathematical Hierarchies and Biology, Proceedings of a DIMACS Workshop, November 13-15, 1996, volume 37 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 271–290. DIMACS/AMS, 1996. doi: 10.1090/DIMACS/037/19. URL https://doi.org/10.1090/dimacs/037/19.
- Sample compression, learnability, and the Vapnik-Chervonenkis dimension. Machine Learning, 21(3):269–304, 1995.
- Vapnik-Chervonenkis dimension and (pseudo-) hyperplane arrangements. Discrete Comput. Geom., 12(4):399–432, 1994. doi: 10.1007/BF02574389.
- $G$-coincidences for maps of homotopy spheres into CW-complexes. Proceedings of the American Mathematical Society, 130(10):3111–3115, 2002.
- Craig R. Guilbault. An elementary deduction of the Topological Radon Theorem from Borsuk–Ulam. Discrete & Computational Geometry, 43:951–954, 2010.
- Eduard Helly. Über mengen konvexer körper mit gemeinschaftlichen punkte. Jahresbericht der Deutschen Mathematiker-Vereinigung, 32:175–176, 1923. URL http://eudml.org/doc/145659.
- Radon numbers and the fractional Helly theorem. Israel Journal of Mathematics, 241(1):433–447, 2021.
- Antipodal coincidence for maps of spheres into complexes. Preprint (lecture at the Conference on Topological Fixed Point Theory and its Applications in Torun), 1993.
- Axiomatic convexity theory and relationships between the Carathéodory, Helly, and Radon numbers. Pacific Journal of Mathematics, 38(2):471–485, 1971.
- Unlabeled compression schemes for maximum classes. Journal of Machine Learning Research, 8:2047–2081, 2007. URL http://dl.acm.org/citation.cfm?id=1314566.
- James F. Lawrence. Lopsided sets and orthant-intersection of convex sets. Pacific J. Math., 104:155–173, 1983.
- Subdivisions and triangulations of polytopes. In Handbook of discrete and computational geometry, pages 415–447. Chapman and Hall/CRC, 2017.
- Friedrich W. Levi. On Helly’s theorem and the axioms of convexity. J. Indian Math. Soc, 15:65–76, 1951.
- Relating data compression and learnability. Unpublished manuscript, 1986.
- Using the Borsuk-Ulam theorem: lectures on topological methods in combinatorics and geometry, volume 2003. Springer, 2003.
- Shay Moran. Shattering-extremal systems. arXiv preprint, 1211.2980, 2012.
- Labeled compression schemes for extremal classes. In Proceedings of the 27th International Conference on Algorithmic Learning Theory, pages 34–49. Springer, 2016.
- Sample compression schemes for VC classes. Journal of the ACM, 63(3):1–10, 2016.
- On weak $\epsilon$-nets and the Radon number. Discret. Comput. Geom., 64(4):1125–1140, 2020. doi: 10.1007/S00454-020-00222-Y. URL https://doi.org/10.1007/s00454-020-00222-y.
- Alain Pajor. Sous-espaces $\ell_{1}^{n}$ des Espaces de Banach. Travaux en Cours. Hermann, Paris, 1985.
- Johann Radon. Mengen konvexer körper, die einen gemeinsamen punkt enthalten. Mathematische Annalen, 83(1):113–115, Mar 1921. ISSN 1432-1807. doi: 10.1007/BF01464231. URL https://doi.org/10.1007/BF01464231.
- Benjamin I. P. Rubinstein and J. Hyam Rubinstein. A geometric approach to sample compression. Journal of Machine Learning Research, 13:1221–1261, 2012. URL http://dl.acm.org/citation.cfm?id=2343686.
- Shifting: One-inclusion mistake bounds and sample compression. J. Comput. Syst. Sci., 75(1):37–59, 2009. doi: 10.1016/J.JCSS.2008.07.005. URL https://doi.org/10.1016/j.jcss.2008.07.005.
- Bounding embeddings of VC classes into maximum classes. Measures of Complexity: Festschrift for Alexey Chervonenkis, pages 303–325, 2015.
- Joachim Hyam Rubinstein and Benjamin I. P. Rubinstein. Unlabelled sample compression schemes for intersection-closed classes and extremal classes. In NeurIPS, 2022. URL http://papers.nips.cc/paper_files/paper/2022/hash/54d6a55225cebbdc16fbb0e45c5bdf2b-Abstract-Conference.html.
- Norbert Sauer. On the density of families of sets. Journal of Combinatorial Theory (A), 13(1):145–147, 1972.
- Understanding Machine Learning - From Theory to Algorithms. Cambridge University Press, 2014. ISBN 978-1-10-705713-5.
- Saharon Shelah. A combinatorial problem, stability and order for models and theories in infinitary languages. Pacific J. Math., 41(1):247–261, 1972. doi: 10.2140/pjm.1972.41.247.
- M.L.J. van de Vel. Theory of Convex Structures, volume 50 of North-Holland mathematical library. North-Holland, 1993. ISBN 9780444815057. URL https://books.google.com/books?id=xt9-lAEACAAJ.
- On the uniform convergence of relative frequencies of events to their probabilities. Proc. USSR Acad. Sci., 181(4):781–783, 1968.
- Some generalizations of the Borsuk-Ulam theorem. Publ. Math. Debrecen, 78:583–593, 2011.
- Manfred K. Warmuth. Compressing to VC dimension many points. In COLT/Kernel, pages 743–744, 2003. doi: 10.1007/978-3-540-45167-9_60. URL http://dx.doi.org/10.1007/978-3-540-45167-9_60.
- Avi Wigderson. A Theory Revolutionizing Technology and Science. Princeton University Press, Princeton, 2019. ISBN 9780691192543. doi: 10.1515/9780691192543. URL https://doi.org/10.1515/9780691192543.