Learning Weakly Convex Sets in Metric Spaces (2105.06251v2)
Abstract: One of the central problems studied in the theory of machine learning is the question of whether, for a given class of hypotheses, it is possible to efficiently find a consistent hypothesis, i.e., one that has zero training error. While problems involving convex hypotheses have been extensively studied, the question of whether efficient learning is possible for non-convex hypotheses composed of possibly several disconnected regions is still less well understood. Although it was shown quite a while ago that efficient learning of weakly convex hypotheses, a parameterized relaxation of convex hypotheses, is possible for the special case of Boolean functions, the question of whether this idea can be developed into a generic paradigm has not yet been studied. In this paper, we provide a positive answer and show that the consistent hypothesis finding problem can indeed be solved in polynomial time for a broad class of weakly convex hypotheses over metric spaces. To this end, we propose a general domain-independent algorithm for finding consistent weakly convex hypotheses and prove sufficient conditions for its efficiency that characterize the corresponding hypothesis classes. To illustrate our general algorithm and its properties, we discuss several non-trivial learning examples and demonstrate how the algorithm can be used to efficiently solve the corresponding consistent hypothesis finding problem. Without the weak convexity constraint, these problems are known to be computationally intractable. We then show that the general idea of our algorithm can be extended to the case of extensional weakly convex hypotheses, as they naturally arise, e.g., when performing vertex classification in graphs. We prove that, using our extended algorithm, the problem can be solved in polynomial time provided the distances in the domain can be computed efficiently.
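As a rough illustration of the extensional setting mentioned in the abstract, the sketch below works over a small finite metric space. It assumes a definition the abstract itself does not state: a set A is θ-convex if, for all x, y in A with D(x, y) ≤ θ, every point z with D(x, z) + D(z, y) = D(x, y) also lies in A. The function names `weakly_convex_hull` and `find_consistent` are hypothetical, and the brute-force closure loop is only a didactic sketch, not the algorithm proposed in the paper.

```python
from itertools import combinations


def weakly_convex_hull(points, seed, dist, theta):
    """Smallest theta-convex superset of `seed` inside the finite domain `points`.

    Assumed definition (not taken from the abstract): whenever two hull
    points x, y satisfy dist(x, y) <= theta, every z lying metrically
    between them (dist(x, z) + dist(z, y) == dist(x, y)) must be in the hull.
    """
    hull = set(seed)
    changed = True
    while changed:  # repeat until no further point has to be added
        changed = False
        for x, y in combinations(list(hull), 2):
            d = dist(x, y)
            if d > theta:
                continue  # far-apart pairs impose no constraint
            for z in points:
                if z not in hull and dist(x, z) + dist(z, y) == d:
                    hull.add(z)
                    changed = True
    return hull


def find_consistent(points, positives, negatives, dist, theta):
    """Return a theta-convex set containing all positives and no negatives,
    or None if the hull of the positives already contains a negative."""
    hull = weakly_convex_hull(points, positives, dist, theta)
    return None if hull & set(negatives) else hull


# Toy domain: integers 0..10 with the absolute-difference metric.
domain = range(11)
line_dist = lambda a, b: abs(a - b)

hypothesis = find_consistent(domain, {0, 2, 7}, {4}, line_dist, theta=2)
print(sorted(hypothesis))
```

In this toy run the positives 0 and 2 are within distance θ = 2, so the point 1 between them is added, while 7 remains a separate region. The result is a hypothesis made of two disconnected pieces, which a fully convex hypothesis could not represent without also covering the negative example 4. Integer distances are used so that the exact equality test is safe; with floating-point distances a tolerance would be needed.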