Transductive Learning Is Compact (2402.10360v3)
Abstract: We demonstrate a compactness result holding broadly across supervised learning with a general class of loss functions: Any hypothesis class $H$ is learnable with transductive sample complexity $m$ precisely when all of its finite projections are learnable with sample complexity $m$. We prove that this exact form of compactness holds for realizable and agnostic learning with respect to any proper metric loss function (e.g., any norm on $\mathbb{R}^d$) and any continuous loss on a compact space (e.g., cross-entropy, squared loss). For realizable learning with improper metric losses, we show that exact compactness of sample complexity can fail, and provide matching upper and lower bounds of a factor of 2 on the extent to which such sample complexities can differ. We conjecture that larger gaps are possible for the agnostic case. Furthermore, invoking the equivalence between sample complexities in the PAC and transductive models (up to lower-order factors, in the realizable case) permits us to directly port our results to the PAC model, revealing an almost-exact form of compactness holding broadly in PAC learning.
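As a minimal sketch of the compactness claim, writing $m_{\mathrm{TD}}(H)$ for the transductive sample complexity of $H$ and $H|_S$ for its projection onto a finite subset $S$ of the domain $X$ (notation introduced here for illustration, not taken verbatim from the paper), the statement reads:
\[
  m_{\mathrm{TD}}(H) \;\le\; m
  \quad\Longleftrightarrow\quad
  m_{\mathrm{TD}}\bigl(H|_{S}\bigr) \;\le\; m \ \text{ for every finite } S \subseteq X .
\]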