When Is Inductive Inference Possible? (2312.00170v2)

Published 30 Nov 2023 in cs.LG

Abstract: Can a physicist make only a finite number of errors in the eternal quest to uncover the law of nature? This millennium-old philosophical problem, known as inductive inference, lies at the heart of epistemology. Despite its significance for understanding human reasoning, a rigorous justification of inductive inference has remained elusive. At a high level, inductive inference asks whether one can make at most finitely many errors amidst an infinite sequence of observations, when deducing the correct hypothesis from a given hypothesis class. Historically, the only theoretical guarantee has been that if the hypothesis class is countable, inductive inference is possible, as exemplified by Solomonoff induction for learning Turing machines. In this paper, we provide a tight characterization of inductive inference by establishing a novel link to online learning theory. As our main result, we prove that inductive inference is possible if and only if the hypothesis class is a countable union of online learnable classes, potentially of uncountable size, whether the observations are adaptively chosen or i.i.d. sampled. Moreover, the same condition is also necessary and sufficient in the agnostic setting, where any hypothesis class meeting this criterion enjoys an $\tilde{O}(\sqrt{T})$ regret bound for any time step $T$, while classes failing it suffer arbitrarily slow regret rates. Our main technical tool is a novel non-uniform online learning framework, which may be of independent interest.
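
To make the classical countable-class guarantee mentioned in the abstract concrete, here is a minimal sketch (not code from the paper) of the standard enumeration learner: predict with the first hypothesis consistent with all observations so far. If the true hypothesis sits at index k in the enumeration, every mistake pushes the pointer forward and the pointer never moves past k, so at most k errors occur over the infinite stream. The threshold class, the target index, and the data stream below are illustrative assumptions, not constructions from the paper.

```python
# A minimal sketch of the classical enumeration ("first consistent hypothesis")
# learner for a countable hypothesis class. Hypotheses, target, and stream
# below are illustrative choices, not taken from the paper.
from typing import Callable, Iterable, List, Tuple

Hypothesis = Callable[[int], int]  # maps an observation x to a binary label


def enumeration_learner(hypotheses: List[Hypothesis],
                        stream: Iterable[Tuple[int, int]]) -> int:
    """Predict with the first hypothesis consistent with all data seen so far.

    If the true hypothesis is hypotheses[k], each mistake advances the pointer
    and the pointer never passes index k, so mistakes are bounded by k.
    """
    history: List[Tuple[int, int]] = []
    i, mistakes = 0, 0
    for x, y in stream:
        if hypotheses[i](x) != y:   # predict with the current hypothesis
            mistakes += 1
        history.append((x, y))
        # advance to the next hypothesis consistent with everything seen
        while any(hypotheses[i](u) != v for u, v in history):
            i += 1
    return mistakes


# Illustrative countable class: threshold functions h_k(x) = 1[x >= k].
hypotheses = [(lambda k: (lambda x: int(x >= k)))(k) for k in range(100)]
target = hypotheses[7]                        # unknown to the learner
stream = [(x, target(x)) for x in range(50)]  # labels revealed one by one
print(enumeration_learner(hypotheses, stream))  # at most the target's index (7)
```

The paper's contribution goes beyond this countable case: it characterizes exactly which (possibly uncountable) hypothesis classes admit such finite-error guarantees, namely the countable unions of online learnable classes.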
