Four Facets of Forecast Felicity: Calibration, Predictiveness, Randomness and Regret (2401.14483v2)

Published 25 Jan 2024 in cs.LG and stat.ML

Abstract: Machine learning is about forecasting. Forecasts, however, obtain their usefulness only through their evaluation. Machine learning has traditionally focused on types of losses and their corresponding regret. Recently, the machine learning community has regained interest in calibration. In this work, we show the conceptual equivalence of calibration and regret in evaluating forecasts. We frame the evaluation problem as a game between a forecaster, a gambler and nature. Putting intuitive restrictions on the gambler and the forecaster, calibration and regret naturally fall out of the framework. In addition, this game links the evaluation of forecasts to the randomness of outcomes. Random outcomes with respect to forecasts are equivalent to good forecasts with respect to outcomes. We call these dual aspects (calibration and regret, predictiveness and randomness) the four facets of forecast felicity.
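
To make the two evaluation notions named in the abstract concrete, the following is a minimal sketch that scores one sequence of binary forecasts in both ways. It is not the paper's forecaster-gambler-nature game. It uses the textbook definitions instead: a per-forecast calibration gap (mean outcome minus forecast value) and regret under the Brier loss against the best constant forecast in hindsight. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def brier_loss(p, y):
    # Squared error between a probability forecast p and a binary outcome y.
    return (p - y) ** 2


def calibration_gaps(forecasts, outcomes):
    # Group outcomes by the forecast value that preceded them and report,
    # for each forecast value, (mean observed outcome - forecast value).
    # A gap of 0.0 for every group means the forecasts are calibrated
    # on this sample.
    groups = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        groups[p].append(y)
    return {p: sum(ys) / len(ys) - p for p, ys in groups.items()}


def regret_vs_constant(forecasts, outcomes):
    # Cumulative Brier loss of the forecasts minus the cumulative Brier loss
    # of the best constant forecast in hindsight. For the Brier loss that
    # constant is the empirical base rate of the outcomes.
    total = sum(brier_loss(p, y) for p, y in zip(forecasts, outcomes))
    base_rate = sum(outcomes) / len(outcomes)
    best_constant = sum(brier_loss(base_rate, y) for y in outcomes)
    return total - best_constant


# Toy data (assumed for illustration): two forecast values, four rounds each.
forecasts = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
outcomes = [0, 0, 0, 1, 1, 1, 1, 0]

gaps = calibration_gaps(forecasts, outcomes)
regret = regret_vs_constant(forecasts, outcomes)
print(gaps)    # calibration gap per forecast value
print(regret)  # negative means the forecasts beat the best constant
```

On this toy sequence both calibration gaps are zero and the regret against the constant forecast is negative, so the forecasts look good under either criterion. The comparison class (constant forecasts) and the grouping by exact forecast value are simplifying choices of this sketch, not restrictions taken from the paper.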
