
Stability via resampling: statistical problems beyond the real line (2405.09511v2)

Published 15 May 2024 in math.ST and stat.TH

Abstract: Model averaging techniques based on resampling methods (such as bootstrapping or subsampling) have been used across many areas of statistics, often with the explicit goal of promoting stability in the resulting output. We provide a general, finite-sample theoretical result guaranteeing the stability of bagging when applied to algorithms that return outputs in a general space, so that the output is not necessarily real-valued (for example, an algorithm that estimates a vector of weights or a density function). We empirically assess the stability of bagging on synthetic and real-world data for a range of problem settings, including causal inference, nonparametric regression, and Bayesian model selection.
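
As a rough illustration of the setting the abstract describes, the sketch below bags a base algorithm whose output is a vector of weights and then measures how much the bagged output moves when one data point is removed. Everything here is an assumption made for illustration: the base algorithm (ordinary least squares), the function names `base_algorithm`, `bagged_output`, and `leave_one_out_change`, the bootstrap scheme, the Euclidean norm, and the synthetic data are not taken from the paper, and the paper's formal stability definition may differ.

```python
import numpy as np


def base_algorithm(X, y):
    """Hypothetical base algorithm: least-squares weights (a vector-valued output)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def bagged_output(X, y, n_bags=100, seed=0):
    """Average the base algorithm's output over bootstrap resamples of the data."""
    rng = np.random.default_rng(seed)
    n = len(y)
    outputs = []
    for _ in range(n_bags):
        idx = rng.integers(0, n, size=n)  # draw n indices with replacement
        outputs.append(base_algorithm(X[idx], y[idx]))
    return np.mean(outputs, axis=0)


def leave_one_out_change(X, y, i, **kwargs):
    """Euclidean distance between bagged outputs with and without data point i."""
    keep = np.arange(len(y)) != i
    full = bagged_output(X, y, **kwargs)
    reduced = bagged_output(X[keep], y[keep], **kwargs)
    return float(np.linalg.norm(full - reduced))


# Synthetic data, chosen only to exercise the functions above.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=40)

w_bag = bagged_output(X, y)
change = leave_one_out_change(X, y, i=0)
```

Averaging with `np.mean` assumes the outputs live in a space where averaging makes sense, such as weight vectors or density estimates on a fixed grid. A small `change` across all choices of `i` is the informal sense of stability this sketch probes.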
