
Divide, Conquer, Combine Bayesian Decision Tree Sampling (2403.18147v1)

Published 26 Mar 2024 in cs.LG

Abstract: Decision trees are commonly used predictive models due to their flexibility and interpretability. This paper is directed at quantifying the uncertainty of decision tree predictions by employing a Bayesian inference approach. This is challenging because these approaches need to explore both the tree structure space and the space of decision parameters associated with each tree structure. This has been handled by using Markov Chain Monte Carlo (MCMC) methods, where a Markov Chain is constructed to provide samples from the desired Bayesian estimate. Importantly, the structure and the decision parameters are tightly coupled; small changes in the tree structure can demand vastly different decision parameters to provide accurate predictions. A challenge for existing MCMC approaches is proposing joint changes in both the tree structure and the decision parameters that result in efficient sampling. This paper takes a different approach, where each distinct tree structure is associated with a unique set of decision parameters. The proposed approach, entitled DCC-Tree, is inspired by the work in Zhou et al. [23] for probabilistic programs and Cochrane et al. [4] for Hamiltonian Monte Carlo (HMC) based sampling for decision trees. Results show that DCC-Tree performs comparably to other HMC-based methods and better than existing Bayesian tree methods while improving on consistency and reducing the per-proposal complexity.
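
To make the divide-conquer-combine pattern described in the abstract concrete, the following Python is a minimal sketch, not the authors' implementation. The helpers log_post_for and init_for are hypothetical interfaces, the random-walk local sampler is a stand-in for the HMC/NUTS samplers the paper builds on [10, 15], and the evidence estimate is a deliberately crude harmonic-mean-style placeholder (see [13] for proper marginal-likelihood estimators).

import numpy as np

def logsumexp(a):
    """Numerically stable log(sum(exp(a)))."""
    a = np.asarray(a, dtype=float)
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def local_sampler(log_post, init, n_samples=500, step=0.1, seed=0):
    """Placeholder local sampler for ONE fixed tree structure:
    random-walk Metropolis over that structure's decision
    parameters, standing in for the HMC sampler in the paper."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(init, dtype=float)
    samples = []
    for _ in range(n_samples):
        proposal = theta + step * rng.standard_normal(theta.shape)
        # Metropolis accept/reject on the unnormalised log posterior
        if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
            theta = proposal
        samples.append(theta.copy())
    return np.array(samples)

def dcc_tree_sketch(structures, log_post_for, init_for):
    """Divide: enumerate candidate tree structures (hashable IDs).
    Conquer: sample each structure's own decision parameters locally.
    Combine: weight structures by an estimated marginal likelihood."""
    samples, log_weights = {}, {}
    for T in structures:
        log_post = log_post_for(T)            # unnormalised log p(theta, y | T)
        draws = local_sampler(log_post, init_for(T))
        samples[T] = draws
        # Crude placeholder evidence estimate from the local draws;
        # the paper relies on a more reliable estimator.
        lps = np.array([log_post(d) for d in draws])
        log_weights[T] = -(logsumexp(-lps) - np.log(len(lps)))
    shift = max(log_weights.values())
    w = {T: np.exp(v - shift) for T, v in log_weights.items()}
    total = sum(w.values())
    return samples, {T: v / total for T, v in w.items()}

Note how each tree structure T keeps its own set of parameter samples, matching the paper's premise that structure and decision parameters are tightly coupled; the subproblems interact only through the combination weights.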

References (23)
  1. M. Betancourt. A Conceptual Introduction to Hamiltonian Monte Carlo. arXiv preprint arXiv:1701.02434, 2017.
  2. L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth Int. Group, 1984.
  3. H. A. Chipman, E. I. George, and R. E. McCulloch. Bayesian CART Model Search. Journal of the American Statistical Association, 93(443):935–948, 1998.
  4. J. A. Cochrane, A. G. Wills, and S. J. Johnson. RJHMC-Tree for Exploration of the Bayesian Decision Tree Posterior. arXiv preprint arXiv:2312.01577, 2023.
  5. D. G. T. Denison, B. K. Mallick, and A. F. M. Smith. A Bayesian CART algorithm. Biometrika, 85(2):363–377, 1998.
  6. D. Dua and C. Graff. UCI Machine Learning Repository, 2017. URL http://archive.ics.uci.edu/ml.
  7. Z. Ghahramani. Probabilistic machine learning and artificial intelligence. Nature, 521(7553):452–459, 2015.
  8. R. B. Gramacy and H. K. H. Lee. Bayesian Treed Gaussian Process Models with an Application to Computer Modeling. Journal of the American Statistical Association, 103(483):1119–1130, 2008.
  9. W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109, 1970.
  10. M. D. Hoffman and A. Gelman. The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1):1593–1623, 2014.
  11. B. Lakshminarayanan, D. M. Roy, and Y. W. Teh. Top-down particle filtering for Bayesian decision trees. In International Conference on Machine Learning, pages 280–288. PMLR, 2013.
  12. A. R. Linero and Y. Yang. Bayesian regression tree ensembles that adapt to smoothness and sparsity. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80(5):1087–1110, 2018.
  13. F. Llorente, L. Martino, D. Delgado, and J. Lopez-Santiago. Marginal likelihood computation for model selection and hypothesis testing: An extensive review. SIAM Review, 65(1):3–58, 2023.
  14. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics, 21(6):1087–1092, 1953.
  15. R. M. Neal. MCMC using Hamiltonian dynamics. Handbook of Markov Chain Monte Carlo, 2(11), 2011.
  16. M. T. Pratola. Efficient Metropolis–Hastings Proposal Mechanisms for Bayesian Regression Tree Models. Bayesian Analysis, 11(3):885–911, 2016.
  17. J. R. Quinlan. Induction of decision trees. Machine Learning, 1:81–106, 1986.
  18. J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
  19. T. Rainforth, Y. Zhou, X. Lu, Y. W. Teh, F. Wood, H. Yang, and J.-W. van de Meent. Inference trees: Adaptive inference with exploration. arXiv preprint arXiv:1806.09550, 2018.
  20. Stan Development Team. Stan Modeling Language Users Guide and Reference Manual Version 2.29. https://mc-stan.org, 2019.
  21. M. A. Taddy, R. B. Gramacy, and N. G. Polson. Dynamic trees for learning and design. Journal of the American Statistical Association, 106(493):109–123, 2011.
  22. Y. Wu, H. Tjelmeland, and M. West. Bayesian CART: Prior Specification and Posterior Simulation. Journal of Computational and Graphical Statistics, 16(1):44–66, 2007.
  23. Y. Zhou, H. Yang, Y. W. Teh, and T. Rainforth. Divide, conquer, and combine: a new inference strategy for probabilistic programs with stochastic support. In International Conference on Machine Learning, pages 11534–11545. PMLR, 2020.
