Online Learning of Decision Trees with Thompson Sampling (2404.06403v1)

Published 9 Apr 2024 in cs.LG

Abstract: Decision Trees are prominent prediction models for interpretable Machine Learning. They have been thoroughly researched, mostly in the batch setting with a fixed labelled dataset, leading to popular algorithms such as C4.5, ID3 and CART. Unfortunately, these methods are heuristic in nature: they rely on greedy splits that offer no guarantee of global optimality and often lead to unnecessarily complex, hard-to-interpret Decision Trees. Recent breakthroughs addressed this suboptimality issue in the batch setting, but no such work has considered the online setting with data arriving in a stream. To this end, we devise a new Monte Carlo Tree Search algorithm, Thompson Sampling Decision Trees (TSDT), able to produce optimal Decision Trees in an online setting. We analyse our algorithm and prove its almost sure convergence to the optimal tree. Furthermore, we conduct extensive experiments to validate our findings empirically. The proposed TSDT outperforms existing algorithms on several benchmarks, all while presenting the practical advantage of being tailored to the online setting.
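The core primitive the abstract names, Thompson Sampling, can be illustrated independently of the paper's tree-search machinery. The sketch below is a minimal Beta-Bernoulli Thompson Sampling loop over a handful of arms, not the authors' TSDT algorithm: the arm probabilities, pull count, and function name are all illustrative assumptions. In TSDT, an analogous posterior-sampling step would guide which tree expansion to explore next.

```python
import random

def thompson_sampling(arms, pulls=2000, seed=0):
    """Beta-Bernoulli Thompson Sampling (illustrative, not TSDT itself).

    `arms` is a list of true success probabilities. The sampler keeps a
    Beta(successes + 1, failures + 1) posterior per arm and, each round,
    plays the arm whose posterior draw is highest.
    """
    rng = random.Random(seed)
    successes = [0] * len(arms)
    failures = [0] * len(arms)
    for _ in range(pulls):
        # Draw one sample from each arm's Beta posterior.
        draws = [rng.betavariate(s + 1, f + 1)
                 for s, f in zip(successes, failures)]
        arm = draws.index(max(draws))  # play the most promising arm
        # Observe a Bernoulli reward from the chosen arm.
        if rng.random() < arms[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures
```

Because exploration is driven by posterior uncertainty rather than a fixed schedule, play concentrates on the best arm as its posterior sharpens, which is the same exploration-exploitation mechanism the paper embeds inside Monte Carlo Tree Search.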


