Confidence intervals for the random forest generalization error (2112.06101v3)

Published 11 Dec 2021 in stat.ML and cs.LG

Abstract: We show that the byproducts of the standard training process of a random forest yield not only the well-known and almost computationally free out-of-bag point estimate of the model's generalization error, but also a direct path to computing confidence intervals for that error, one that avoids data splitting and model retraining. Besides being cheap to construct, these confidence intervals are shown through simulations to have good coverage and a width that shrinks at an appropriate rate as the training sample size grows.
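
To make the idea concrete, here is a minimal sketch of how an interval could be built from out-of-bag byproducts alone. It is not the construction used in the paper, which the abstract does not spell out. It assumes a vector of per-observation out-of-bag 0/1 losses is already available (with scikit-learn, such losses could be derived from the `oob_decision_function_` attribute of a classifier fit with `oob_score=True`) and applies a plain normal approximation. The function name `oob_error_interval` is hypothetical. Because out-of-bag losses share trees and training data, they are not independent, so this naive interval may be too narrow.

```python
import math
from statistics import NormalDist


def oob_error_interval(oob_losses, confidence=0.95):
    """Naive normal-approximation interval for the generalization error.

    oob_losses: per-observation out-of-bag 0/1 losses (assumed precomputed).
    Returns (point_estimate, lower, upper).
    Illustrative only: treats the losses as independent, which is an assumption.
    """
    n = len(oob_losses)
    if n < 2:
        raise ValueError("need at least two out-of-bag losses")

    # Out-of-bag point estimate: the average per-observation loss.
    estimate = sum(oob_losses) / n

    # Unbiased sample variance of the per-observation losses.
    variance = sum((x - estimate) ** 2 for x in oob_losses) / (n - 1)

    # Two-sided standard normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(variance / n)

    # Clip to [0, 1], which is valid because the losses are 0/1.
    return estimate, max(0.0, estimate - half_width), min(1.0, estimate + half_width)
```

Under this approximation the half-width scales like 1/sqrt(n), so quadrupling the number of training observations roughly halves the interval width. No retraining or held-out split is needed, since the only input is the loss vector the forest already produces.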
