
Bias reduction for non-asymptotic inference on the true regression function using random forests

Develop bias-reduction methodology for the k-PNN random-forest estimator that enables construction of non-asymptotically valid confidence intervals for the true regression function value r0(x0), including suitable data-driven bias correction and variance estimation that align with the established Gaussian approximations.


Background

The Gaussian approximation results in the paper imply non-asymptotic confidence intervals for the expected random-forest prediction, but not directly for the true regression function r0(x0) because the bias is non-negligible relative to the standard deviation.
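As a minimal sketch of the obstacle (with notation introduced here for illustration, not taken from the paper: \hat r_n(x_0) for the forest estimate, \sigma_n(x_0) for its standard deviation, and B_n(x_0) for its bias), the estimation error decomposes as

\hat r_n(x_0) - r_0(x_0) = \underbrace{\hat r_n(x_0) - \mathbb{E}[\hat r_n(x_0)]}_{\text{Gaussian-approximable fluctuation}} + \underbrace{\mathbb{E}[\hat r_n(x_0)] - r_0(x_0)}_{\text{bias } B_n(x_0)},

so an interval of the form \hat r_n(x_0) \pm z_{1-\alpha/2}\,\sigma_n(x_0) covers \mathbb{E}[\hat r_n(x_0)] at roughly the nominal level, but covers r_0(x_0) only when |B_n(x_0)| is small relative to \sigma_n(x_0).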

The authors provide bias bounds that demonstrate this non-negligibility and note that effective bias reduction is crucial for obtaining valid confidence intervals for r0(x0). While prior work has addressed bias in different random-forest variants (e.g., subsampling/bagging), a non-asymptotic bias-reduction approach compatible with their framework is still lacking.
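A small simulation can make this non-negligibility concrete. The sketch below is purely illustrative and not the paper's method: it assumes sklearn's RandomForestRegressor as a stand-in for the k-PNN forest, uses an oracle Monte Carlo standard deviation, and compares coverage of the expected prediction with coverage of r0(x0) at a query point near the boundary of the covariate space, where smoothing bias is typically pronounced.

```python
# Illustrative simulation only (not the paper's method): it probes how large the
# bias of a forest prediction at x0 is relative to its standard deviation, and
# how that gap affects coverage of the true value r0(x0).
# sklearn's RandomForestRegressor is an assumed stand-in for the k-PNN forest.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def r0(x):
    return 2.0 * x[:, 0]            # hypothetical true regression function

x0 = np.array([[0.02]])             # query point near the boundary of [0, 1]
true_value = r0(x0)[0]
n, n_rep, z = 300, 200, 1.96        # sample size, replications, 95% quantile

preds = np.empty(n_rep)
for i in range(n_rep):
    X = rng.uniform(0.0, 1.0, size=(n, 1))
    y = r0(X) + rng.normal(scale=0.3, size=n)
    forest = RandomForestRegressor(n_estimators=100, min_samples_leaf=25,
                                   random_state=0)
    preds[i] = forest.fit(X, y).predict(x0)[0]

mean_pred = preds.mean()            # Monte Carlo proxy for E[prediction at x0]
sd = preds.std(ddof=1)              # oracle standard deviation of the prediction
bias = mean_pred - true_value

# Intervals centered at the prediction with the oracle sd target the expected
# prediction; coverage of r0(x0) degrades as |bias| / sd grows.
cover_mean = np.mean(np.abs(preds - mean_pred) <= z * sd)
cover_true = np.mean(np.abs(preds - true_value) <= z * sd)

print(f"bias / sd ratio:           {bias / sd:+.2f}")
print(f"coverage of E[prediction]: {cover_mean:.2f}")
print(f"coverage of r0(x0):        {cover_true:.2f}  (nominal 0.95)")
```

Any actual methodology would additionally need a data-driven bias estimate and a variance estimator consistent with the paper's non-asymptotic Gaussian approximation, which is precisely what the open problem asks for.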

References

"While some preliminary work is undertaken by with bootstrap in the context of bagging random forests, it remains an open problem how to reduce the bias to get non-asymptotically valid confidence intervals."

Multivariate Gaussian Approximation for Random Forest via Region-based Stabilization (2403.09960 - Shi et al., 15 Mar 2024) in Section 3.2 (Towards statistical inference)