Conjecture: Spiked-smooth behavior arises with overfitting in tree ensembles
Determine whether randomized ensembles of decision trees (such as random forests) exhibit spiked-smooth behavior whenever the individual trees are overfitted to the training data; that is, whether they use fewer effective parameters when predicting at previously unseen test inputs than at training inputs. Test this by measuring the gap between train-time and test-time effective parameters, both computed from the smoother-weight vectors of the ensemble.
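The measurement described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's experimental setup. It assumes the effective-parameter measure p = (n / |I|) * sum over j in I of ||s(x_j)||^2, where s(x_j) is the smoother-weight vector the ensemble places on the n training labels when predicting at x_j. It also uses totally randomized, fully grown trees (random split feature, random threshold) as a simplified stand-in for a random forest. The names grow_tree, leaf_of, smoother_weights and effective_parameters are hypothetical helpers introduced here.

```python
import numpy as np

def grow_tree(X, idx, rng):
    """Grow a totally randomized tree until each leaf holds one training point."""
    if len(idx) == 1:
        return idx  # leaf: indices of the training points in this cell
    feat = rng.integers(X.shape[1])
    vals = X[idx, feat]
    thr = rng.uniform(vals.min(), vals.max())
    mask = vals <= thr
    if mask.all():  # degenerate split (tied values): stop and keep a larger leaf
        return idx
    return (feat, thr,
            grow_tree(X, idx[mask], rng),
            grow_tree(X, idx[~mask], rng))

def leaf_of(tree, x):
    """Route a query point down the tree and return its leaf's training indices."""
    while isinstance(tree, tuple):
        feat, thr, left, right = tree
        tree = left if x[feat] <= thr else right
    return tree

def smoother_weights(trees, X_query, n_train):
    """Row j is s(x_j): the ensemble prediction at x_j equals S[j] @ y_train."""
    S = np.zeros((len(X_query), n_train))
    for tree in trees:
        for j, x in enumerate(X_query):
            leaf = leaf_of(tree, x)
            S[j, leaf] += 1.0 / len(leaf)  # each tree averages labels in the leaf
    return S / len(trees)                  # the ensemble averages over trees

def effective_parameters(S, n_train):
    """Assumed measure: n / |I| times the sum of squared weight-vector norms."""
    return n_train / S.shape[0] * np.sum(S ** 2)

# Synthetic inputs; the labels are not needed to compute the weights.
rng = np.random.default_rng(0)
n_train, n_test, d, n_trees = 200, 200, 5, 50
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))

trees = [grow_tree(X_train, np.arange(n_train), rng) for _ in range(n_trees)]
S_train = smoother_weights(trees, X_train, n_train)
S_test = smoother_weights(trees, X_test, n_train)

p_train = effective_parameters(S_train, n_train)
p_test = effective_parameters(S_test, n_train)
print(f"train-time effective parameters: {p_train:.1f}")
print(f"test-time effective parameters:  {p_test:.1f}")
print(f"gap (train minus test):          {p_train - p_test:.1f}")
```

Because every tree here is grown until each training point sits alone in its leaf, the weight vector at a training input is a spike on that point in every tree. At an unseen input, different trees typically route to different training neighbors, so the averaged weights spread out and the squared norm shrinks. Checking the conjecture itself would require repeating this comparison with real random forests across varying degrees of overfitting, for example by varying the leaf size or tree depth.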
References
We thus conjecture that spiked-smooth behavior appears whenever there is some degree of overfitting to the training data.
— Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers
(2402.01502 - Curth et al., 2 Feb 2024) in Section 3.1.2 (Spiked-smooth behavior is not unique to interpolating models)