Limits of capability gains from random guessing and ensembling
Determine how far RandOpt can improve downstream-task performance beyond the pretrained base model. RandOpt post-trains by sampling random Gaussian weight perturbations around the pretrained weights, selecting the top-performing perturbations by validation score, and ensembling their predictions via majority vote. Characterize whether the resulting gains saturate as model size and perturbation population size increase.
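The three steps named in the problem statement (perturb, select, vote) can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the paper's implementation: a linear classifier stands in for the pretrained model, and the names `predict`, `randopt`, `majority_vote`, `population`, `top_k`, and `sigma` are hypothetical choices made here for illustration.

```python
import numpy as np


def predict(weights, x):
    """Toy stand-in for a model: class = argmax of the linear scores x @ weights."""
    return np.argmax(x @ weights, axis=1)


def randopt(base_weights, x_val, y_val, population=64, top_k=5, sigma=0.1, seed=0):
    """Sample Gaussian perturbations around base_weights and keep the top_k
    by validation accuracy (hypothetical hyperparameters, chosen for the toy)."""
    rng = np.random.default_rng(seed)
    candidates = [
        base_weights + sigma * rng.standard_normal(base_weights.shape)
        for _ in range(population)
    ]
    scores = [np.mean(predict(w, x_val) == y_val) for w in candidates]
    best = np.argsort(scores)[::-1][:top_k]  # indices sorted by descending score
    return [candidates[i] for i in best]


def majority_vote(members, x, num_classes):
    """Ensemble by per-example majority vote; ties go to the lowest class index."""
    counts = np.zeros((x.shape[0], num_classes), dtype=int)
    rows = np.arange(x.shape[0])
    for w in members:
        counts[rows, predict(w, x)] += 1  # one vote per member per example
    return np.argmax(counts, axis=1)
```

In this sketch, the open question corresponds to how the accuracy of `majority_vote(randopt(...), x_test, num_classes)` compares with `predict(base_weights, x_test)` as `population` and the size of the model grow, and whether that gap levels off.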
References
Our results leave open the question of exactly how far beyond the base model's abilities random guessing and ensembling can take us.
— Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights
(2603.12228 - Gan et al., 12 Mar 2026) in Limitations, paragraph "Capacity to Learn Dramatically New Skills?"