Quantify the magnitude of knowledge-dependent overfitting on ARC-AGI
Quantify how much knowledge-dependent benchmark overfitting contributes to model performance on ARC-AGI-1 and ARC-AGI-2, so that genuine generalization can be separated from the effects of pretraining exposure and related data leakage.
References
Although we assess that this new form of "overfitting" assists models in solving ARC, we cannot precisely quantify the magnitude of this effect.
— ARC Prize 2025: Technical Report
(2601.10904 - Chollet et al., 15 Jan 2026) in Section: Characterizing AGI through continual benchmark adaptation