Verify whether the supplementary materials of Jamrozik et al. (2020) were included in the model’s pretraining corpus
Ascertain whether the supplementary materials of Jamrozik et al. (2020), which contain the legal pre-emption example, were included in the pretraining corpus of the Gemini models evaluated in this study. The answer determines whether the observed translation reflects reconstruction via pattern matching or retrieval of memorized text.
References
We cannot know for certain whether these materials were included in the model’s pretraining.
— The unreasonable effectiveness of pattern matching
(2601.11432 - Lupyan et al., 16 Jan 2026) in Section 2: From Jabberwocky to The Gostak; paragraph adjacent to the legal pre-emption example table (after the translation comparison)