Presence of FCI items in GPT-4o and o1 training data
Determine whether the Force Concept Inventory (FCI) questions and their answer key appear in the training data of OpenAI’s GPT-4o-2024-11-20 and o1-2024-12-17, the two models used in this study, so as to clarify whether prior exposure to the assessment materials could have influenced model performance.
References
Finally, the possibility that the FCI assessment questions and answer key were present in the model's training data cannot be fully excluded; however, the observed performance shifts in response to translation errors suggest that model outputs are still sensitive to semantic changes, not just memorized content.
— Translating the Force Concept Inventory in the age of AI
(2508.13908 - Babayeva et al., 19 Aug 2025) in Section 5 (Limitations, implications, and further questions)