Traceability of GPT-3 Training Data for Affordance Examples
Ascertain whether the GPT-3 training corpus included examples from Arthur Glenberg’s earlier research on affordances and latent semantic analysis that were reused in the Glenberg and Jones probes, so that potential test contamination can be evaluated.
References
Finally, there were some concerns that the earlier research by Glenberg (and thus the examples used in the current study) had been included in the training data for GPT-3. Jones said that there was no way to know this, but that AI performance on other tests had been shown to suffer when examples of those tests were removed from training data.
Source: Embodied, Situated, and Grounded Intelligence: Implications for AI (Millhouse et al., 2022; arXiv:2210.13589), Discussion of “Language Comprehension Requires Affordances” (Glenberg & Jones)