Ascertain whether GPT-4’s pretraining data includes coq-wigderson
Ascertain whether OpenAI’s GPT-4 pretraining dataset includes the coq-wigderson repository ("Towards the Formal Verification of Wigderson’s Algorithm"), which was used to evaluate the Cobblestone proof-synthesis approach. The repository’s first commit to GitHub dates to March 2022, after GPT-4’s publicly stated pretraining cutoff of September 2021.
References
Still, we cannot know for certain that coq-wigderson is not in the GPT-4 pretraining data.
— Cobblestone: Iterative Automation for Formal Verification
(2410.19940 - Kasibatla et al., 2024) in Threats to Validity, Section 4.6 (near end of paper)