Enhancing Formal Theorem Proving: A Comprehensive Dataset for Training AI Models on Coq Code (2403.12627v2)

Published 19 Mar 2024 in cs.AI and cs.LO

Abstract: In the realm of formal theorem proving, the Coq proof assistant stands out for its rigorous approach to verifying mathematical assertions and software correctness. Despite the advances in artificial intelligence and machine learning, the specialized nature of Coq syntax and semantics poses unique challenges for LLMs. Addressing this gap, we present a comprehensive dataset specifically designed to enhance LLMs' proficiency in interpreting and generating Coq code. This dataset, derived from a collection of over 10,000 Coq source files, encompasses a wide array of propositions, proofs, and definitions, enriched with metadata including source references and licensing information. Our primary aim is to facilitate the development of LLMs capable of generating syntactically correct and semantically meaningful Coq constructs, thereby advancing the frontier of automated theorem proving. Initial experiments with this dataset have showcased its significant potential; models trained on this data exhibited enhanced accuracy in Coq code generation. Notably, a particular experiment revealed that a fine-tuned LLM was capable of generating 141 valid proofs for a basic lemma, highlighting the dataset's utility in facilitating the discovery of diverse and valid proof strategies. This paper discusses the dataset's composition, the methodology behind its creation, and the implications of our findings for the future of machine learning in formal verification. The dataset is accessible for further research and exploration: https://huggingface.co/datasets/florath/coq-facts-props-proofs-gen0-v1

