
Learning to Prove Theorems via Interacting with Proof Assistants (1905.09381v1)

Published 21 May 2019 in cs.LO, cs.AI, cs.LG, and stat.ML

Abstract: Humans prove theorems by relying on substantial high-level reasoning and problem-specific insights. Proof assistants offer a formalism that resembles human mathematical reasoning, representing theorems in higher-order logic and proofs as high-level tactics. However, human experts have to construct proofs manually by entering tactics into the proof assistant. In this paper, we study the problem of using machine learning to automate the interaction with proof assistants. We construct CoqGym, a large-scale dataset and learning environment containing 71K human-written proofs from 123 projects developed with the Coq proof assistant. We develop ASTactic, a deep learning-based model that generates tactics as programs in the form of abstract syntax trees (ASTs). Experiments show that ASTactic trained on CoqGym can generate effective tactics and can be used to prove new theorems not previously provable by automated methods. Code is available at https://github.com/princeton-vl/CoqGym.

Citations (124)

Summary

  • The paper introduces the CoqGym dataset, comprising 71,000 human-written proofs from diverse domains, to train models for automated theorem proving.
  • It presents ASTactic, a deep learning model using TreeLSTM and GRU that generates tactics as abstract syntax trees, enabling novel tactic creation.
  • Experimental results show ASTactic achieves a 12.2% success rate, rising to 30% when combined with ATP systems, highlighting its potential for automated proofs.

An Academic Overview of "Learning to Prove Theorems via Interacting with Proof Assistants"

This paper tackles the challenge of automating theorem proving through interaction with proof assistants, focusing on Coq. The authors build CoqGym, a large-scale dataset and learning environment for training deep learning models to generate proof tactics, and introduce ASTactic, a model that generates tactics as abstract syntax trees (ASTs) rather than selecting from a fixed, predefined set of tactics.

Key Contributions

  1. CoqGym Dataset: The authors propose CoqGym, a large dataset comprising 71,000 human-written proofs sourced from 123 open-source software projects. This dataset spans various domains such as mathematics and programming languages, offering a broad scope for training machine learning models. The scale and diversity are significant improvements over prior datasets, enhancing the potential for cross-domain generalization.
  2. ASTactic Model: ASTactic generates tactics dynamically as ASTs conforming to a predefined grammar. This contrasts with previous approaches that select from a fixed set of tactics, and it enables the generation of tactics unseen during training. The model encodes the goal and premises with a TreeLSTM network and decodes tactics with a GRU-based decoder, incorporating semantic constraints so that generated tactics are valid in Coq's environment (see the sketch below this list).
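
To make the grammar-constrained decoding idea concrete, the following minimal sketch expands a tactic AST top-down, scoring the candidate productions of each non-terminal with a recurrent state. It is not the authors' implementation (that is available in the linked repository): the `TACTIC_GRAMMAR` fragment, the `TacticDecoder` class, and all dimensions are simplified assumptions, and the goal embedding that the paper obtains from a TreeLSTM encoder is replaced here by a random vector.

```python
# Illustrative sketch of grammar-constrained tactic decoding (not the authors' code).
import torch
import torch.nn as nn

# A tiny, made-up fragment of a tactic grammar: each non-terminal maps to candidate productions.
TACTIC_GRAMMAR = {
    "tactic": ["intro", "apply TERM", "induction TERM", "split"],
    "TERM": ["<premise>", "<local_hypothesis>"],
}

class TacticDecoder(nn.Module):
    """Expands a tactic AST top-down; a GRU state scores the productions of each non-terminal."""

    def __init__(self, goal_dim=256, hidden_dim=256):
        super().__init__()
        self.cell = nn.GRUCell(goal_dim, hidden_dim)
        # One classifier head per non-terminal, sized to its set of productions.
        self.heads = nn.ModuleDict({
            nt: nn.Linear(hidden_dim, len(rules)) for nt, rules in TACTIC_GRAMMAR.items()
        })

    def forward(self, goal_embedding, symbol="tactic", hidden=None):
        """Greedily expand `symbol` into a nested list representing a tactic AST."""
        hidden = self.cell(goal_embedding, hidden)                   # update decoder state
        rule_idx = self.heads[symbol](hidden).argmax(dim=-1).item()  # pick a production
        production = TACTIC_GRAMMAR[symbol][rule_idx]
        # Recursively expand any non-terminals in the chosen production; keep terminals as-is.
        return [self.forward(goal_embedding, tok, hidden) if tok in TACTIC_GRAMMAR else tok
                for tok in production.split()]

# Usage: the real system would obtain the goal embedding from a TreeLSTM over the goal's AST;
# a random vector stands in here to keep the sketch self-contained.
decoder = TacticDecoder()
print(decoder(torch.randn(1, 256)))  # e.g. ['apply', ['<premise>']] (depends on random weights)
```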

Experimental Results

The experimental findings show that ASTactic outperforms Coq's built-in automated tactics, proving 12.2% of the theorems in the test set. Notably, it proves theorems that no previously available automated method could solve. When combined with a state-of-the-art automated theorem proving (ATP) tool such as CoqHammer, the success rate rises to 30%, underscoring the value of pairing learned tactic generation with symbolic automation.
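The paper evaluates ASTactic in combination with CoqHammer; the sketch below illustrates one natural way such a combination can work, namely a depth-limited backtracking search that tries the ATP `hammer` tactic on each goal before falling back to model-suggested tactics. The `ProofEnv` interface and all names are hypothetical stand-ins, not CoqGym's actual API, and the integration shown here is an assumption rather than the paper's exact procedure.

```python
# Hypothetical sketch: depth-limited proof search with an ATP "hammer" fallback.
from typing import Callable, List, Optional, Protocol


class ProofEnv(Protocol):
    def goals(self) -> List[str]: ...                 # open goals (pretty-printed)
    def apply_tactic(self, tactic: str) -> bool: ...  # True if the tactic succeeded
    def undo(self) -> None: ...                       # roll back the last tactic


def search(env: ProofEnv,
           suggest: Callable[[str], List[str]],       # model: goal -> ranked candidate tactics
           depth: int = 10) -> Optional[List[str]]:
    """Return a tactic script that closes all goals, or None if the depth budget runs out."""
    if not env.goals():
        return []                                     # nothing left to prove
    if depth == 0:
        return None
    goal = env.goals()[0]
    # Try the ATP hammer first, then the model's ranked candidates.
    for tactic in ["hammer."] + suggest(goal):
        if not env.apply_tactic(tactic):
            continue                                  # tactic failed; try the next one
        rest = search(env, suggest, depth - 1)
        if rest is not None:
            return [tactic] + rest
        env.undo()                                    # dead end: backtrack
    return None
```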

Implications and Future Directions

The implications of this research are manifold, both in theory and practice. The ability to automate theorem proving more effectively could enhance formal mathematics, software verification, and hardware design by reducing the reliance on manual proof construction. From a theoretical standpoint, the approach demonstrates the potential of leveraging deep learning models to tackle problems in interactive theorem proving (ITP), offering new insights into automated reasoning tasks.

Future work could scale CoqGym further, extend ASTactic to generate more complex tactics, and improve generalization to domains not represented in the training data. Additionally, integrating reinforcement learning could refine ASTactic's interactive proof search, potentially increasing success rates and efficiency further.

Conclusion

The paper presents a significant advance in automating theorem proving through the novel application of machine learning techniques. By leveraging the CoqGym dataset and developing ASTactic, the authors provide a robust framework for generating effective proof tactics. The approach holds promise for advancing the automation of complex proving tasks and sets the groundwork for future exploration in AI-driven theorem proving.
