Learning-Assisted Automated Reasoning with Flyspeck (1211.7012v3)

Published 29 Nov 2012 in cs.AI, cs.DL, cs.LG, and cs.LO

Abstract: The considerable mathematical knowledge encoded by the Flyspeck project is combined with external automated theorem provers (ATPs) and machine-learning premise selection methods trained on the proofs, producing an AI system capable of answering a wide range of mathematical queries automatically. The performance of this architecture is evaluated in a bootstrapping scenario emulating the development of Flyspeck from axioms to the last theorem, each time using only the previous theorems and proofs. It is shown that 39% of the 14185 theorems could be proved in a push-button mode (without any high-level advice and user interaction) in 30 seconds of real time on a fourteen-CPU workstation. The necessary work involves: (i) an implementation of sound translations of the HOL Light logic to ATP formalisms: untyped first-order, polymorphic typed first-order, and typed higher-order, (ii) export of the dependency information from HOL Light and ATP proofs for the machine learners, and (iii) choice of suitable representations and methods for learning from previous proofs, and their integration as advisors with HOL Light. This work is described and discussed here, and an initial analysis of the body of proofs that were found fully automatically is provided.

Summary

The paper demonstrates a system combining ATPs with machine-learned premise selection that automatically proves 39% of the Flyspeck theorems.
It employs logic translations from HOL Light to various ATP-friendly formats and evaluates multiple provers including Vampire, E, and Z3.
The system’s design reveals potential for AI-assisted theorem proving to formalize complex proofs while reducing human effort.

An Evaluation of Learning-Assisted Automated Reasoning Over the Flyspeck Project

The paper is a significant contribution to combining automated theorem proving (ATP) with machine learning, particularly over large libraries of formal theorems such as Flyspeck. It integrates logic translations, machine learning for premise selection, and external theorem provers to automate proving within the Flyspeck library. The Flyspeck project itself aims to formally verify the proof of the Kepler Conjecture originally given by mathematician Thomas Hales.

System Architecture and Processes

The paper details a system architecture that integrates external ATPs with machine-learning methods for premise selection, taking advantage of the extensive mathematical knowledge encoded in the Flyspeck project. In its workflow, the HOL Light logic is first translated into several ATP-friendly formalisms: untyped first-order, polymorphic typed first-order, and typed higher-order logic. From these translations, ATP problems are generated and passed to several provers, including Vampire, E, and Z3.
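
The last step of that workflow, writing out an ATP problem, can be illustrated with a minimal sketch. It assumes the formulas have already been translated to untyped first-order syntax and renders them in the TPTP FOF format that provers such as Vampire and E accept. The function name `tptp_problem` and the toy formulas are hypothetical, and the translation from HOL Light itself is not shown.

```python
def tptp_problem(premises, conjecture):
    """Render named first-order formulas as a TPTP FOF problem.

    premises:   list of (name, formula) pairs, emitted with role "axiom"
    conjecture: one (name, formula) pair, emitted with role "conjecture"
    Formulas are assumed to be valid TPTP already (hypothetical input).
    """
    lines = [f"fof({name}, axiom, ({formula}))." for name, formula in premises]
    name, formula = conjecture
    lines.append(f"fof({name}, conjecture, ({formula})).")
    return "\n".join(lines)


# Toy example: one selected premise and one goal.
problem = tptp_problem(
    premises=[("add_comm", "![X, Y]: plus(X, Y) = plus(Y, X)")],
    conjecture=("goal", "plus(a, b) = plus(b, a)"),
)
print(problem)
```

Keeping the premise list as an explicit argument reflects the architecture described above: the premise selector decides which earlier theorems become axioms, and the prover only ever sees that subset.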

Central to the research is the use of machine learning for premise selection. Training data consists of mappings from conjectures to sets of relevant premises, derived from prior theorems, automated proofs, and even failed proof attempts. Training is chronological: premise selectors learn only from the proofs of theorems that precede the current one. Given a new conjecture, a selector ranks the available premises by relevance, and the top-ranked premises are passed to the ATPs, which then attempt a proof.
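
The chronological setup can be sketched as follows. This is an illustrative toy under stated assumptions, not the paper's implementation: theorems are characterized by sets of symbols, similarity is the number of shared symbols, and the names `knn_rank`, `chronological_rankings`, and the example library are all hypothetical.

```python
from collections import defaultdict


def knn_rank(features, deps, known, goal_feats, k=3):
    """Rank premises for a goal from its k most similar known theorems."""
    def sim(t):
        # Similarity = number of shared features (a deliberately simple choice).
        return len(features[t] & goal_feats)

    neighbours = sorted(known, key=sim, reverse=True)[:k]
    scores = defaultdict(float)
    for t in neighbours:
        s = sim(t)
        if s == 0:
            continue  # unrelated theorems contribute nothing
        scores[t] += s  # the neighbour itself may be a useful premise
        for d in deps[t]:
            scores[d] += s  # so may the premises used in its proof
    return sorted(scores, key=lambda p: (-scores[p], p))


def chronological_rankings(order, features, deps, k=3):
    """For each theorem, rank premises using only earlier theorems."""
    known, rankings = [], {}
    for thm in order:
        rankings[thm] = knn_rank(features, deps, known, features[thm], k)
        known.append(thm)  # becomes available only after its own turn
    return rankings


# Hypothetical miniature library, listed in development order.
order = ["add_0", "add_comm", "mul_comm", "add_mul"]
features = {
    "add_0": {"+", "0"},
    "add_comm": {"+"},
    "mul_comm": {"*"},
    "add_mul": {"+", "*"},
}
deps = {
    "add_0": [],
    "add_comm": ["add_0"],
    "mul_comm": [],
    "add_mul": ["add_comm", "mul_comm"],
}
rankings = chronological_rankings(order, features, deps)
```

Because a theorem is appended to `known` only after it has been ranked, no ranking can contain the theorem itself or anything proved later, which is the property the bootstrapping evaluation relies on.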

Key Findings and Performance Metrics

The empirical evaluation shows that the presented architecture can automatically prove around 39% of the Flyspeck theorems in a fully automated, push-button scenario. This success rate reflects the potential of combining ATPs with machine-learned premise selection to tackle very large theories. Machine learning methods such as naive Bayes and $k$-nearest neighbor were explored in varying configurations to optimize premise selection and increase proof success rates.
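
To make the naive Bayes idea concrete, here is a minimal sketch. It is a simplified variant that scores only the features present in the goal and uses add-one smoothing, and it should not be read as the paper's exact formulation. The names `train_nb`, `rank_nb`, and the training examples are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """examples: list of (feature_set, premises_used_in_proof) pairs."""
    uses = Counter()            # how often each premise was used
    co = defaultdict(Counter)   # co[p][f]: uses of p in proofs of goals with f
    for feats, premises in examples:
        for p in premises:
            uses[p] += 1
            for f in feats:
                co[p][f] += 1
    return uses, co, len(examples)


def rank_nb(model, goal_feats):
    """Rank premises by log P(p) + sum over goal features of log P(f | p)."""
    uses, co, n = model
    scores = {}
    for p, n_p in uses.items():
        score = math.log(n_p / n)
        for f in goal_feats:
            # Add-one smoothing keeps unseen (premise, feature) pairs finite.
            score += math.log((co[p][f] + 1) / (n_p + 2))
        scores[p] = score
    return sorted(scores, key=lambda p: (-scores[p], p))


# Hypothetical training data: goal features and the premises their proofs used.
examples = [
    ({"+", "0"}, ["add_0"]),
    ({"+"}, ["add_0", "add_comm"]),
    ({"*"}, ["mul_comm"]),
]
model = train_nb(examples)
ranking = rank_nb(model, {"+"})
```

Unlike the nearest-neighbor approach, this scores every previously used premise directly from co-occurrence counts, so training reduces to counting and can be updated incrementally as new proofs arrive.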

The paper's analysis found that most automatically found proofs are more concise than the original human-written proofs, which often carry broader context and redundant steps. This suggests that the ATP+ML system does not merely reproduce the human proofs but can also find shorter routes to the same theorems.

Implications and Prospects for Future AI Developments

This work has implications both for the advancement of AI in automated reasoning and for practical applications in mathematics and formal verification. The methods developed show a way to reduce the burden on human mathematicians when formally proving complex theorems over large libraries. They point to a future where AI could actively assist in formalizing definitions, theorems, and proofs, perhaps becoming integral to mathematical discovery and verification.

Looking forward, the paper's methods lay the groundwork for further improvements in AI-assisted theorem proving. These include developing machine-learning algorithms better suited to premise selection, improving the logic translations, and integrating results back into interactive theorem prover (ITP) ecosystems such as HOL Light. The corpus-driven training used in the paper points towards a direction in AI research where large databases of proofs and mathematical data drive the development of intelligent systems capable of collaborating with human experts.
