LeanGeo: Formalizing Synthetic Geometry
- LeanGeo is a unified formal system for formalizing and solving competition-level synthetic geometry problems with a human-readable, axiomatic library.
- It integrates Lean 4’s proof infrastructure with tactics and external SMT solvers to automate and rigorously verify geometric proofs.
- The LeanGeo-Bench benchmark evaluates both human and AI-driven reasoning on a diverse set of formal geometry problems, fostering advancements in automated theorem proving.
LeanGeo is a unified formal system for expressing, verifying, and solving competition-level plane geometry problems within the Lean 4 theorem prover. It introduces a comprehensive, human-readable library of geometric theorems, high-level tactics for declarative proof construction, and an extensible benchmark (LeanGeo-Bench) for evaluating both human and AI-driven automated reasoning on geometry problems. By integrating with Lean’s logical infrastructure and Mathlib, LeanGeo supports rigorous proof verification and seamless interactions with other mathematical domains, establishing a formal bridge between synthetic geometry and modern automated reasoning.
1. Objectives and Core Architecture
LeanGeo is engineered to formalize and automate the solution of geometry problems that commonly appear in high-level mathematics competitions. Its key design aims are:
- To provide an extensible, axiomatic, and human-readable geometric theorem library.
- To encode formal geometric objects and relations independently of coordinates.
- To integrate with Lean 4’s proof infrastructure, leveraging external SMT solvers (notably CVC5) through LeanSMT while maintaining soundness via Lean 4’s foundational logic.
- To enable seamless use of analytical and algebraic tools from Mathlib, facilitating reasoning across multiple mathematical domains.
The proof language is declarative and tactic-driven. Key tactics include euclid_intros, euclid_apply, and euclid_finish, each abstracting common geometric arguments into compositional, verifiable steps. The layered theorem library currently contains approximately 260 theorems, enabling the formalization of results ranging from introductory properties (midpoints, congruence, inscribed angles) up to advanced theorems such as Menelaus’ and Miquel’s theorems.
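The introduce, apply, finish rhythm of such a tactic proof can be sketched in plain Lean 4 without the LeanGeo library. Everything below is a toy model: `Point`, `Line`, `onLine`, and the rule `line_through` are hypothetical stand-ins declared as axioms, and the core tactics `intro`, `have`, and `exact` only mimic the roles that euclid_intros, euclid_apply, and euclid_finish are assumed to play. The real tactics additionally call an SMT solver, which this sketch does not attempt to reproduce.

```lean
namespace TacticSketch

-- Toy primitives: assumptions for illustration, not LeanGeo's declarations.
axiom Point : Type
axiom Line : Type
axiom Point.onLine : Point → Line → Prop

-- Hypothetical construction rule: two distinct points lie on a common line.
axiom line_through (a b : Point) : a ≠ b → ∃ L : Line, a.onLine L ∧ b.onLine L

theorem line_through_swapped :
    ∀ (a b : Point), a ≠ b → ∃ L : Line, b.onLine L ∧ a.onLine L := by
  -- Step 1: introduce the objects and hypotheses (assumed role of euclid_intros).
  intro a b hab
  -- Step 2: invoke a rule and name the object it produces (assumed role of euclid_apply).
  have ⟨L, haL, hbL⟩ := line_through a b hab
  -- Step 3: close the goal from the collected facts (assumed role of euclid_finish).
  exact ⟨L, hbL, haL⟩

end TacticSketch
```

Each step is small enough to be checked on its own, which is the property the declarative style is meant to preserve as proofs grow.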
2. Formalization of Geometry Problems
LeanGeo employs an axiomatic approach inspired by both SystemE and LeanEuclid. This permits representations that abstract away from explicitly coordinate-dependent methods and mirror the practice of synthetic geometry.
Definitions are constructed both at a low level (e.g., explicit construction of intersection points, congruence relations) and at a high level, using abbreviations to succinctly encode geometric properties. For instance, cyclicity is defined as:
    abbrev Cyclic (A B C D : Point) : Prop :=
      ∃ (O : Circle), A.onCircle O ∧ B.onCircle O ∧ C.onCircle O ∧ D.onCircle O
Problem formalization follows a “bottom-up” methodology: lemmas and basic definitions serve as reusable proof atoms, layered to construct complete competition-level argument chains. The system handles complex configurations (triangles, circles, quadrilaterals) through modular proof strategies, and exhaustive case analyses—often demanded by visual dependency in diagrams—are managed by interfacing with SMT solvers, automating many diagram-based inferences.
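To illustrate how an abbreviation such as Cyclic can serve as a reusable proof atom, here is a minimal self-contained sketch in core Lean 4. It assumes toy axioms for `Point`, `Circle`, and `onCircle` in place of LeanGeo's own declarations. The lemma names `cyclic_swap`, `cyclic_rotate`, and `cyclic_swap_rotate` are hypothetical and are not taken from the library.

```lean
namespace CyclicSketch

-- Toy primitives: assumptions for illustration, not LeanGeo's declarations.
axiom Point : Type
axiom Circle : Type
axiom Point.onCircle : Point → Circle → Prop

-- The cyclicity abbreviation quoted in the text.
abbrev Cyclic (A B C D : Point) : Prop :=
  ∃ (O : Circle), A.onCircle O ∧ B.onCircle O ∧ C.onCircle O ∧ D.onCircle O

-- Proof atom 1: the first two points may be swapped.
theorem cyclic_swap {A B C D : Point} (h : Cyclic A B C D) : Cyclic B A C D :=
  match h with
  | ⟨O, hA, hB, hC, hD⟩ => ⟨O, hB, hA, hC, hD⟩

-- Proof atom 2: the four points may be rotated.
theorem cyclic_rotate {A B C D : Point} (h : Cyclic A B C D) : Cyclic B C D A :=
  match h with
  | ⟨O, hA, hB, hC, hD⟩ => ⟨O, hB, hC, hD, hA⟩

-- Layered result: obtained by composing the two atoms, with no new unpacking.
theorem cyclic_swap_rotate {A B C D : Point} (h : Cyclic A B C D) : Cyclic A C D B :=
  cyclic_rotate (cyclic_swap h)

end CyclicSketch
```

The last lemma never opens the existential itself. It only chains earlier lemmas, which is the bottom-up layering described above, shown here at a very small scale.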
3. The LeanGeo-Bench Benchmark
To facilitate rigorous evaluation and guide further advances in automated geometric reasoning, the project introduces LeanGeo-Bench—a benchmark suite of 122 formally encoded geometry problems. Key characteristics include:
Feature | Description | Examples and notes |
---|---|---|
Problem diversity | Competition (IMO since 2000), synthetic, introductory, hand-written | International Mathematical Olympiad, Gemini-generated |
Formal encoding | All problems are specified in LeanGeo, with verified formal proofs | Coordinate-free statements |
Evaluation metrics | Measured using “pass@k” (success rate within k generations) | Used with LLMs like Gemini |
Baseline results with LLMs (e.g., Gemini 2.5 Pro, o4-mini) exhibit partial success but do not achieve resolution of the hardest problems, highlighting the benchmark’s rigor. LeanGeo-Bench is also used for LLM fine-tuning and reinforcement learning: fully correct proofs (kernel-verified) seed “activation” datasets; partially correct ones guide RL strategies.
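The source describes pass@k only as the success rate within k generations and does not say which estimator is used. One widely used choice, given here as an assumption rather than as the benchmark's documented procedure, is the unbiased estimator from code-generation evaluation. Here P is the problem set, n is the number of proofs sampled per problem, and c_p is the number of those that pass kernel verification for problem p.

```latex
\[
\text{pass@}k \;=\; \frac{1}{|P|} \sum_{p \in P}
  \left( 1 - \frac{\binom{n - c_p}{k}}{\binom{n}{k}} \right)
\]
```

For a single problem with n = 4 samples, c_p = 1 verified proof, and k = 2, the term is 1 - C(3,2)/C(4,2) = 1 - 3/6 = 0.5. When k = n, the term reduces to 1 if at least one sample succeeded and 0 otherwise.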
4. Rigorous Proof Verification
One of LeanGeo’s distinguishing aspects is the demand for absolute rigor in geometric proof:
- All claims must be expressed in Lean’s formal syntax, ruling out appeals to implicit diagrammatic intuition.
- The esmt tactic transfers local hypotheses and relevant axioms into CVC5’s input, enabling quick discharge of “trivial” cases while preserving Lean-centric soundness.
- The global dependency graph and parsed axiom caches optimize repetition-heavy subgoal verification, reducing computational redundancy.
- Proof tactics structure previously diagram-dependent reasoning as explicit sequences that Lean’s kernel can check, ensuring that all geometric inferences—no matter how visually motivated—are expressed as logically sound formal chains.
This pipeline eliminates human “sketch” ambiguity and is intended to scale to the expressiveness required by mathematical competitions.
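The following toy example, in core Lean 4, shows what it means to make a diagram-dependent step explicit. In a sketch one simply sees whether segment ab crosses line L. In a formal proof that observation becomes a case split justified by stated axioms. All declarations here (`sameSide`, `oppSides`, `between`, `side_cases`, `crossing`) are hypothetical stand-ins rather than LeanGeo's actual axioms. In LeanGeo, case analyses of this kind are what the solver-backed tactics are meant to discharge automatically.

```lean
namespace DiagramSketch

-- Toy primitives: assumptions for illustration, not LeanGeo's declarations.
axiom Point : Type
axiom Line : Type
axiom Point.onLine : Point → Line → Prop
axiom sameSide : Point → Point → Line → Prop
axiom oppSides : Point → Point → Line → Prop
axiom between : Point → Point → Point → Prop

-- Assumed diagrammatic axiom: two points off a line are on the same side or on opposite sides.
axiom side_cases (a b : Point) (L : Line) :
  ¬ a.onLine L → ¬ b.onLine L → sameSide a b L ∨ oppSides a b L

-- Assumed diagrammatic axiom: points on opposite sides are separated by a point of the line.
axiom crossing {a b : Point} {L : Line} :
  oppSides a b L → ∃ p : Point, p.onLine L ∧ between a p b

-- Either the two points share a side, or the segment between them meets the line.
theorem same_side_or_crossing (a b : Point) (L : Line)
    (ha : ¬ a.onLine L) (hb : ¬ b.onLine L) :
    sameSide a b L ∨ ∃ p : Point, p.onLine L ∧ between a p b := by
  cases side_cases a b L ha hb with
  | inl h => exact Or.inl h             -- same-side case: nothing to construct
  | inr h => exact Or.inr (crossing h)  -- opposite-side case: produce the crossing point

end DiagramSketch
```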
5. Integration with Mathlib and Cross-Domain Reasoning
LeanGeo’s design leverages the full spectrum of Lean 4’s Mathlib, integrating algebraic, analytic, and inequality tools into geometric formalizations. This enables, for example, direct use of trigonometric identities to reason about geometric inequalities—illustrated in the formal proof of IMO 2001 Problem 1.
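As a small illustration of the kind of Mathlib fact that can feed such arguments, the snippet below derives a bound on sin² from the Pythagorean identity. It is a generic example assuming only that Mathlib is available. It is not taken from the LeanGeo proof of IMO 2001 Problem 1, and it uses none of LeanGeo's own declarations.

```lean
import Mathlib

-- For any real angle x, sin x ^ 2 is at most 1.
example (x : ℝ) : Real.sin x ^ 2 ≤ 1 := by
  have h := Real.sin_sq_add_cos_sq x   -- sin x ^ 2 + cos x ^ 2 = 1
  have hc := sq_nonneg (Real.cos x)    -- 0 ≤ cos x ^ 2
  linarith                             -- linear arithmetic over the two facts
```

In a geometric setting, x would typically be an angle measure from the configuration, so identities and inequalities like these can be combined with synthetic facts in a single proof.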
The system is inherently modular: each new geometric theorem added extends the framework for subsequent reasoning, supporting collaborative growth in the community and fostering research at the confluence of geometry, algebra, and other mathematical fields.
A plausible implication is that this unified design will support deeper integration of geometry with broader areas of formal mathematics, potentially facilitating new discoveries in automated theorem proving.
6. Evaluation, Automation, and Ongoing Challenges
Evaluations demonstrate that, although existing LLMs can synthesize partial or introductory proofs within LeanGeo, current models fail to solve higher-tier competition problems. This suggests that geometry, particularly in the formalized style required by LeanGeo, remains a stringent unresolved challenge for large models.
Areas targeted for advancement include:
- Automation: Integration of domain-specific heuristics such as the Area Method and algebraic geometry approaches is needed to increase proof search speed and coverage.
- Soundness: Ensuring complete reconstruction of SMT-generated certificates as native Lean proofs remains open for trustworthy end-to-end formalization.
- LLM Integration: Customized prompts and domain-specific encodings are needed to compress context dependencies and library overhead, so that LLMs can reason more effectively within the LeanGeo environment.
7. Openness and Community Contribution
LeanGeo’s theorem library and LeanGeo-Bench benchmark are fully open sourced at https://github.com/project-numina/LeanGeo/tree/master. The repository includes:
- Detailed definitions, abbreviations, and axioms capturing the full expressivity of synthetic geometry.
- Human-readable proofs and demonstration files.
- Guidelines for contributions, including mechanisms for adding new theorems, problem encodings, or integrating novel automation tactics.
- Tools for extending the benchmark with new problems, particularly from emerging competitions or interdisciplinary areas.
An outgrowth of this openness is that LeanGeo provides not only a robust foundation for formal geometry research but also a collaborative platform for the next generation of integrated neural-symbolic reasoning systems.
LeanGeo thus represents a substantial advance in the formalization and automation of synthetic geometry, connecting the rigor of theorem proving with the scalability and interoperability required for modern AI-driven mathematical reasoning (Song et al., 20 Aug 2025). Its open infrastructure, rigorous methodological demands, and integration with Lean and Mathlib situate it as a reference point for research in automated theorem proving, geometry education, and neuro-symbolic AI.