LeanGraph: Formal Math Dependency Graph
- LeanGraph is a Lean 4 plugin that creates precise dependency graphs of formal mathematical declarations using a six-role edge schema.
- It integrates with the Lean elaboration pipeline to capture user-facing declarations and omits kernel-generated artifacts for clear analysis.
- Its large-scale output, validated on Mathlib, supports advanced applications in mathematical search, attribution, and retrieval-augmented reasoning.
LeanGraph is a high-precision, elaborator-level extractor for formal mathematics in Lean 4, introduced as a core component of the TheoremGraph initiative to bridge formal and informal mathematical knowledge at the statement level. Designed as a Lean 4 plugin, LeanGraph systematically produces a large-scale dependency graph over Lean declarations with typed edges that capture a wide spectrum of logical and structural relationships between mathematical objects. Its output serves as foundational infrastructure for mathematical search, attribution, and retrieval-augmented reasoning (Kurgan et al., 24 Jun 2026).
1. Extraction Architecture and Workflow
LeanGraph integrates with the Lean 4 environment by hooking into Lean's elaboration and type-checking pipeline through the Environment API. The main stages are as follows:
- Project Compilation: A Lean project, comprising `.lean` source files and a `lakefile.lean`, is compiled by Lean/Lake, constructing an in-memory `Environment` encompassing all constants, theorems, definitions, structures, and instances.
- Post-Elaboration Traversal: After type-checking, LeanGraph traverses `Environment.constants`. For each constant, it collects the signature (`ConstantInfo.type`), value/proof (`ConstantInfo.value?`), and docstring (`ConstantInfo.docString?`). The inclusion predicate discards kernel-generated artifacts such as recursors and matchers.
- Edge Extraction: For each included constant:
  - Signature references: Extracted as `sig`-type edges.
  - Body/Proof references: Labeled as `def` if computational, `proof` if in the proof of a `Prop`.
  - Structure inheritance/fields: Edges labeled `extends` and `field`, respectively.
  - Doc-string backtick references: Labeled as `docref`.
- Output Serialization: Nodes (with metadata) and edges (source, target, role) are stored as JSON. Integration with databases (e.g., `pgvector` + Postgres) is supported. A REST API and MCP client are offered for programmatic access.
Data flow can be summarized as: project compilation → post-elaboration traversal → edge extraction → output serialization.
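The stages above can be mimicked on toy data. The following is a minimal Python sketch, not LeanGraph's implementation (which runs inside Lean): `TOY_ENV`, its field names, the `kind` values, and all helper functions are hypothetical stand-ins chosen for illustration. It shows the inclusion filter, the role assignment rules, and JSON serialization of (source, target, role) edges.

```python
import json

# Hypothetical stand-in for a Lean environment: every constant already lists
# the names it references, grouped by where the reference occurs.
TOY_ENV = {
    "Group": {"kind": "structure", "is_prop": False,
              "sig_refs": [], "body_refs": [],
              "extends": ["Monoid"], "field_refs": ["Inv"], "doc_refs": []},
    "mul_right_inv": {"kind": "theorem", "is_prop": True,
                      "sig_refs": ["Group"], "body_refs": ["mul_assoc"],
                      "extends": [], "field_refs": [], "doc_refs": ["Group"]},
    "Group.rec": {"kind": "recursor", "is_prop": False,
                  "sig_refs": ["Group"], "body_refs": [],
                  "extends": [], "field_refs": [], "doc_refs": []},
}

# Assumed labels for kernel-generated artifacts that the filter drops.
EXCLUDED_KINDS = {"recursor", "matcher"}


def include(info):
    """Inclusion predicate: keep only user-facing declarations."""
    return info["kind"] not in EXCLUDED_KINDS


def extract_edges(name, info):
    """Assign one of the six roles to each reference of a declaration."""
    edges = [(name, target, "sig") for target in info["sig_refs"]]
    # Body references are `proof` for propositions and `def` otherwise.
    body_role = "proof" if info["is_prop"] else "def"
    edges += [(name, target, body_role) for target in info["body_refs"]]
    edges += [(name, target, "extends") for target in info["extends"]]
    edges += [(name, target, "field") for target in info["field_refs"]]
    edges += [(name, target, "docref") for target in info["doc_refs"]]
    return edges


def build_graph(env):
    """Filter constants, extract typed edges, and return a JSON-ready dict."""
    kept = {name: info for name, info in env.items() if include(info)}
    nodes = [{"name": name, "kind": info["kind"]} for name, info in kept.items()]
    edges = [{"source": s, "target": t, "role": r}
             for name, info in kept.items()
             for (s, t, r) in extract_edges(name, info)]
    return {"nodes": nodes, "edges": edges}


graph = build_graph(TOY_ENV)
graph_json = json.dumps(graph, indent=2)
```

In this toy run the recursor `Group.rec` is dropped by the filter, and `mul_right_inv` contributes one `sig`, one `proof`, and one `docref` edge.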
2. Formal Graph Model
LeanGraph defines a typed, directed multigraph structure:
$$G = (V, E), \qquad E \subseteq V \times V \times R,$$
where
- $V$ is the set of user-facing Lean constant declarations included according to the filter above.
- The finite edge role set $R = \{\text{extends}, \text{field}, \text{sig}, \text{proof}, \text{def}, \text{docref}\}$ encodes semantic dependency types:
  - extends: structure/class inheritance
  - field: structure field type reference
  - sig: used in type/signature
  - proof: appears in proof term (if target is in `Prop`)
  - def: appears in computational body
  - docref: docstring backtick mention
- Edge set $E$ consists of typed dependency triples $(u, v, r)$: $u$ depends on $v$ in role $r$.
- Node metadata: Each $v \in V$ is annotated with (name, kind, file, line, universe levels, modifiers, etc.); edges may include surface-syntax locations.
This schema captures type- and proof-theoretic relationships with fine granularity, enabling precise dependency tracing and graph-based retrieval.
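As an illustration of this model, the sketch below stores edges as role-labeled triples so that the same pair of declarations can be linked under several roles. The class name `TypedMultigraph` and its methods are hypothetical and do not reflect LeanGraph's storage layer; only the six role names come from the schema above.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# The six edge roles of the schema.
ROLES = frozenset({"extends", "field", "sig", "proof", "def", "docref"})


@dataclass
class TypedMultigraph:
    # name -> metadata dict (kind, file, line, ...)
    nodes: dict = field(default_factory=dict)
    # source name -> set of (target, role) pairs
    out: dict = field(default_factory=lambda: defaultdict(set))

    def add_node(self, name, **metadata):
        self.nodes[name] = metadata

    def add_edge(self, source, target, role):
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        self.out[source].add((target, role))

    def targets(self, source, roles=ROLES):
        """Sorted distinct targets of `source`, restricted to the given roles."""
        return sorted({t for (t, r) in self.out.get(source, ()) if r in roles})


g = TypedMultigraph()
g.add_node("mul_right_inv", kind="theorem")
g.add_edge("mul_right_inv", "Group", "sig")
g.add_edge("mul_right_inv", "Group", "docref")  # parallel edge, different role
g.add_edge("mul_right_inv", "mul_assoc", "proof")

sig_only = g.targets("mul_right_inv", roles={"sig"})
all_targets = g.targets("mul_right_inv")
```

Keeping the role inside the edge key is what makes this a multigraph: the `sig` and `docref` edges to `Group` are stored separately, and a role filter selects between them.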
3. Scale, Coverage, and Extraction Performance
The formal graph constructed by LeanGraph spans 25 open-source Lean 4 projects, most notably including Mathlib 4.27–4.29. Empirical statistics:
| Metric | All Projects | Mathlib 4.27–4.29 |
|---|---|---|
| Nodes | 388,105 | 351,397 |
| Edges | 11,335,708 | 9,333,251 |
| Avg. out-degree (edge density) | - | ≈ 26.6 |
| Cross-library edges | 1,750,198 | - |
Extraction times and resource footprints are not provided in the paper; internal runs indicate that extraction on Mathlib 4.29 takes 590 seconds of compile time and peaks at about 3 GB RAM on a 16 GB system.
A plausible implication is that the elaborator-level approach yields a denser and cleaner formal graph than source-parser pipelines, which often miss deep dependencies present only after elaboration.
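The node and edge counts reported above determine the average out-degree and the share of cross-library edges directly. The short calculation below uses only those reported counts; the variable names are illustrative.

```python
# Counts as reported for all projects and for the Mathlib subset.
nodes_all, edges_all = 388_105, 11_335_708
nodes_mathlib, edges_mathlib = 351_397, 9_333_251
cross_library_edges = 1_750_198

# Average out-degree is edges divided by nodes.
avg_out_all = edges_all / nodes_all
avg_out_mathlib = edges_mathlib / nodes_mathlib

# Fraction of all edges that cross a library boundary.
cross_share = cross_library_edges / edges_all

print(f"avg out-degree (all projects): {avg_out_all:.1f}")
print(f"avg out-degree (Mathlib):      {avg_out_mathlib:.1f}")
print(f"cross-library edge share:      {cross_share:.1%}")
```

This gives roughly 29.2 edges per node across all projects, 26.6 for Mathlib, and about 15.4% of edges crossing library boundaries.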
4. Examples of Declaration and Edge Extraction
To illustrate LeanGraph’s extraction and labeling conventions, consider the following snippets:
Example 1: Theorem extraction and edges
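[Lean theorem listing not recoverable]
Node output: [listing not recoverable]
Edge output: [listing not recoverable]
Example 2: Structure definition and role assignment
[Lean structure listing not recoverable]
Edges: [listing not recoverable]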
This approach allows precise attribution of semantic roles to each edge for downstream filtering.
5. API and Programmatic Access
Access to LeanGraph data is facilitated through a REST HTTP API at https://api.theoremsearch.com/v1/leangraph and via an MCP (multi-context proof) Lean interface. Key API methods include:
- List declarations: `GET /declarations?library=mathlib&limit=50&offset=0`
- Get a single declaration: `GET /declarations/MulRightInv.mul_right_inv`
- List outgoing edges: `GET /edges?source=MulRightInv.mul_right_inv&role=proof`
- Query subgraphs: `POST /mcp/query` with parameters (node, hops, roles)
- In-Lean MCP client: [listing not recoverable]
This design supports external clients and metaprogramming inside Lean for dependency exploration, visualization, and automated reasoning applications.
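A client for these endpoints might start from small request builders like the ones below. This is a sketch under assumptions: it presumes the listed paths are relative to the base URL given above, and the JSON field names in `subgraph_query` simply mirror the documented parameter names (node, hops, roles). The function names are hypothetical, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.theoremsearch.com/v1/leangraph"


def declarations_url(library, limit=50, offset=0):
    """URL for listing declarations of one library, with paging."""
    query = urlencode({"library": library, "limit": limit, "offset": offset})
    return f"{BASE_URL}/declarations?{query}"


def declaration_url(name):
    """URL for a single declaration; the name is percent-encoded."""
    return f"{BASE_URL}/declarations/{quote(name, safe='')}"


def edges_url(source, role=None):
    """URL for outgoing edges of `source`, optionally restricted to one role."""
    params = {"source": source}
    if role is not None:
        params["role"] = role
    return f"{BASE_URL}/edges?{urlencode(params)}"


def subgraph_query(node, hops, roles):
    """Return (url, body) for a subgraph query; body field names are assumed."""
    body = {"node": node, "hops": hops, "roles": list(roles)}
    return f"{BASE_URL}/mcp/query", body


edge_request = edges_url("MulRightInv.mul_right_inv", role="proof")
query_url, query_body = subgraph_query("MulRightInv.mul_right_inv", 1, ["sig", "proof"])
```

The returned URL and body can be passed to any HTTP library; separating URL construction from the network call keeps the builders easy to test offline.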
6. Evaluation and Comparative Effectiveness
LeanGraph’s efficacy was assessed on formal concept retrieval tasks using the MathlibQR fair-810 benchmark. Comparative results:
| Approach | Recall@10 | nDCG@10 | Notes |
|---|---|---|---|
| LeanSearch v2 + reranker | 0.780 | 0.623 | LM-based rerank |
| LeanGraph (embedding+graph) | 0.775 | 0.548 | No LM reranker |
LeanGraph’s “recall-optimized” configuration, which incorporates slogan and name/signature embeddings, a small lexical index, HyDE query rewriting, a wider ANN shortlist, and one-hop graph expansion, achieves Recall@10 within 0.5 percentage points of LeanSearch v2 (0.775 vs. 0.780) without using LLM reranking, although its nDCG@10 is lower (0.548 vs. 0.623).
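For readers unfamiliar with the two metrics, the sketch below computes Recall@k and nDCG@k under binary relevance, which is a common convention. The benchmark's exact relevance and gain definitions are not stated here, so these functions are an assumption rather than the evaluation code used for the table; the toy ranking is invented for illustration.

```python
import math


def recall_at_k(ranked, relevant, k=10):
    """Fraction of relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


# Toy query: both relevant items are retrieved, but not at the top.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
toy_recall = recall_at_k(ranked, relevant)
toy_ndcg = ndcg_at_k(ranked, relevant)

# Gap between the two reported Recall@10 values, in percentage points.
recall_gap_pp = (0.780 - 0.775) * 100
```

The toy query shows why the two metrics can diverge: recall is perfect because both relevant items are in the top 10, while nDCG is penalized for ranking them second and fourth. The same effect is consistent with a system matching a reranker on Recall@10 while trailing on nDCG@10.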
Key conclusions:
- The type-aware, role-labeled extraction yields a denser and less noisy dependency graph than parser-based or Python-extractor pipelines (e.g., lean4export).
- Elaboration-level access minimizes false negatives in dependency recovery, especially for inter-class inheritance and fully elaborated proof terms.
- The six role labels afford granular control of graph traversal and filtering for downstream consumers, such as isolating signature-only subgraphs or proof-focused analyses.
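Role-restricted traversal of the kind described in the last point can be sketched as a one-hop expansion of a retrieval shortlist. This is a hypothetical illustration: the function name, the toy edge list, and the choice to follow only outgoing edges are assumptions, not details of LeanGraph's retrieval pipeline.

```python
def expand_one_hop(shortlist, edges, roles):
    """Append targets reachable from shortlist items via one edge in `roles`.

    Only outgoing edges of the original shortlist are followed, and the
    original order is preserved with new targets appended without duplicates.
    """
    seeds = set(shortlist)
    seen = set(shortlist)
    expanded = list(shortlist)
    for source, target, role in edges:
        if source in seeds and role in roles and target not in seen:
            seen.add(target)
            expanded.append(target)
    return expanded


# Toy (source, target, role) triples.
EDGES = [
    ("mul_right_inv", "Group", "sig"),
    ("mul_right_inv", "mul_assoc", "proof"),
    ("mul_right_inv", "Group", "docref"),
    ("Group", "Monoid", "extends"),
]

sig_view = expand_one_hop(["mul_right_inv"], EDGES, roles={"sig"})
proof_view = expand_one_hop(["mul_right_inv"], EDGES, roles={"sig", "proof"})
```

Restricting `roles` to `{"sig"}` yields a signature-only neighborhood, while adding `"proof"` also pulls in lemmas used inside the proof term. `Monoid` is never added because it is two hops away from the seed.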
7. Applications and Significance
LeanGraph provides essential infrastructure for numerous formal mathematics and proof engineering applications:
- Statement-level search and retrieval: Enabling fast, precise queries over 388k+ formal statements with detailed dependency expansion.
- Attribution and provenance tracing: Fine-grained edges link theorems, definitions, and structures, facilitating automated attribution mapping.
- Cross-library and cross-format bridging: By defining a unified edge schema and supporting programmatic lookup, LeanGraph underpins efforts to unify formal and informal mathematical corpora (cf. TheoremGraph, theoremsearch.com).
- Retrieval-augmented reasoning: Dense graph structure allows efficient subgraph extraction for retrieval-augmented generation, theorem proving, and machine learning pipelines.
A plausible implication is that the formal rigor and semantic richness of LeanGraph make it a superior substrate for both low-level proof tooling and high-level automated reasoning systems, facilitating advances in formal mathematics engineering and mathematical AI (Kurgan et al., 24 Jun 2026).