
SITA: Structure-to-Instance Autoformalization

Updated 16 November 2025
  • The paper introduces SITA, an automated framework that transforms abstract mathematical structures into concrete Lean instances via template-based formalization.
  • SITA leverages LLM-generated code skeletons and iterative error-fix procedures to ensure both syntactic correctness and semantic faithfulness in formal proofs.
  • By combining Lean’s typeclass mechanism with the structure-preserving decomposition seen in ProofFlow and SITA-R1, the framework improves the throughput and accuracy of formalized theorem instantiation.

Structure-to-Instance Theorem Autoformalization (SITA) is an automated framework designed for rigorously formalizing the instantiation of abstract mathematical theories in concrete settings, specifically within the Lean proof assistant. SITA transforms template-like formal modules—consisting of definitions, assumptions, operations, and theorems—into verified formalizations of problem-specific instances, capitalizing on both LLMs and feedback-rich refinement procedures to achieve syntactic and semantic correctness. Different instantiations of SITA have emerged, notably with ProofFlow for structural fidelity in stepwise proof autoformalization (Cabral et al., 13 Oct 2025) and the generic SITA-R1 pipeline for algorithmic theorem instantiation in optimization (Li et al., 13 Nov 2025).

1. Formalization of Abstract Structures as Templates

Mathematical theories often package “structure” as modular templates that can be reused and instantiated. In SITA, an abstract structure is encoded as a four-tuple $\mathcal S = \langle \mathcal D, \mathcal O, \mathcal C, \mathcal T \rangle$, where:

  • $\mathcal D$ (“Definitions”): primitive objects and axioms,
  • $\mathcal O$ (“Operations”): algorithms or maps built on $\mathcal D$,
  • $\mathcal C$ (“Conditions”): assumptions (e.g. convexity, Lipschitz regularity),
  • $\mathcal T$ (“Theorems”): propositions derived under $\mathcal C$.

In Lean, these are encoded as type classes. For example, a gradient descent convergence theorem on a composite objective is implemented as:

    class composite_pro (f h : E → ℝ)
    class pg (pro : composite_pro f h) (x₀ : E)
    theorem pg_converge ... := by sorry

$\mathcal D$ is captured by composite_pro, $\mathcal O$ by pg, and $\mathcal T$ by pg_converge, with the assumptions $\mathcal C$ encoded as theorem hypotheses.

2. Instance Generation and Typeclass Integration

SITA operationalizes the instantiation process by prompting an LLM to output Lean definitions for the concrete problem, its parameterization, instance declarations linking it to the abstract template, and all algorithmic constructs. For example, instantiating to Lasso yields:

    class Lasso_pro (A : Matrix ...) (b : Fin m → ℝ) (μ : ℝ) ...
    def Lasso_pro.f ... -- squared error
    def Lasso_pro.g ... -- ℓ₁-norm term
    instance (pro : Lasso_pro ...) : composite_pro pro.f pro.g := {}
    instance (alg : pg_Lasso pro x₀) : pg ... := ...

Lean’s typeclass mechanism ensures that generic theorems (like pg_converge) apply to these instances once all conditions are discharged. Verified instantiation therefore requires a proof-of-assumption lemma for each side condition, such as convexity and Lipschitz regularity:

    lemma Lasso_pro.ConvexOn_f (pro : Lasso_pro ...) : ConvexOn ℝ univ pro.f := ...
    lemma Lasso_pro.Lipschitz_f ...
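
The bookkeeping behind "applicable once all conditions are discharged" can be sketched in a few lines of Python. This is only an illustrative model, not part of SITA: the `Template` class, the helper functions, and the condition labels `convex_f` and `lipschitz_f` are hypothetical names chosen for the example; only `composite_pro`, `pg`, and `pg_converge` come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Abstract structure S = <D, O, C, T>, stored as tuples of identifiers."""
    definitions: tuple[str, ...]
    operations: tuple[str, ...]
    conditions: tuple[str, ...]
    theorems: tuple[str, ...]


def open_obligations(template: Template, proved: set[str]) -> list[str]:
    """Return the template conditions the instance has not yet discharged."""
    return [c for c in template.conditions if c not in proved]


def applicable_theorems(template: Template, proved: set[str]) -> list[str]:
    """Generic theorems become usable only when no obligation remains open."""
    return list(template.theorems) if not open_obligations(template, proved) else []


# Hypothetical condition labels; the real side conditions are Lean propositions.
pg_template = Template(
    definitions=("composite_pro",),
    operations=("pg",),
    conditions=("convex_f", "lipschitz_f"),
    theorems=("pg_converge",),
)

print(open_obligations(pg_template, {"convex_f"}))
print(applicable_theorems(pg_template, {"convex_f", "lipschitz_f"}))
```

In Lean itself this check is not a separate step: an instance declaration or theorem application simply fails to type-check while a hypothesis is missing.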

3. LLM-Based Autoformalization and Feedback Refinement

The SITA pipeline proceeds through structured stages:

  • Skeleton Construction: An LLM generates outline code from the template and the instance description.
  • Error-Fix and Proof Refinement: Deterministic syntax corrections, cache-driven error repair (using an error-message–fix knowledge base $\mathcal K$), and iterative proof synthesis for goals closed with sorry, cycling until the file type-checks and the proofs are accepted.
  • Postprocessing: Any remaining sorry occurrences are replaced with placeholders or minimal well-typed stubs, and a natural-language back-translation is produced for documentation.

High-level pseudocode for the workflow:

    Generate skeleton via LLM
    Lean.check
    Apply ErrorFix
    For each sorry: LLMProof, Lean.check_proof (loop with retry)
    Final type-check
    Return file
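
The check-and-repair loop at the core of this workflow can be sketched in Python as follows. It is a minimal sketch under stated assumptions, not the authors' implementation: `refine`, `toy_check`, `toy_fixes`, and `toy_llm` are hypothetical names, the checker and the LLM are passed in as plain callables, and the knowledge base $\mathcal K$ is modeled as a dictionary from error-message substrings to repair functions.

```python
from typing import Callable, Optional


def refine(
    source: str,
    check: Callable[[str], Optional[str]],   # error message, or None if the file checks
    fixes: dict[str, Callable[[str], str]],  # stand-in for K: error substring -> repair
    llm_repair: Callable[[str, str], str],   # fallback: (source, error) -> new source
    max_rounds: int = 5,
) -> tuple[str, bool]:
    """Repair `source` until `check` passes or the round budget is spent."""
    for _ in range(max_rounds):
        error = check(source)
        if error is None:
            return source, True
        # Prefer a cached, deterministic repair; otherwise fall back to the LLM.
        cached = next((fix for key, fix in fixes.items() if key in error), None)
        source = cached(source) if cached else llm_repair(source, error)
    return source, check(source) is None


# Toy stand-ins operating on strings; a real setup would invoke Lean and an LLM.
def toy_check(src: str) -> Optional[str]:
    if "ConvexOn" not in src:
        return "unknown identifier 'Convex'"
    if "sorry" in src:
        return "declaration uses 'sorry'"
    return None


toy_fixes = {"unknown identifier 'Convex'": lambda s: s.replace("Convex ", "ConvexOn ")}


def toy_llm(src: str, error: str) -> str:
    return src.replace("sorry", "trivial")


final, ok = refine("lemma l : Convex f := sorry", toy_check, toy_fixes, toy_llm)
print(ok, final)
```

In this toy run the first error is resolved from the cache and the second by the fallback, which mirrors the ordering described above: cheap known repairs are tried first, and LLM calls are used only for errors the knowledge base does not cover.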

4. Structure-Preserving Proof Mapping: ProofFlow and Beyond

The SITA philosophy is also realized in ProofFlow (Cabral et al., 13 Oct 2025), which emphasizes structural fidelity in stepwise proof autoformalization. It models the proof as a directed acyclic graph (DAG)

$$G = (V, E)$$

where the nodes $V$ consist of theorem conditions ($V_{TC}$), definitions ($V_D$), intermediate lemmas ($V_L$), and theorem solutions ($V_{TS}$), and the edges $E$ encode logical dependencies. Each natural-language step $S_i$ maps to an intermediate lemma $L_i$ with dependencies $D_i$:

$$f: \{S_i\}_{i=1}^n \to \{L_i\}_{i=1}^n$$

with enforced self-containment and faithfulness, realized through targeted LLM prompting and iterative type-checking. The structure ensures each lemma’s context is minimal yet sufficient, reducing search complexity and preventing shortcut tactics.
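
As an illustration of the dependency structure, the following Python sketch orders the nodes of a small proof DAG and extracts the minimal context for each lemma. The node names and the `context` helper are hypothetical, and the sketch uses the standard-library `graphlib` module rather than anything from ProofFlow itself.

```python
from graphlib import TopologicalSorter

# node -> set of direct dependencies (a made-up five-node proof)
deps = {
    "tc1": set(),            # theorem condition
    "def1": set(),           # definition
    "l1": {"tc1", "def1"},   # intermediate lemma using both
    "l2": {"l1"},            # intermediate lemma using only l1
    "ts": {"l1", "l2"},      # theorem solution
}

# Raises graphlib.CycleError if the dependencies are not acyclic.
order = list(TopologicalSorter(deps).static_order())


def context(node: str) -> list[str]:
    """Minimal context for a node: exactly its declared dependencies,
    listed in an order in which they can be formalized."""
    return [n for n in order if n in deps[node]]


print(order)
print(context("ts"))
```

Restricting each lemma to `context(node)` is what keeps the formalization of a step from silently relying on facts that the natural-language proof did not cite at that point.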

5. Evaluation Metrics and Benchmarking

Outcomes of SITA autoformalization are measured by multi-axis metrics. Notably in ProofFlow:

  • Syntactic correctness $c_i$: binary compilation success per node,
  • Semantic faithfulness $f_i \in [0,1]$: LLM-judged preservation of meaning,
  • Structural fidelity $s_i$: correctness of dependency structure,

with a composite metric

$$\text{ProofScore} = \frac{1}{n} \sum_{i=1}^{n} f_i\,c_i\,s_i$$

which rewards full alignment of all three criteria. In (Cabral et al., 13 Oct 2025), ProofFlow achieves $\text{ProofScore} = 0.545$, substantially improving on the full-proof (0.123) and step-proof (0.072) baselines. In SITA-R1 (Li et al., 13 Nov 2025), rates of file-level success (57.14%) and definition/theorem/instance correctness (>93%) materially surpass direct generation, with ablation studies confirming the necessity of error repair and iterative refinement.

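| System          | ProofScore | File-level Success (%) | Def/Thm/Instance (%) |
|-----------------|------------|------------------------|----------------------|
| ProofFlow DAG   | 0.545      | 37.5                   | 93.9 / N/A / N/A     |
| ProofFlow noDAG | 0.417      | 35.3                   | N/A                  |
| Full-Proof      | 0.123      | 14.1                   | N/A                  |
| SITA-R1         | N/A        | 57.14                  | 93.8 / 95.6 / 95.4   |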

A plausible implication is that structural decomposition and template-based generation offer critical advantages for both correctness and coverage.
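
To make the ProofScore formula in this section concrete, here is a small Python sketch that evaluates it on made-up per-step scores. The function name and the sample values are illustrative only; $c_i$ is taken as 0 or 1 as stated, and $s_i$ is assumed here to be 0 or 1 as well, since the text does not specify its range.

```python
def proof_score(steps):
    """Mean of f_i * c_i * s_i over all steps.

    Each step is a tuple (f_i, c_i, s_i): semantic faithfulness in [0, 1],
    compilation success (0 or 1), and structural fidelity (assumed 0 or 1).
    """
    steps = list(steps)
    if not steps:
        return 0.0
    return sum(f * c * s for f, c, s in steps) / len(steps)


# Made-up scores for a four-step proof.
example = [
    (1.0, 1, 1),  # faithful, compiles, correct dependencies
    (0.8, 1, 1),  # mostly faithful, compiles, correct dependencies
    (0.9, 0, 1),  # faithful but does not compile -> contributes 0
    (1.0, 1, 0),  # compiles but has the wrong dependency structure -> contributes 0
]
print(proof_score(example))
```

Only the first two steps contribute, so the score is (1.0 + 0.8) / 4 = 0.45. Because the three factors are multiplied, a step that fails any single criterion earns nothing, however well it does on the others.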

6. Limitations, Challenges, and Prospects

Current SITA implementations face several technical barriers:

  • Complex type coercions: LLMs struggle with advanced Lean type translations (e.g., matrices to ContinuousLinearMap).
  • Hard side-conditions: Properties like the Kurdyka–Łojasiewicz inequality require symbolic reasoning typically infeasible for end-to-end generation.
  • Scaling with proof complexity: Large DAGs or templates with intricate sublemma hierarchies tax prompting and topological reasoning.
  • Semantic slips: Over 35% of step failures in ProofFlow are due to misinterpretation of dependencies or missed assumptions.

Potential avenues for strengthening SITA include reinforcement learning for semantic preservation, integrated tactic/proof search, human-in-the-loop disambiguation, broader template libraries, and expansion to domains beyond convex optimization (e.g., graph theory, algebraic structures). Extending the error knowledge base through continual learning and augmenting with domain-specific heuristics is anticipated to advance full-file success rates above 80%.

7. Significance and Comparative Context

SITA distinguishes itself from tactic-first and linear step-proof strategies by maintaining both minimal and precise context per lemma, systematically enforcing structural fidelity. Template instantiation enables high-throughput derivation of verified concrete theorems, reducing manual intervention in large families of problems—a prevalent task in both mathematical research and formal-methods software engineering. SITA’s pipeline demonstrates scalable integration of LLM automation with interactive theorem proving, setting a reference model for future automated formalization frameworks.
