Direct measurement of semantic alignment between natural-language and Lean theorems

Develop a method to directly measure semantic alignment between natural-language mathematical theorems and their Lean 4 formal counterparts, establishing an automated evaluation criterion that determines whether a Lean 4 theorem faithfully represents the meaning of its originating natural-language theorem without relying on indirect proxies.

Background

The paper introduces two evaluation criteria for proof auto-formalization: type correctness (Lean type-checking) and semantic correctness (equivalence between generated and reference Lean theorems). Because a direct comparison of the natural-language theorem to the Lean theorem is difficult, the authors reduce semantic evaluation to proving a biconditional between the generated Lean theorem and a gold-standard reference Lean theorem, assuming the latter faithfully formalizes the original natural-language statement.
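The biconditional reduction described above can be sketched in Lean 4 as follows. Both statements are hypothetical stand-ins invented for illustration: the left side plays the role of a generated theorem, and the right side the role of a gold-standard reference for "addition of natural numbers is commutative". Neither is taken from the paper or its benchmark.

```lean
-- Hypothetical example: a "generated" statement (left) and a "reference"
-- statement (right) that differ in variable names and argument order.
-- Proving the biconditional shows the two Lean statements are logically
-- equivalent. It says nothing about whether the reference itself matches
-- the natural-language theorem; that is assumed.
example :
    (∀ x y : Nat, y + x = x + y) ↔ (∀ a b : Nat, a + b = b + a) :=
  ⟨fun h a b => h b a, fun h x y => h y x⟩
```

The proof term only re-instantiates the quantifiers. The final comment in the block marks the gap the open problem targets: the step from the natural-language theorem to the reference is taken on trust.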

They note that directly measuring the semantic alignment between a natural-language theorem and its Lean formalization remains unsolved. This motivates their pragmatic approach: checking definitional equality and bounded propositional equality in Lean using a restricted set of tactics. A direct, automated measure of NL–Lean semantic equivalence would remove the reliance on such reductions and give a more principled evaluation.
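The two kinds of check mentioned above can be illustrated with a minimal Lean 4 sketch. The statements and the tactics used (`constructor`, `intro`, `rw`, `exact`) are assumptions chosen for illustration. The paper's actual restricted tactic set is not reproduced here.

```lean
-- Definitional equality: the two sides differ only in the bound variable's
-- name, so reflexivity of `↔` closes the goal with no further reasoning.
example : (∀ n : Nat, n + 0 = n) ↔ (∀ m : Nat, m + 0 = m) :=
  Iff.rfl

-- Propositional equality with a short, bounded proof: the two sides are not
-- syntactically identical, but one rewrite with commutativity of addition
-- bridges them in each direction.
example : (∀ n : Nat, n + 1 > 0) ↔ (∀ n : Nat, 1 + n > 0) := by
  constructor
  · intro h n
    rw [Nat.add_comm]   -- goal `1 + n > 0` becomes `n + 1 > 0`
    exact h n
  · intro h n
    rw [Nat.add_comm]   -- goal `n + 1 > 0` becomes `1 + n > 0`
    exact h n
```

A bounded check of this kind can fail on statements that are equivalent but need a longer proof, so a failed check does not by itself show that two formalizations differ in meaning.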

References

While directly measuring the semantic alignment between a NL theorem and a Lean theorem is an unsolved challenge, showing the logical equivalence of two Lean theorems is a tractable task.

ProofBridge: Auto-Formalization of Natural Language Proofs in Lean via Joint Embeddings (2510.15681 - Jana et al., 17 Oct 2025) in Appendix, Section "Semantic Equivalence of Lean theorems"