Precise Form of Intermediate Reasoning for Formal Verification

Determine the precise structure and representation of intermediate reasoning that large language models should produce to enable automated formal verification of program specifications in the Dafny language, identifying what intermediate artifacts (beyond natural-language chain-of-thought) make verification feasible with Dafny’s Z3-backed verifier.

Background

The paper minimizes human priors and removes natural-language chain-of-thought from its pipeline, relying instead on Dafny’s automated verifier to provide reinforcement learning signals. While transformers with chain-of-thought can simulate a universal Turing machine, the authors note that the exact form of intermediate reasoning needed for formal verification is not established.

This uncertainty motivates their decision to avoid natural-language reasoning traces and to focus on generating complete Dafny programs with specifications. Clarifying the required intermediate reasoning form would guide the design of models and training regimes that can systematically produce verifiable specifications and proofs within Dafny.
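To make "intermediate artifacts" concrete, the following minimal Dafny sketch (illustrative only, not drawn from the paper) shows the kind of auxiliary annotations a model must produce beyond the specification itself: the loop invariants and the termination metric are the intermediate reasoning steps that let Dafny's Z3-backed verifier discharge the proof automatically.

```dafny
// Specification: the ensures clause states the intended result.
method SumUpTo(n: nat) returns (s: nat)
  ensures s == n * (n + 1) / 2
{
  s := 0;
  var i := 0;
  while i < n
    // Intermediate artifacts: without these invariants and the
    // decreases clause, the verifier cannot prove the postcondition.
    invariant 0 <= i <= n
    invariant s == i * (i + 1) / 2
    decreases n - i
  {
    i := i + 1;
    s := s + i;
  }
}
```

Deleting either invariant makes verification fail even though the code is unchanged, which is precisely why the form of such intermediate content, rather than natural-language chain-of-thought, is the open design question.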

References

Although Transformer models augmented with CoT have been shown to be able to simulate a universal Turing machine \citep{Schuurmans2024}, which lays the foundation for code emulation with LLMs, the precise form of intermediate reasoning required for formal verification remains an open question.