
VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search (2402.08147v2)

Published 13 Feb 2024 in cs.SE, cs.AI, cs.LG, cs.LO, and cs.PL

Abstract: LLMs can generate useful code, but often the code they generate cannot be trusted to be sound. In this paper, we present VerMCTS, an approach to begin to resolve this issue by generating verified programs in Dafny and Coq. VerMCTS uses a logical verifier in concert with an LLM to guide a modified Monte Carlo Tree Search (MCTS). This approach leverages the verifier to gain intermediate feedback inside the search algorithm by checking partial programs at each step to estimate an upper bound on the value function. To measure the performance of VerMCTS, we develop a new suite of multi-step verified programming problems in Dafny and Coq. In terms of pass@T, a new metric which computes the pass rate given a budget of T tokens sampled from the LLM, VerMCTS leads to more than a 30% absolute increase in average pass@5000 across the suite over repeated sampling from the base LLM. Our code and benchmarks are available at https://github.com/namin/LLM-verified-with-monte-carlo-tree-search.
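The pass@T metric described in the abstract can be illustrated with a minimal sketch. It assumes each independent run is summarized by the number of LLM tokens sampled when a fully verified program was first produced, or `None` if the run never succeeded. The function name `pass_at_t` and this data layout are hypothetical; the paper's actual estimator may differ in detail.

```python
from typing import Optional, Sequence


def pass_at_t(tokens_to_success: Sequence[Optional[int]], budget: int) -> float:
    """Fraction of runs that produced a verified program within `budget` tokens.

    Assumption (not from the paper): each entry is the cumulative number of
    tokens sampled from the LLM when the run first succeeded, or None if the
    run never produced a verified program.
    """
    if not tokens_to_success:
        raise ValueError("need at least one run")
    # A run counts as a pass only if it succeeded at or under the token budget.
    solved = sum(1 for t in tokens_to_success if t is not None and t <= budget)
    return solved / len(tokens_to_success)


# Hypothetical example data: five runs on one problem.
runs = [1200, None, 4800, 7300, 300]
print(pass_at_t(runs, 5000))  # 3 of the 5 runs succeed within 5000 tokens
```

Averaging this quantity over every problem in a benchmark suite would give a suite-level figure comparable in spirit to the "average pass@5000" quoted above.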
