Picard: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
This paper presents Picard, a method for constraining the decoding of large language models to valid formal-language outputs such as SQL. The primary challenge it addresses is that pre-trained models readily generate invalid SQL, which limits their usability in applications demanding precision and adherence to a formal specification.
Methodology
Picard applies incremental parsing techniques to guide auto-regressive decoding. Unlike prior methods that require custom vocabularies or architectures, Picard integrates with existing LLM frameworks and can be switched on at inference time, with no changes to the model's pre-training or fine-tuning. This compatibility extends to large transformers such as T5.
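The point that no architectural changes are needed can be illustrated with a toy decode loop. This is a sketch under our own assumptions, not the paper's API: the names `constrained_greedy_decode`, `score_fn`, and `is_valid_prefix` are hypothetical, the model is treated as an opaque scoring function, and the constraint is just a veto on text prefixes.

```python
# Hypothetical sketch of inference-time constrained decoding (names are
# ours, not the paper's). The model is an opaque scoring function; the
# constraint only vetoes text prefixes, so no retraining is required.

def constrained_greedy_decode(score_fn, detok, is_valid_prefix, vocab,
                              eos="</s>", max_len=32):
    """score_fn(tokens) -> {token: score}; detok(tokens) -> text;
    is_valid_prefix(text) -> bool rejects prefixes no valid SQL extends."""
    out = []
    for _ in range(max_len):
        scores = score_fn(out)
        # try candidates best-first; keep the highest-scoring token whose
        # extended prefix is still admissible (eos is accepted as-is here;
        # a fuller sketch would also check that the finished query parses)
        for tok in sorted(vocab, key=lambda t: scores.get(t, float("-inf")),
                          reverse=True):
            if tok == eos or is_valid_prefix(detok(out + [tok])):
                out.append(tok)
                break
        else:
            break  # no admissible continuation (beam search would backtrack)
        if out[-1] == eos:
            break
    return detok(out[:-1] if out and out[-1] == eos else out)
```

In Picard proper, the validity check is applied to the top-k tokens of each beam hypothesis rather than to the whole vocabulary, which keeps the overhead per decoding step small.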
The method operates by incrementally filtering out inadmissible tokens during decoding, ensuring the generated SQL adheres to syntactic and semantic constraints. Picard leverages monadic parser combinators for incremental parsing and supports multiple operation modes, including lexing and parsing with or without semantic guards. These modes progressively tighten the constraints enforced on token predictions, refining the validity of the generated queries.
Experimental Results
Empirical evaluations on the Spider and CoSQL datasets demonstrate the efficacy of Picard in improving the performance of fine-tuned T5 models. Particularly notable are the results with the T5-3B model, which achieves state-of-the-art performance on both datasets when augmented with Picard. For example, exact-set-match accuracy on the Spider test set reaches 71.9%, accompanied by execution accuracy of 75.1%.
Picard sharply reduces invalid SQL generation: execution errors drop from 12% in the unconstrained setup to just 2% with Picard enabled. This improvement does not require excessively large beams, in contrast to other validity-filtering approaches that rely on beam sizes of 16 or more.
Implications and Future Work
The introduction of Picard marks a notable advance in constrained decoding for large pre-trained language models, providing a robust solution for generating formal languages like SQL. The implications are substantial for enterprise applications where precision is paramount.
Future research could extend Picard’s capabilities with additional checks and constraints to further align generated queries with complex schema requirements. Moreover, exploring Picard’s applicability to domains beyond SQL could widen its utility in various formal language parsing tasks.
This work contributes significantly to the theory and practice of AI, showcasing a practical method for enhancing the reliability of LLMs in real-world applications while maintaining compatibility with existing model architectures and workflows.