Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access (2401.09967v4)
Abstract: Constrained decoding, a technique for enforcing constraints on LLM outputs, offers a way to control text generation without retraining or architectural modifications. Its application is, however, typically restricted to models that give users access to next-token distributions (usually via softmax logits), which poses a limitation for blackbox LLMs. This paper introduces sketch-guided constrained decoding (SGCD), a novel approach to constrained decoding for blackbox LLMs, which operates without access to the logits of the blackbox LLM. SGCD utilizes a locally hosted auxiliary model to refine the output of an unconstrained blackbox LLM, effectively treating this initial output as a "sketch" for further elaboration. This approach is complementary to traditional logit-based techniques and enables the application of constrained decoding in settings where full model transparency is unavailable. We demonstrate the efficacy of SGCD through experiments in closed information extraction and constituency parsing, showing how it enhances the utility and flexibility of blackbox LLMs for complex NLP tasks.
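To make the two-stage idea concrete, the Python sketch below shows one plausible way to realize the SGCD loop. It is a minimal illustration under stated assumptions, not the authors' implementation: `blackbox_generate` is a hypothetical stand-in for a remote API call without logit access, `allowed_token_ids` is a hypothetical stand-in for an incremental constraint checker (e.g., a grammar parser), and the auxiliary model is a placeholder Hugging Face checkpoint decoded greedily with logit masking.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Local auxiliary model whose logits we CAN constrain (placeholder checkpoint).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
aux_model = AutoModelForCausalLM.from_pretrained("gpt2")

def blackbox_generate(prompt: str) -> str:
    """Hypothetical stand-in for a remote blackbox LLM call (no logit access)."""
    raise NotImplementedError("replace with an actual API call")

def allowed_token_ids(generated_ids: list[int]) -> list[int]:
    """Hypothetical stand-in for an incremental constraint checker that returns
    the token ids permitted to follow the tokens generated so far."""
    raise NotImplementedError("replace with a real constraint, e.g. a grammar parser")

def sgcd(prompt: str, max_new_tokens: int = 128) -> str:
    # Step 1: obtain an unconstrained "sketch" from the blackbox LLM.
    sketch = blackbox_generate(prompt)
    # Step 2: the local auxiliary model rewrites the sketch under the
    # constraint, masking out disallowed next tokens at every decoding step.
    ids = tokenizer(f"{prompt}\nSketch: {sketch}\nOutput: ",
                    return_tensors="pt").input_ids[0].tolist()
    prefix_len = len(ids)
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = aux_model(torch.tensor([ids])).logits[0, -1]
        mask = torch.full_like(logits, float("-inf"))
        mask[allowed_token_ids(ids[prefix_len:])] = 0.0  # keep only legal tokens
        next_id = int(torch.argmax(logits + mask))
        ids.append(next_id)
        if next_id == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[prefix_len:], skip_special_tokens=True)
```

The point the sketch makes is the division of labor: the constraint is enforced only on the cheap local model, where logits are available, while the blackbox LLM contributes content through the unconstrained sketch.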
Authors: Saibo Geng, Berkay Döner, Chris Wendler, Martin Josifoski, Robert West