Don't Fine-Tune, Decode: Syntax Error-Free Tool Use via Constrained Decoding (2310.07075v3)

Published 10 Oct 2023 in cs.CL and cs.AI

Abstract: Instruction-tuned LLMs excel at many tasks but often fail to use external tools due to complicated and unfamiliar syntax constraints. While extensive fine-tuning and prompting can mitigate the issue, these approaches are expensive and hard to generalize. Furthermore, because syntax constraints are only learned implicitly during fine-tuning, models still make frequent syntax errors. Motivated by the fact that these constraints can be better satisfied explicitly with constrained decoding, we propose TOOLDEC, a decoding algorithm using finite state machines to force LLMs to follow tool syntax. Our experiments show that TOOLDEC eliminates all syntax errors, achieving significantly better performance on various base models and benchmarks. More surprisingly, when applied to generalist out-of-the-box LLMs such as Mistral-Instruct, TOOLDEC improves their accuracy in tool use from the initial 0% to an impressive 52%, matching the performance of specialized fine-tuned models such as ToolLLM.
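To make the core idea concrete, the sketch below illustrates FSM-constrained decoding in the spirit of TOOLDEC: at every decoding step, tokens that would violate the tool-call grammar are masked out before selection, so the generated call is syntactically valid by construction. This is a minimal illustration, not the paper's implementation; the toy vocabulary, grammar, and stand-in scoring function are all assumptions made for the example.

```python
# Minimal sketch of FSM-constrained decoding (illustrative, not the paper's code).
# A finite state machine over a toy tool-call grammar masks out every token that
# would break the syntax before the next token is chosen, so the emitted call is
# always well-formed: tool_name "(" arg "," arg ")" <eos>.

import numpy as np

# Toy vocabulary: tool names, punctuation, argument tokens, end-of-sequence.
VOCAB = ["add", "multiply", "(", ")", ",", "3", "7", "42", "<eos>"]
TOK = {t: i for i, t in enumerate(VOCAB)}

# Each FSM state maps to the set of token ids that are legal next.
FSM = {
    "start":      {TOK["add"], TOK["multiply"]},
    "after_name": {TOK["("]},
    "arg1":       {TOK["3"], TOK["7"], TOK["42"]},
    "after_arg1": {TOK[","]},
    "arg2":       {TOK["3"], TOK["7"], TOK["42"]},
    "after_arg2": {TOK[")"]},
    "done":       {TOK["<eos>"]},
}
NEXT_STATE = {
    "start": "after_name", "after_name": "arg1", "arg1": "after_arg1",
    "after_arg1": "arg2", "arg2": "after_arg2", "after_arg2": "done",
    "done": None,
}

def fake_lm_logits(prefix):
    """Stand-in for a language model: random scores over the toy vocabulary."""
    rng = np.random.default_rng(len(prefix))
    return rng.normal(size=len(VOCAB))

def constrained_decode():
    state, prefix = "start", []
    while state is not None:
        logits = fake_lm_logits(prefix)
        # Core idea: set the score of every syntactically illegal token to -inf,
        # so decoding can only pick tokens the FSM allows in the current state.
        mask = np.full(len(VOCAB), -np.inf)
        mask[list(FSM[state])] = 0.0
        tok_id = int(np.argmax(logits + mask))
        prefix.append(VOCAB[tok_id])
        state = NEXT_STATE[state]
    return " ".join(prefix)

if __name__ == "__main__":
    print(constrained_decode())  # e.g. "multiply ( 7 , 42 ) <eos>"
```

Because the mask is applied to the model's scores rather than to its weights, this kind of constraint needs no fine-tuning and can be swapped per tool by swapping the FSM, which is the property the abstract highlights.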

References (28)
  1. Guided open vocabulary image captioning with constrained beam search. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp.  936–945, Copenhagen, Denmark, September 2017. Association for Computational Linguistics. doi: 10.18653/v1/D17-1098. URL https://aclanthology.org/D17-1098.
  2. Improving language models by retrieving from trillions of tokens. In International conference on machine learning, pp. 2206–2240. PMLR, 2022.
  3. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
  4. Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv e-prints, pp.  arXiv–2211, 2022.
  5. Jason Eisner. Parameter estimation for probabilistic finite-state transducers. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp.  1–8, 2002.
  6. Edward Fredkin. Trie memory. Communications of the ACM, 3(9):490–499, 1960.
  7. PAL: Program-aided language models. In International Conference on Machine Learning, pp. 10764–10799. PMLR, 2023.
  8. Visual programming: Compositional visual reasoning without training. arXiv preprint arXiv:2211.11559, 2022.
  9. Retrieval augmented language model pre-training. In International conference on machine learning, pp. 3929–3938. PMLR, 2020.
  10. ToolkenGPT: Augmenting frozen language models with massive tools via tool embeddings. arXiv preprint arXiv:2305.11554, 2023.
  11. Lexically constrained decoding for sequence generation using grid beam search. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.  1535–1546, Vancouver, Canada, July 2017. Association for Computational Linguistics. doi: 10.18653/v1/P17-1141. URL https://aclanthology.org/P17-1141.
  12. KAMEL: Knowledge analysis with multitoken entities in language models. Automated Knowledge Base Construction, 2022.
  13. Design patterns and extensibility of REST API for networking applications. IEEE Transactions on Network and Service Management, 13(1):154–167, 2016.
  14. NeuroLogic decoding: (un)supervised neural text generation with predicate logic constraints. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 4288–4299, 2021.
  15. NeuroLogic A*esque decoding: Constrained text generation with lookahead heuristics. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 780–799, 2022.
  16. Augmented language models: a survey. arXiv preprint arXiv:2302.07842, 2023.
  17. CGMH: Constrained sentence generation by Metropolis-Hastings sampling. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pp. 6834–6842, 2019.
  18. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332, 2021.
  19. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35:27730–27744, 2022.
  20. TALM: Tool augmented language models. arXiv preprint arXiv:2205.12255, 2022.
  21. ToolLLM: Facilitating large language models to master 16000+ real-world APIs. arXiv preprint arXiv:2307.16789, 2023.
  22. Weighting finite-state transductions with neural context. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 623–633, 2016.
  23. Toolformer: Language models can teach themselves to use tools. arXiv preprint arXiv:2302.04761, 2023.
  24. HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face, 2023.
  25. RestGPT: Connecting large language models with real-world RESTful APIs, 2023.
  26. LLaMA: Open and efficient foundation language models, 2023.
  27. ReAct: Synergizing reasoning and acting in language models, 2023.
  28. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena, 2023.
Authors (4)
  1. Kexun Zhang (21 papers)
  2. Hongqiao Chen (1 paper)
  3. Lei Li (1293 papers)
  4. William Wang (38 papers)
Citations (4)