
Planning and Editing What You Retrieve for Enhanced Tool Learning (2404.00450v2)

Published 30 Mar 2024 in cs.CL, cs.AI, cs.IR, and cs.LG

Abstract: Recent advancements in integrating external tools with LLMs have opened new frontiers, with applications in mathematical reasoning, code generation, and smart assistants. However, existing methods rely on simple one-time retrieval strategies and fall short of effectively and accurately shortlisting relevant tools. This paper introduces a novel PLUTO (Planning, Learning, and Understanding for TOols) approach, encompassing the Plan-and-Retrieve (P&R) and Edit-and-Ground (E&G) paradigms. The P&R paradigm consists of a neural retrieval module for shortlisting relevant tools and an LLM-based query planner that decomposes complex queries into actionable tasks, enhancing the effectiveness of tool utilization. The E&G paradigm utilizes LLMs to enrich tool descriptions based on user scenarios, bridging the gap between user queries and tool functionalities. Experiment results demonstrate that these paradigms significantly improve recall and NDCG in tool retrieval tasks, surpassing current state-of-the-art models.

Enhanced Tool Learning in LLMs Through Planning and Editing

Introduction to PLUTO

The integration of external tools with LLMs extends the functionality of AI applications into new domains such as mathematical reasoning and smart assistants. Traditional methods rely on one-time retrieval strategies that often fail to account for the dynamism of real-world queries, leaving a gap between the user's needs and the tools retrieved. To bridge this gap, the paper introduces a novel framework, PLUTO (Planning, Learning, and Understanding for TOols), incorporating two paradigms: Plan-and-Retrieve (P&R) and Edit-and-Ground (E&G). Together, these paradigms aim to enhance the retrieval and utility of tools in responding to complex user queries.

Plan-and-Retrieve (P&R) Paradigm

The P&R paradigm pairs an LLM-based query planner with a neural retrieval module. It operates in three stages, sketched in code after the list:

  1. Decomposition: The query planner decomposes complex user queries into more manageable sub-queries.
  2. Retrieval: For each sub-query, the retriever module shortlists relevant tools from a pool of candidates.
  3. Evaluation: The effectiveness of the selected tools is continuously evaluated, and the planning strategy is adjusted to improve retrieval accuracy.
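
A minimal, runnable toy sketch of this loop follows. The planner and retriever here are deliberately simple stand-ins (splitting on "and", bag-of-words cosine similarity) for the paper's LLM-based planner and neural retriever; all names are illustrative, and the evaluation-driven re-planning stage is reduced to a comment.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    description: str


def bow_cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def decompose(query: str) -> list[str]:
    """Stage 1 (stand-in planner): split a compound query on 'and'."""
    return [part.strip() for part in query.split(" and ") if part.strip()]


def plan_and_retrieve(query: str, tools: list[Tool], k: int = 1) -> list[Tool]:
    """Stage 2: shortlist the top-k tools for each sub-query.

    Stage 3 (evaluation-driven re-planning) is omitted here: the paper
    feeds shortlist quality back to the planner to refine the decomposition.
    """
    selected: dict[str, Tool] = {}
    for sub_query in decompose(query):
        ranked = sorted(tools, key=lambda t: bow_cosine(sub_query, t.description),
                        reverse=True)
        for tool in ranked[:k]:
            selected[tool.name] = tool
    return list(selected.values())


tools = [
    Tool("weather_api", "get the current weather forecast for a city"),
    Tool("currency_api", "convert an amount of money between two currencies"),
]
query = "check the weather forecast in Paris and convert 100 USD to EUR"
print([t.name for t in plan_and_retrieve(query, tools)])
# ['weather_api', 'currency_api']
```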

Edit-and-Ground (E&G) Paradigm

The E&G paradigm improves tool descriptions so that they better reflect each tool's functionality in the context of user scenarios. It uses the context of user queries together with LLMs' world knowledge to optimize tool descriptions, making them more informative and aligned with real-world applications. The process involves two steps, with a sketch after the list:

  1. Evaluation of Existing Descriptions: Identifying under-informative tool descriptions based on retrieval performance.
  2. Optimization: Leveraging LLM capabilities to generate enriched tool descriptions that detail functionalities in relation to user scenarios.
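
A minimal sketch of this loop, assuming access to a text-generation function `generate(prompt)` (stubbed here so the snippet runs). The recall threshold, prompt wording, and function names are illustrative, not taken from the paper.

```python
def generate(prompt: str) -> str:
    """Stub LLM call; swap in a real model client in practice."""
    return ("Returns a multi-day weather forecast (temperature, precipitation) "
            "for a given city, suitable for travel-planning queries.")


def edit_and_ground(name: str, description: str,
                    missed_queries: list[str], recall: float,
                    threshold: float = 0.5) -> str:
    """Rewrite a tool description when its retrieval recall is too low."""
    # Step 1, evaluation: flag the description as under-informative if the
    # tool is rarely retrieved for the queries it should serve.
    if recall >= threshold:
        return description

    # Step 2, optimization: ask the LLM to enrich the description, grounding
    # it in the user queries the current description failed to match.
    prompt = (
        f"Tool: {name}\n"
        f"Current description: {description}\n"
        f"Queries this tool should match but did not: {missed_queries}\n"
        "Rewrite the description so it clearly covers these scenarios."
    )
    return generate(prompt)


print(edit_and_ground("weather_api", "weather data",
                      ["what's the forecast in Paris this week?"], recall=0.2))
```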

Key Results

The implementation of the PLUTO approach yielded significant improvements in tool retrieval tasks, outperforming current state-of-the-art models. Experiments demonstrated higher recall and normalized discounted cumulative gain (NDCG), indicating a more effective and accurate retrieval process. Downstream evaluation further suggested improvements in response accuracy and relevance, highlighting PLUTO's ability to handle complex queries.
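
For reference, the two reported metrics are typically computed as below. This is the standard binary-relevance formulation, not code from the paper; `ranked` is the retriever's ordered shortlist and `relevant` the set of gold tools for a query.

```python
import math


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the gold tools that appear in the top-k results."""
    return len(set(ranked[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, t in enumerate(ranked[:k]) if t in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0


ranked = ["weather_api", "news_api", "currency_api"]
relevant = {"weather_api", "currency_api"}
print(recall_at_k(ranked, relevant, k=2))  # 0.5
print(ndcg_at_k(ranked, relevant, k=2))    # ~0.613
```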

Practical and Theoretical Implications

PLUTO offers several advancements in the field of LLMs and tool integration, including:

  • Demonstrating the efficacy of planning and editing paradigms in enhancing tool retrieval.
  • Showcasing the flexibility of PLUTO in adapting to different retrieval engines.
  • Highlighting the potential of LLMs in automating and enriching tool descriptions based on real-world user scenarios.

Future Perspectives

While PLUTO marks a significant step forward, future research may focus on several areas:

  • Extending the PLUTO framework to multilingual settings to broaden its applicability.
  • Exploring further optimization techniques within the E&G paradigm to continually enhance tool descriptions.
  • Investigating the integration of PLUTo in more specialized domains such as healthcare or legal services, potentially unlocking new uses for LLM-enhanced tool learning.

Conclusion

The research introduces and validates PLUTO, a novel framework that significantly enhances tool learning in LLMs. By integrating the P&R and E&G paradigms, PLUTO not only improves the retrieval of relevant tools but also ensures that tool descriptions are optimized for practical applications. As a result, the framework stands as a promising advancement in integrating LLMs with external tools, offering improved effectiveness and adaptability across applications.

Authors
  1. Tenghao Huang
  2. Dongwon Jung
  3. Muhao Chen