
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents (2402.09205v2)

Published 14 Feb 2024 in cs.CL, cs.AI, and cs.HC

Abstract: Current LLM-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions. Although adept at devising strategies and performing tasks, these agents struggle with seeking clarification and grasping precise user intentions. To bridge this gap, we introduce Intention-in-Interaction (IN3), a novel benchmark designed to inspect users' implicit intentions through explicit queries. Next, we propose the incorporation of model experts as the upstream in agent designs to enhance user-agent interaction. Employing IN3, we empirically train Mistral-Interact, a powerful model that proactively assesses task vagueness, inquires user intentions, and refines them into actionable goals before starting downstream agent task execution. Integrating it into the XAgent framework, we comprehensively evaluate the enhanced agent system regarding user instruction understanding and execution, revealing that our approach notably excels at identifying vague user tasks, recovering and summarizing critical missing information, setting precise and necessary agent execution goals, and minimizing redundant tool usage, thus boosting overall efficiency. All the data and codes are released.


Summary

  • The paper introduces the Intention-in-Interaction (IN3) benchmark to quantitatively assess how well agents discern vague user instructions.
  • The paper presents Mistral-Interact, an expert model that actively queries users to extract hidden intentions before task execution.
  • Empirical results demonstrate that integrating Mistral-Interact reduces execution errors and improves overall task efficiency.

Enhancing LLM-Driven Agents with Implicit User Intention Understanding

Introduction

LLM-driven agents have advanced significantly in executing tasks directly from user instructions. However, these agents still struggle to solicit user participation effectively, especially when instructions are vague. This limitation frequently leads to "fake success" instances, where the outcome superficially satisfies the instruction but misses the user's true intention. To address this, the paper introduces the Intention-in-Interaction (IN3) benchmark, designed to surface users' implicit intentions through structured interaction and thus pave the way for more effective task execution.

User-Agent Interaction Gap

Despite the remarkable capabilities state-of-the-art LLMs display in text generation, code generation, and logical reasoning, their application in agent systems often fails to account for the nuanced and varied intentions different users might have. This oversight limits the robustness and efficiency of the agent: it can neither discern the actual intention behind a vague instruction nor engage the user effectively to clarify it. Current benchmarks for assessing agent designs do not consider the importance of clarifying user intentions, leaving a critical gap in the evaluation methodology.

Intention-in-Interaction Benchmark

To close this gap, the paper introduces the Intention-in-Interaction (IN3) benchmark. IN3 takes a user-centric perspective, providing a structured framework for quantitatively measuring how well an agent can detect task vagueness and interact with the user to uncover hidden intentions. The benchmark spans a wide range of task categories, each task annotated with its degree of vagueness and its missing critical details, simulating real-world scenarios in which users do not provide complete instructions.
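To make the annotation scheme concrete, the sketch below shows what an IN3-style benchmark entry might look like. The field names, importance scale, and example task are illustrative assumptions, not the released schema; the released data should be consulted for the actual format.

```python
# Hypothetical sketch of an IN3-style benchmark entry.
# Field names and the 1-3 importance scale are assumptions for illustration.
in3_entry = {
    "task": "Plan a trip for me next week.",
    "category": "travel planning",
    "is_vague": True,
    "missing_details": [
        {
            "description": "destination of the trip",
            "importance": 3,  # assumed scale: 1 (minor) to 3 (critical)
            "inquiry": "Where would you like to travel?",
            "options": ["domestic", "international", "no preference"],
        },
        {
            "description": "budget for the trip",
            "importance": 2,
            "inquiry": "What is your approximate budget?",
            "options": [],
        },
    ],
}

def needs_clarification(entry):
    """Flag tasks that are vague and miss at least one critical detail."""
    return entry["is_vague"] and any(
        d["importance"] >= 3 for d in entry["missing_details"]
    )

print(needs_clarification(in3_entry))
```

An entry like this lets an evaluator check both whether the agent correctly judges vagueness and whether its clarifying questions recover the annotated missing details.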

Enriching Agent-User Interaction

Building on the IN3 benchmark, the paper introduces an expert model designed specifically to interact with users and extract implicit intentions before any task is executed. This model, Mistral-Interact, is trained on simulated dialogues and employs strategies such as stating an explicit initial thought, querying with options, and accommodating diverse user tones. It distinguishes itself by actively questioning the user to fill gaps in the instruction, substantially improving task clarity before the agent begins execution.
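The upstream role of such an expert model can be sketched as a simple control loop: judge the task, ask clarifying questions until the intention is clear, then hand a refined goal downstream. In the sketch below, `query_model` is a mock stand-in for a call to an interaction expert like Mistral-Interact; its canned replies and the function names are assumptions made so the control flow is runnable, not the paper's implementation.

```python
# Minimal sketch of the upstream clarification loop, with the expert
# model mocked out. A real system would replace `query_model` with a
# call to a fine-tuned model such as Mistral-Interact.

def query_model(task, history):
    """Mock expert: ask a clarifying question first, then summarize.

    Returns a dict with an "action" ("ask" or "summarize") and "text".
    """
    if not history:
        return {"action": "ask",
                "text": "Which city are you traveling to, and what is "
                        "your approximate budget?"}
    return {"action": "summarize",
            "text": f"Goal: {task} (details: {'; '.join(history)})"}

def clarify_before_execution(task, get_user_reply, max_turns=5):
    """Interact until the task is clear, then return a refined goal
    for the downstream agent (e.g. an XAgent-style executor)."""
    history = []
    for _ in range(max_turns):
        step = query_model(task, history)
        if step["action"] == "summarize":
            return step["text"]  # refined, actionable goal
        history.append(get_user_reply(step["text"]))
    return task  # fall back to the raw instruction after max_turns

goal = clarify_before_execution(
    "Book a flight for me",
    get_user_reply=lambda question: "To Tokyo, under $800",
)
print(goal)
```

Placing this loop upstream of the executor is the key design choice: the downstream agent never sees the raw vague instruction, only the refined goal, which is what reduces redundant tool calls during execution.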

Empirical Validation and Implications

Extensive experiments validate the effectiveness of incorporating Mistral-Interact into the XAgent framework. Compared to baseline agents, the enhanced agent system demonstrates superior performance in identifying and clarifying vague tasks, reducing unnecessary or overly general task components, and streamlining tool usage during task execution. These improvements underline the practical benefits of integrating specialized interaction expertise into agent systems, suggesting a promising direction for future developments in agent design.

Future Directions

The paper offers a novel approach to enhancing user-agent interaction through explicit intention understanding, represented by the development and implementation of Mistral-Interact. Despite this progress, further research is necessary to explore the integration of user interactions during agent task execution, expand and refine metrics for assessing interaction quality, and consider the usage of LLMs for simulating realistic user-agent dialogues. These areas hold potential for significantly advancing the field of LLM-driven agent systems, contributing to more personalized, efficient, and user-aligned task execution.

Conclusion

The paper presents a comprehensive approach towards closing the user-agent interaction gap in task execution systems, highlighted by the introduction of the IN3 benchmark and the integration of Mistral-Interact into the agent design. The empirical success of these contributions signifies a critical step towards realizing fully functional, user-centric agent systems capable of navigating the complexities of real-world tasks and user intentions. This work lays a foundation for future explorations into enhancing the interaction between users and AI agents, ultimately contributing to the development of more intuitive, effective, and efficient AI-driven solutions.
