
AutoGuide: Automated Generation and Selection of Context-Aware Guidelines for Large Language Model Agents (2403.08978v2)

Published 13 Mar 2024 in cs.CL and cs.LG

Abstract: Recent advances in LLMs have empowered AI agents capable of performing various sequential decision-making tasks. However, effectively guiding LLMs to perform well in unfamiliar domains like web navigation, where they lack sufficient knowledge, has proven to be difficult with the demonstration-based in-context learning paradigm. In this paper, we introduce a novel framework, called AutoGuide, which addresses this limitation by automatically generating context-aware guidelines from offline experiences. Importantly, each context-aware guideline is expressed in concise natural language and follows a conditional structure, clearly describing the context where it is applicable. As a result, our guidelines facilitate the provision of relevant knowledge for the agent's current decision-making process, overcoming the limitations of the conventional demonstration-based learning paradigm. Our evaluation demonstrates that AutoGuide significantly outperforms competitive baselines in complex benchmark domains, including real-world web navigation.
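The core mechanism described above — guidelines expressed as "when context X holds, do Y," distilled from offline experience and matched against the agent's current situation — can be sketched in a few lines. This is a minimal illustration only, not the paper's implementation: in AutoGuide both the condition and the context match are handled by an LLM in natural language, whereas here a keyword-overlap check stands in for that step, and all guideline contents below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Guideline:
    """A context-aware guideline: a condition describing when it applies,
    plus the advice itself. The paper keeps both in natural language; the
    condition is reduced to a keyword set here purely for illustration."""
    condition_keywords: set
    advice: str

# Hypothetical guidelines, as might be distilled from offline trajectories.
GUIDELINES = [
    Guideline({"search", "results"},
              "Click the most relevant result before refining the query."),
    Guideline({"login", "form"},
              "Fill every required field before submitting the form."),
    Guideline({"cart", "checkout"},
              "Verify item quantity and price before confirming checkout."),
]

def select_guidelines(observation, guidelines):
    """Return the advice of every guideline whose condition matches the
    current observation (keyword overlap stands in for the LLM check)."""
    tokens = set(observation.lower().split())
    return [g.advice for g in guidelines if g.condition_keywords <= tokens]

def build_prompt(task, observation):
    """Prepend only the applicable guidelines to the agent's prompt,
    so the LLM sees knowledge relevant to its current decision."""
    tips = select_guidelines(observation, GUIDELINES)
    tip_block = "\n".join(f"- {t}" for t in tips) or "- (no applicable guideline)"
    return f"Task: {task}\nGuidelines:\n{tip_block}\nObservation: {observation}\nAction:"

prompt = build_prompt("buy a phone case",
                      "the search results page shows 20 items")
```

The design point this sketch captures is the conditional structure: instead of prepending whole demonstrations, the agent receives only the guidelines whose stated context matches its current observation, keeping the prompt short and the knowledge relevant.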

