
Exploring Autonomous Agents through the Lens of Large Language Models: A Review (2404.04442v1)

Published 5 Apr 2024 in cs.AI

Abstract: Large language models (LLMs) are transforming artificial intelligence, enabling autonomous agents to perform diverse tasks across various domains. These agents, proficient in human-like text comprehension and generation, have the potential to revolutionize sectors from customer service to healthcare. However, they face challenges such as multimodality, human value alignment, hallucinations, and evaluation. Techniques like prompting, reasoning, tool utilization, and in-context learning are being explored to enhance their capabilities. Evaluation platforms like AgentBench, WebArena, and ToolLLM provide robust methods for assessing these agents in complex scenarios. These advancements are leading to the development of more resilient and capable autonomous agents, anticipated to become integral in our digital lives, assisting in tasks from email responses to disease diagnosis. The future of AI, with LLMs at the forefront, is promising.

Authors (1)
  1. Saikat Barua (5 papers)
Citations (6)