Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies (2402.03628v1)

Published 6 Feb 2024 in cs.CL

Abstract: The advent of LLMs such as ChatGPT, PaLM, and GPT-4 has catalyzed remarkable advances in natural language processing, demonstrating human-like language fluency and reasoning capacities. This position paper introduces the concept of Professional Agents (PAgents), an application framework harnessing LLM capabilities to create autonomous agents with controllable, specialized, interactive, and professional-level competencies. We posit that PAgents can reshape professional services through continuously developed expertise. Our proposed PAgents framework entails a tri-layered architecture for genesis, evolution, and synergy: a base tool layer, a middle agent layer, and a top synergy layer. This paper aims to spur discourse on promising real-world applications of LLMs. We argue the increasing sophistication and integration of PAgents could lead to AI systems exhibiting professional mastery over complex domains, serving critical needs, and potentially achieving artificial general intelligence.
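The tri-layered architecture named in the abstract (a base tool layer, a middle agent layer, and a top synergy layer) can be pictured as a minimal sketch. This is an illustrative interpretation only: the class and method names (`ToolLayer`, `ProfessionalAgent`, `SynergyLayer`, `work`, `collaborate`) are hypothetical and not taken from the paper, and the expertise counter is a crude stand-in for the paper's notion of continuously developed expertise.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolLayer:
    """Base layer: a registry of callable tools agents can invoke."""
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        self.tools[name] = fn

    def call(self, name: str, query: str) -> str:
        return self.tools[name](query)


@dataclass
class ProfessionalAgent:
    """Middle layer: a domain-specialized agent (genesis and evolution)."""
    domain: str
    expertise: float = 0.0  # stand-in for continually accumulated skill

    def work(self, tools: ToolLayer, task: str) -> str:
        result = tools.call("search", task)
        self.expertise += 1.0  # each completed task nudges expertise upward
        return f"[{self.domain}] {result}"


@dataclass
class SynergyLayer:
    """Top layer: coordinates multiple PAgents on a shared task."""
    agents: List[ProfessionalAgent]

    def collaborate(self, tools: ToolLayer, task: str) -> List[str]:
        # Fan the task out to every specialist and collect their outputs.
        return [agent.work(tools, task) for agent in self.agents]


tools = ToolLayer()
tools.register("search", lambda q: f"findings for '{q}'")
team = SynergyLayer([ProfessionalAgent("law"), ProfessionalAgent("medicine")])
outputs = team.collaborate(tools, "review contract liability")
```

In this reading, the synergy layer owns coordination while individual agents own domain skill, which mirrors the paper's separation of genesis/evolution (per-agent) from synergy (cross-agent).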
