From Language Models to Practical Self-Improving Computer Agents (2404.11964v1)
Abstract: We develop a simple methodology to create AI computer agents that can carry out diverse computer tasks and self-improve by developing tools and augmentations that enable them to solve increasingly complex tasks. As LLMs have been shown to benefit from non-parametric augmentations, a significant body of recent work has focused on developing software that augments LLMs with various capabilities. Rather than manually developing static augmentation software through human engineering effort, we propose that an LLM agent can systematically generate software to augment itself. We show, through a few case studies, that a minimal querying loop with appropriate prompt engineering allows an LLM to generate and use various augmentations, freely extending its own capabilities to carry out real-world computer tasks. Starting with only terminal access, we prompt an LLM agent to augment itself with retrieval, internet search, web navigation, and text editor capabilities. The agent effectively uses these tools to solve problems including automated software development and web-based tasks.
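The abstract describes a "minimal querying loop" in which the agent, given only terminal access, proposes shell commands and builds its own tooling from their output. The sketch below illustrates what such a loop could look like, assuming an OpenAI-style chat-completion client; the prompt wording, model name, stopping convention, and the `query_llm`/`run_agent` helpers are illustrative assumptions, not the paper's actual implementation.

```python
import subprocess

from openai import OpenAI  # any chat-completion client would work here

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are an agent with access to a Unix terminal. Reply with exactly "
    "one shell command to run next, or 'DONE: <result>' when the task is "
    "complete. You may write scripts to disk to build yourself new tools "
    "(e.g. retrieval, web search) and invoke them in later steps."
)


def query_llm(messages):
    """One chat-completion call; the model choice is an assumption."""
    resp = client.chat.completions.create(model="gpt-4", messages=messages)
    return resp.choices[0].message.content


def run_agent(task: str, max_steps: int = 20):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = query_llm(messages)
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("DONE:"):
            return reply[len("DONE:"):].strip()
        # Execute the proposed command and feed its output back, so the
        # agent can iterate on (and reuse) the tools it writes to disk.
        try:
            result = subprocess.run(
                reply, shell=True, capture_output=True, text=True, timeout=120
            )
            output = (result.stdout + result.stderr) or "(no output)"
        except subprocess.TimeoutExpired:
            output = "(command timed out)"
        messages.append({"role": "user", "content": output})
    return None
```

In a loop like this, the shell is the agent's only action space, so any new capability (a retrieval index, a search script, an editor wrapper) must be created as files and programs the agent can invoke in later steps, which mirrors the self-augmentation behavior the abstract describes.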
Author: Alex Sheng