
AIOS: LLM Agent Operating System (2403.16971v3)

Published 25 Mar 2024 in cs.OS, cs.AI, and cs.CL

Abstract: LLM-based intelligent agents face significant deployment challenges, particularly related to resource management. Allowing unrestricted access to LLM or tool resources can lead to inefficient or even potentially harmful resource allocation and utilization for agents. Furthermore, the absence of proper scheduling and resource management mechanisms in current agent designs hinders concurrent processing and limits overall system efficiency. As the diversity and complexity of agents continue to grow, addressing these resource management issues becomes increasingly critical to LLM-based agent systems. To address these challenges, this paper proposes the architecture of AIOS (LLM-based AI Agent Operating System) under the context of managing LLM-based agents. It introduces a novel architecture for serving LLM-based agents by isolating resources and LLM-specific services from agent applications into an AIOS kernel. This AIOS kernel provides fundamental services (e.g., scheduling, context management, memory management, storage management, access control) and efficient management of resources (e.g., LLM and external tools) for runtime agents. To enhance usability, AIOS also includes an AIOS-Agent SDK, a comprehensive suite of APIs designed for utilizing functionalities provided by the AIOS kernel. Experimental results demonstrate that using AIOS can achieve up to 2.1x faster execution for serving agents built by various agent frameworks. The source code is available at https://github.com/agiresearch/AIOS.

Integrating LLMs into Operating Systems with AIOS

Overview of AIOS

The deployment and scaling of LLM-based intelligent agents within existing operating system (OS) frameworks present significant challenges, including inefficient scheduling, complex integration of heterogeneous agents, and sub-optimal resource allocation. The "LLM Agent Operating System" (AIOS) paper presents a novel approach to embedding LLMs into operating systems to address these issues. AIOS optimizes resource allocation, enables concurrent execution of agents, facilitates context switching, and provides essential tool services for agents, thereby improving both the performance and the efficiency of LLM agents.

AIOS Architecture

AIOS is structured into three distinct layers: the application layer, the kernel layer, and the hardware layer, each serving a specific function in the overall system. The application layer hosts the agent applications, which are developed against the AIOS SDK. The kernel layer, consisting of the OS kernel and the LLM kernel, orchestrates the scheduling, context management, memory management, tool management, and access control functions specific to LLM operations. The hardware layer provides the fundamental computing resources; it is accessed only indirectly, through system calls, to preserve security and abstraction.
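
To make the layering concrete, here is a minimal sketch of the separation of concerns; all class and method names are hypothetical illustrations, not the actual AIOS API:

```python
# Hypothetical sketch of the three-layer separation; class and method
# names are illustrative, not the actual AIOS API.

class LLMKernel:
    """Kernel layer: mediates every agent request for LLM resources."""

    def syscall(self, name, payload):
        # Dispatch to the kernel module responsible for this call.
        handlers = {
            "llm_generate": self._handle_generate,
            "mem_read": self._handle_mem_read,
        }
        if name not in handlers:
            raise ValueError(f"unknown LLM system call: {name}")
        return handlers[name](payload)

    def _handle_generate(self, payload):
        # A real kernel would enqueue this with the agent scheduler and
        # run inference on the hardware layer; here we return a stub.
        return {"text": f"<completion for: {payload['prompt']}>"}

    def _handle_mem_read(self, payload):
        return {"value": None}  # placeholder short-term memory lookup


class AgentApplication:
    """Application layer: reaches hardware only via kernel system calls."""

    def __init__(self, kernel):
        self.kernel = kernel

    def run(self, task):
        reply = self.kernel.syscall("llm_generate", {"prompt": task})
        return reply["text"]


agent = AgentApplication(LLMKernel())
print(agent.run("Plan a trip to Paris"))
```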

Core Modules and Functionalities

The heart of AIOS is its LLM Kernel, which comprises several crucial modules:

  • Agent Scheduler: Implements scheduling algorithms (e.g., FIFO) to optimize LLM utilization and balance processing across agent requests; a minimal sketch combining this module with the Context Manager follows the list.
  • Context Manager: Snapshots intermediate generation state and manages the context window, so that paused responses can later be resumed.
  • Memory and Storage Managers: Provide short-term and long-term data management solutions for handling interaction logs and agent data.
  • Tool Manager: Manages a suite of external API tools that agents can call for performing specific tasks.
  • Access Manager: Enforces privacy policies and access control measures to maintain data integrity and confidentiality within the multi-agent system.
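
As promised above, here is a toy illustration of how the Agent Scheduler and Context Manager might interact: requests are served in FIFO order with fixed time slices, and a paused generation is snapshotted so it can resume. All names are hypothetical, and the real AIOS implementation differs in detail:

```python
# Toy FIFO scheduler with context snapshotting; a sketch of the ideas
# behind the Agent Scheduler and Context Manager, not the actual AIOS
# implementation. All names are hypothetical.
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentRequest:
    agent_id: str
    prompt: str
    generated: list = field(default_factory=list)  # tokens produced so far


class ContextManager:
    """Saves and restores intermediate generation state."""

    def __init__(self):
        self._snapshots = {}

    def snapshot(self, req):
        # Persist partial output so a suspended request can resume later.
        self._snapshots[req.agent_id] = list(req.generated)

    def restore(self, req):
        req.generated = self._snapshots.pop(req.agent_id, req.generated)


class FIFOScheduler:
    """Serves requests in arrival order, one fixed time slice at a time."""

    def __init__(self, ctx, time_slice=4, max_tokens=10):
        self.queue = deque()
        self.ctx = ctx
        self.time_slice = time_slice  # tokens generated per turn
        self.max_tokens = max_tokens  # pretend completion length

    def submit(self, req):
        self.queue.append(req)

    def step(self):
        req = self.queue.popleft()
        self.ctx.restore(req)
        for _ in range(self.time_slice):  # stand-in for LLM decoding
            req.generated.append(f"tok{len(req.generated)}")
        if len(req.generated) < self.max_tokens:
            self.ctx.snapshot(req)   # suspend: snapshot and requeue
            self.queue.append(req)


sched = FIFOScheduler(ContextManager())
sched.submit(AgentRequest("travel-agent", "Find flights"))
sched.submit(AgentRequest("math-agent", "Solve 2x + 3 = 7"))
while sched.queue:
    sched.step()
```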

LLM System Calls and AIOS SDK

AIOS introduces LLM system calls, intermediary functions that mediate between agent requests and the kernel modules that execute them. To simplify development within AIOS, an SDK is provided that encapsulates these system calls and offers a higher level of abstraction to agent developers. This SDK streamlines the creation, deployment, and management of LLM-based agents.
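
As a rough illustration of that relationship (the real AIOS-Agent SDK surface may differ, and every name here is invented), an SDK method can simply wrap the corresponding LLM system call:

```python
# Hypothetical SDK wrapper over LLM system calls; the real AIOS-Agent
# SDK surface may differ. StubKernel stands in for the kernel sketched
# earlier so this example runs on its own.

class StubKernel:
    def syscall(self, name, payload):
        if name == "llm_generate":
            return {"text": f"<completion for: {payload['prompt']}>"}
        if name == "mem_write":
            return {"ok": True}
        raise ValueError(f"unknown LLM system call: {name}")


class AgentSDK:
    """Developer-facing helpers that hide the raw system-call interface."""

    def __init__(self, kernel):
        self._kernel = kernel

    def chat(self, prompt):
        return self._kernel.syscall("llm_generate", {"prompt": prompt})["text"]

    def remember(self, key, value):
        self._kernel.syscall("mem_write", {"key": key, "value": value})


sdk = AgentSDK(StubKernel())
print(sdk.chat("Summarize today's tasks"))
sdk.remember("last_task", "summarize")
```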

Evaluation and Results

The paper's evaluation of AIOS focuses on two questions: whether agent outputs remain consistent after a temporary suspension, and how well the scheduling mechanism performs. Consistency is measured with BLEU and BERTScore; scheduling performance is measured by waiting time and turnaround time. The results show that AIOS maintains output consistency across multi-agent operations and that its scheduling algorithm is effective at optimizing resource utilization and reducing processing delays.
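
To make the metrics concrete: waiting time is the gap between a request's submission and the start of its service, and turnaround time is the gap between submission and completion; the consistency check compares outputs generated with and without suspension. The sketch below uses invented numbers, not results from the paper (BLEU via nltk; BERTScore would be applied analogously):

```python
# Invented numbers for illustration only, not results from the paper.
# BLEU here uses nltk (pip install nltk); BERTScore would be analogous.
from nltk.translate.bleu_score import sentence_bleu

# Consistency: compare an uninterrupted run against a suspend/resume run.
uninterrupted = "the agent books a flight to paris on friday".split()
resumed = "the agent books a flight to paris on friday".split()
print("BLEU:", sentence_bleu([uninterrupted], resumed))  # 1.0 = identical

# Scheduling metrics from (arrival, start, completion) timestamps in seconds.
requests = [
    ("agent-A", 0.0, 0.0, 3.2),
    ("agent-B", 0.5, 3.2, 5.1),
    ("agent-C", 1.0, 5.1, 8.4),
]
for name, arrival, start, completion in requests:
    waiting = start - arrival          # queued before service began
    turnaround = completion - arrival  # submission to completion
    print(f"{name}: waiting={waiting:.1f}s, turnaround={turnaround:.1f}s")
```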

Implications and Future Directions

The introduction of AIOS pioneers an advanced platform for the integration and efficient management of LLM-based agents within OS frameworks. Beyond immediate performance improvements, AIOS opens pathways for further research, including advanced scheduling algorithms, enhancements in memory and storage architectures, and robust safety and privacy enhancements. These future directions promise to elevate the capabilities of AIOS, driving forward the development and widespread application of intelligent agents across various domains.

AIOS not only addresses existing challenges in deploying LLM agents but also sets a precedent for future research and development in the convergence of artificial intelligence and operating system design. Through its holistic architecture and modular design, AIOS facilitates the scalable, secure, and efficient deployment of LLM agents, marking a significant stride towards realizing the full potential of LLM integration within computing environments.

Authors (10)
  1. Kai Mei
  2. Zelong Li
  3. Shuyuan Xu
  4. Ruosong Ye
  5. Yingqiang Ge
  6. Yongfeng Zhang
  7. Xi Zhu
  8. Wujiang Xu
  9. Wenyue Hua
  10. Mingyu Jin