
Towards Agentic OS: An LLM Agent Framework for Linux Schedulers (2509.01245v2)

Published 1 Sep 2025 in cs.AI, cs.MA, and cs.OS

Abstract: Operating system schedulers suffer from a fundamental semantic gap, where kernel policies fail to understand application-specific needs, leading to suboptimal performance. We introduce SchedCP, the first framework that enables fully autonomous LLM agents to safely and efficiently optimize Linux schedulers without human involvement. Our core insight is that the challenge is not merely to apply a better LLM, but to architect a decoupled control plane that separates the AI's role of semantic reasoning ("what to optimize") from the system's role of execution ("how to observe and act"). Implemented as a Model Context Protocol (MCP) server, SchedCP provides a stable interface with three key services: a Workload Analysis Engine, an evolving Scheduler Policy Repository, and an Execution Verifier that validates all AI-generated code and configurations before deployment with static and dynamic analysis. We demonstrate this architecture's power with sched-agent, a multi-agent system that autonomously analyzes workloads, synthesizes custom eBPF scheduling policies, and deploys them via the sched_ext infrastructure. Our evaluation shows that SchedCP achieves up to a 1.79x performance improvement and a 13x cost reduction compared to naive agentic approaches, all while maintaining a high success rate. By bridging the semantic gap, SchedCP democratizes expert-level system optimization and represents a step towards creating truly self-optimizing, application-aware operating systems. The code is open-sourced at https://github.com/eunomia-bpf/schedcp

Summary

  • The paper introduces SchedCP, an LLM agent framework that decouples semantic reasoning from execution to optimize Linux schedulers via eBPF.
  • Experimental results demonstrate up to a 1.79x kernel compilation speedup, a 2.11x P99 latency improvement on schbench, and significant throughput gains.
  • The modular, multi-agent design with robust multi-stage verification ensures safe, efficient, and scalable scheduler policy synthesis.

Agentic OS Optimization: Autonomous LLM Agents for Linux Scheduler Synthesis

Introduction and Motivation

The paper presents SchedCP, a modular control plane framework enabling fully autonomous LLM agents to optimize Linux schedulers via eBPF, addressing the persistent semantic gap between kernel policies and application-specific requirements. Traditional scheduler policies, such as EEVDF, are agnostic to workload semantics, resulting in suboptimal performance, especially in heterogeneous and dynamic environments. Prior RL-based approaches are limited by their inability to generalize across workloads and their lack of semantic reasoning, while naive LLM agentic methods are prohibitively slow, expensive, and unsafe for kernel-level automation.

SchedCP is designed to decouple semantic reasoning ("what to optimize") from system execution ("how to observe and act"), providing a stable, future-proof interface for AI-driven OS optimization. The framework is complemented by sched-agent, a multi-agent system that leverages in-context RL and LLM-based reasoning to analyze workloads, synthesize custom scheduling policies, and deploy them safely via sched_ext.

SchedCP Framework Architecture

SchedCP is implemented as a Model Context Protocol (MCP) server, exposing three core services:

  • Workload Analysis Engine: Provides tiered access to system performance data, supporting adaptive context provisioning to balance cost and precision. Agents can query for high-level summaries, detailed profiling via eBPF probes, and post-deployment feedback metrics.
  • Scheduler Policy Repository: A vector database storing eBPF scheduler code, metadata, and historical performance metrics. It supports semantic search, code retrieval, and composable policy construction, facilitating reuse and continuous improvement.
  • Execution Verifier: A multi-stage validation pipeline combining the kernel's eBPF verifier (for memory safety and termination) with custom static analysis (for domain-specific invariants like fairness and liveness) and dynamic validation in a secure micro-VM. Canary deployment and circuit breaker mechanisms ensure system stability and rollback on performance degradation.

This architecture enforces strict separation of concerns, preventing agents from requiring root access and minimizing the risk of catastrophic failures. SchedCP is implemented in Rust and Python, emphasizing modularity and extensibility.
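The multi-stage verification flow described above can be read as a fail-fast pipeline: each stage runs only if the previous one passed, and the canary stage rolls back on regression. The Python sketch below is illustrative only — the function names (`static_checks`, `sandbox_run`, `canary_deploy`), the liveness heuristic, and the 10% latency budget are assumptions, not the paper's implementation (SchedCP delegates memory-safety and termination checks to the kernel's eBPF verifier).

```python
from dataclasses import dataclass

@dataclass
class VerifyResult:
    stage: str
    ok: bool
    detail: str = ""

def static_checks(src: str) -> VerifyResult:
    # Hypothetical domain-specific invariant: the policy must contain a
    # dispatch path so every runnable task can eventually be scheduled.
    if "dispatch" not in src:
        return VerifyResult("static", False, "no dispatch path found")
    return VerifyResult("static", True)

def sandbox_run(src: str) -> VerifyResult:
    # Stand-in for compiling and exercising the policy inside a secure
    # micro-VM; a real pipeline would boot the scheduler under load here.
    return VerifyResult("dynamic", True)

def canary_deploy(src: str, baseline_p99_us: float, observed_p99_us: float,
                  budget: float = 1.10) -> VerifyResult:
    # Circuit breaker: roll back if tail latency regresses past the budget.
    if observed_p99_us > baseline_p99_us * budget:
        return VerifyResult("canary", False, "p99 regression, rolled back")
    return VerifyResult("canary", True)

def verify(src: str, baseline_p99_us: float,
           observed_p99_us: float) -> list[VerifyResult]:
    results = []
    for step in (static_checks, sandbox_run):
        r = step(src)
        results.append(r)
        if not r.ok:
            return results  # fail fast: later stages never run
    results.append(canary_deploy(src, baseline_p99_us, observed_p99_us))
    return results
```

The fail-fast ordering matters for cost as well as safety: cheap static checks filter out broken candidates before any micro-VM or canary resources are spent.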

sched-agent: Multi-Agent LLM System

sched-agent operationalizes autonomous scheduler optimization through four specialized agents:

  • Observation Agent: Strategically queries the Workload Analysis Engine to construct a comprehensive workload profile, synthesizing natural language descriptions, quantified metrics, and explicit optimization goals. It supports real-time adaptation to changing workload patterns.
  • Planning Agent: Translates workload profiles into optimization strategies, leveraging semantic queries to the Scheduler Policy Repository. It hierarchically selects, configures, patches, or synthesizes new scheduler policies based on historical performance and code primitives.
  • Execution Agent: Manages code synthesis, validation, and deployment. It interacts with the Execution Verifier, refines code based on feedback, and orchestrates canary rollouts with automatic fallback.
  • Learning Agent: Completes the in-context RL loop by analyzing deployment outcomes, updating the repository with refined metrics, annotating successful policies, and documenting antipatterns for future avoidance.

The agents collaborate in a closed loop, enabling iterative refinement and continuous learning without requiring model retraining or repeated workload execution.
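The closed loop above can be sketched in a few lines. In this sketch the four stub functions and the `score`/`speedup` bookkeeping are hypothetical placeholders for the agents' real behavior; they are kept minimal only to show how the Learning Agent writes measured outcomes back into the repository, so the Planning Agent's next selection improves without model retraining.

```python
def observe(workload):
    # Observation Agent: build a workload profile (stubbed).
    return {"name": workload, "goal": "minimize p99 latency"}

def plan(profile, repository):
    # Planning Agent: pick the best-scoring known policy, else synthesize.
    ranked = sorted(repository, key=lambda p: -p["score"])
    return ranked[0] if ranked else {"name": "new_policy", "score": 0.0}

def execute(policy):
    # Execution Agent: verify, canary-deploy, and measure (stubbed speedup).
    return {"policy": policy["name"], "speedup": 1.0 + policy["score"]}

def learn(repository, policy, outcome):
    # Learning Agent: record the measured result for future planning.
    policy["score"] = outcome["speedup"] - 1.0
    if policy not in repository:
        repository.append(policy)

def optimization_loop(workload, repository, max_iters=3):
    # LLM agents run only between executions, never in the scheduling
    # hot path; each iteration is one in-context RL cycle.
    best = None
    for _ in range(max_iters):
        profile = observe(workload)
        policy = plan(profile, repository)
        outcome = execute(policy)
        learn(repository, policy, outcome)
        if best is None or outcome["speedup"] > best["speedup"]:
            best = outcome
    return best
```

Note that the repository is the only shared state between iterations: learning accumulates there rather than in model weights, which is what makes the loop work without retraining.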

Experimental Evaluation

SchedCP and sched-agent were evaluated on two hardware platforms (86-core Xeon and 8-core Core Ultra) using kernel compilation, schbench, and diverse batch workloads. Key results include:

  • Kernel Compilation: Sched-agent achieved a 1.79x speedup over EEVDF after three optimization iterations, with initial selection (scx_rusty) providing 1.63x improvement. RL-based baselines showed no improvement due to lack of generalization.
  • Schbench: Iterative agentic optimization improved P99 latency by 2.11x and throughput by 1.60x compared to EEVDF, demonstrating effective learning from performance feedback.
  • Batch Workloads: For eight heterogeneous tasks, the agent synthesized a Longest Job First (LJF) scheduler, reducing end-to-end processing time by 20% on average. Generation cost per workload dropped from $6 to $0.15, and time from 33 to 2.5 minutes (13x reduction).

Notably, the framework maintained high system stability, with all agent-generated schedulers passing multi-stage verification and canary deployment. The cost and efficiency improvements are substantial, making custom scheduler synthesis viable for short-lived workloads.

Implementation and Deployment Considerations

SchedCP's decoupled architecture ensures compatibility with future AI agents and kernel evolutions. The use of eBPF and sched_ext enables dynamic scheduler loading with zero LLM inference overhead in the scheduling hot path. The multi-stage verification pipeline is critical for safe deployment, addressing both general and domain-specific correctness properties.

Integration with container orchestrators (Kubernetes, Docker) allows automatic triggering of optimization cycles for new applications, supporting both cloud and edge scenarios. The composable tool architecture aligns with the Unix philosophy, enabling agents to construct novel workflows and solutions.

Resource requirements are moderate, with the framework tested on both high-end and commodity hardware. The use of adaptive context provisioning and semantic search minimizes token and API costs, supporting scalable deployment.
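Adaptive context provisioning, as described above, amounts to a cheapest-tier-first escalation strategy: the agent pays for detailed eBPF profiling only when a cheap summary is insufficient. A minimal sketch, with hypothetical tier names and token costs chosen purely for illustration:

```python
# Tiers ordered cheapest-first; each entry is (name, token_cost, fetch_fn).
def tiered_context(tiers, enough):
    """Return the cheapest tier's data that satisfies the agent's needs,
    plus the cumulative token cost paid along the way. If no tier
    suffices, the last (richest) tier's data is returned."""
    spent = 0
    data = None
    for name, cost, fetch in tiers:
        spent += cost
        data = fetch()
        if enough(data):
            break
    return data, spent

tiers = [
    ("summary", 50, lambda: {"cpu_util": 0.92}),
    ("ebpf_profile", 800, lambda: {"cpu_util": 0.92,
                                   "runqueue_p99_us": 1200}),
]

# Escalate only when the summary lacks the latency detail the agent needs.
data, spent = tiered_context(tiers, enough=lambda d: "runqueue_p99_us" in d)
```

With the illustrative costs above, an agent that only needs CPU utilization pays 50 tokens, while one that needs tail-latency detail pays 850 — the escalation is decided per query, which is how token and API costs stay bounded.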

Implications and Future Directions

SchedCP demonstrates that autonomous LLM agents can safely and efficiently optimize kernel schedulers, bridging the semantic gap between application needs and system policies. The framework democratizes expert-level system optimization, making it accessible for both cloud and personal device users.

Future work includes extending the framework to other OS components (cache policies, DVFS, network configuration, sysctl), enabling cross-component optimization and unified control. Expressing inter-component dependencies and supporting adaptive, application-aware operating systems are promising directions. The approach also suggests new abstractions for safe, agentic systems integration in critical infrastructure.

Conclusion

SchedCP establishes a robust foundation for agentic OS optimization, enabling autonomous LLM agents to synthesize, validate, and deploy custom Linux schedulers via a decoupled control plane. The framework achieves significant performance and cost improvements while ensuring system stability, marking a substantive advance towards self-optimizing, application-aware operating systems. The open-source release facilitates further research and practical adoption in diverse computing environments.
