- The paper presents a multi-agent framework that integrates level-specific LLM agents with compiler passes to achieve nontrivial runtime speedups.
- It employs dynamic test generation and a supervisory agent to check functional correctness while delivering speedups of up to 1.25×.
- The framework unifies semantic reasoning of LLMs with classical optimizations, offering a scalable path toward AI-guided performance engineering.
Agentic Compiler-LLM Cooperation for Code Optimization
Introduction
"Agentic Code Optimization via Compiler-LLM Cooperation" (2604.04238) addresses persistent gaps in the code optimization pipeline by integrating LLM agents with traditional compiler technology. While standard compilers execute deterministic optimization passes at decreasing levels of abstraction, they can fail to capture global program semantics that inform high-value performance transformations. Standalone LLM-based systems, by contrast, can express high-level structural and algorithmic code changes but frequently introduce correctness regressions. This paper formalizes and implements a distributed multi-agent framework that leverages the strengths of both LLMs and classical compilers, aiming to preserve correctness while maximizing optimization opportunities through agent cooperation across abstraction layers.
Methodology
The proposed framework introduces a multi-agent optimization pipeline, wherein LLM agents and compiler passes cooperate at different levels of program representation (source, intermediate representation, and low-level code). Key system components include:
- Level-specific LLM optimization agents: Each agent targets a distinct abstraction (e.g., source-level refactoring, IR-level rewrites) and proposes optimization candidates.
- Compiler passes as tools: Traditional optimization pipelines (e.g., GCC, LLVM passes) are programmatically invoked as callable tools.
- LLM-based test generation agent: Dynamically synthesizes test cases that check the functional correctness and measure the performance of candidate optimizations.
- Supervisory agent: A higher-order LLM agent orchestrates budget allocation, controls the invocation order of subagents, adjudicates between optimization strategies, and integrates test feedback.
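The interplay of these components can be illustrated with a toy sketch. It is not the paper's implementation: every name here (`Program`, `Agent`, `Tool`, `make_tests`, `supervise`) is hypothetical, programs are reduced to Python expressions in `x`, the "LLM agent" is a fixed list of proposals, the "compiler pass" is a string rewrite, and string length stands in for runtime cost.

```python
from typing import Callable, List, Sequence

# All names are hypothetical; the paper's actual interfaces are not given here.
Program = str                                    # toy program: a Python expression in x
Agent = Callable[[Program], Sequence[Program]]   # LLM agent stand-in: proposes rewrites
Tool = Callable[[Program], Program]              # compiler pass stand-in: deterministic rewrite
Test = Callable[[Program], bool]                 # True when the candidate behaves correctly


def run(program: Program, x: int) -> int:
    """Evaluate a toy program on one input, with builtins disabled."""
    return eval(program, {"__builtins__": {}}, {"x": x})


def make_tests(reference: Program, inputs: Sequence[int]) -> List[Test]:
    """Test-generation stand-in: compare a candidate with the reference output."""
    def make_one(x: int) -> Test:
        expected = run(reference, x)

        def test(candidate: Program) -> bool:
            try:
                return run(candidate, x) == expected
            except Exception:
                return False                     # a crashing candidate counts as a failure
        return test
    return [make_one(x) for x in inputs]


def supervise(program: Program, agent: Agent, tool: Tool,
              tests: Sequence[Test], cost: Callable[[Program], float]) -> Program:
    """Supervisor stand-in: keep the cheapest candidate that passes every test."""
    best, best_cost = program, cost(program)
    for proposal in agent(program):
        candidate = tool(proposal)               # compiler pass invoked as a callable tool
        if not all(t(candidate) for t in tests):
            continue                             # reject anything that changes behaviour
        if cost(candidate) < best_cost:
            best, best_cost = candidate, cost(candidate)
    return best


original = "x * 2 + x * 2"
agent = lambda p: ["x * 4 + 0", "x * 3"]         # second proposal is deliberately wrong
tool = lambda p: p.replace(" + 0", "")           # toy peephole pass: drop "+ 0"
tests = make_tests(original, inputs=[0, 1, 7])
result = supervise(original, agent, tool, tests, cost=len)
print(result)  # x * 4
```

The incorrect proposal `x * 3` agrees with the original on input 0 but is rejected on input 1, which is why the sketch checks several inputs before accepting a candidate.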
A computational budget allocation scheme balances exploration across abstraction layers: the space of agent-invoked transformations grows combinatorially, while the compute available to explore it is limited.
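The allocation scheme itself is not detailed here, so the following is only one plausible shape it could take: a fixed number of agent invocations divided across abstraction levels in proportion to weights, using largest-remainder rounding so the shares sum exactly to the total. The function name `allocate_budget`, the level names, and the weights are all assumptions made for illustration.

```python
from typing import Dict


def allocate_budget(total: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Split `total` agent invocations across levels in proportion to `weights`.

    Hypothetical scheme (not taken from the paper): proportional shares with
    largest-remainder rounding, so the integer shares always sum to `total`.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and positive")

    weight_sum = sum(weights.values())
    raw = {level: total * w / weight_sum for level, w in weights.items()}
    shares = {level: int(value) for level, value in raw.items()}  # round down first

    # Hand the remaining invocations to the levels with the largest fractional parts.
    leftover = total - sum(shares.values())
    by_remainder = sorted(raw, key=lambda level: raw[level] - shares[level], reverse=True)
    for level in by_remainder[:leftover]:
        shares[level] += 1
    return shares


budget = allocate_budget(10, {"source": 3, "ir": 2, "low_level": 1})
print(budget)  # {'source': 5, 'ir': 3, 'low_level': 2}
```

A supervisory agent could adjust such weights between rounds based on test feedback; that adaptive step is left out of this sketch.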
Evaluation and Results
The system is benchmarked against state-of-the-art compiler-only optimization pipelines and against LLM-only approaches restricted to a single level of code abstraction. The cooperative framework outperforms both baselines, delivering speedups of up to 1.25× on a representative set of programming benchmarks. These gains are achieved without sacrificing functional correctness, which is checked through dynamic test generation and validation. In this way the framework combines the semantic reasoning capabilities of LLMs with the reliability of industrial compilers.
The empirical results support the central claim that LLM-compiler cooperation outperforms both the compiler-only and the LLM-only baselines. Single-level LLM agents often overfit to surface-level patterns, while agentic cooperation, guided by dynamic test feedback, yields optimizations with nontrivial improvements to runtime efficiency.
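To put the reported figure in context, the sketch below assumes the common definition of speedup as baseline runtime divided by optimized runtime; the exact measurement protocol is not stated here, and the helper names are hypothetical. Under that definition, a 1.25× speedup means the optimized program takes 80% of the baseline runtime, a reduction of 20%.

```python
import statistics
import time
from typing import Callable


def median_runtime(fn: Callable[[], object], repeats: int = 5) -> float:
    """Median wall-clock time of `fn` in seconds over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Speedup assumed here to be baseline runtime / optimized runtime."""
    if baseline_seconds <= 0 or optimized_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_seconds / optimized_seconds


def runtime_reduction(speedup_factor: float) -> float:
    """Fraction of baseline runtime saved at a given speedup factor."""
    if speedup_factor <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup_factor


# Worked arithmetic: 1.0 s baseline and 0.8 s optimized give 1.25x, a 20% reduction.
factor = speedup(1.0, 0.8)
saved = runtime_reduction(1.25)
print(f"{factor:.2f}x speedup, {saved:.0%} less runtime")

# Measuring two toy implementations of the same computation.
slow = lambda: sum([i * i for i in range(20000)])
fast = lambda: sum(i * i for i in range(20000))
measured = speedup(median_runtime(slow), median_runtime(fast))
```

The median is used rather than the mean so that a single noisy run does not dominate the estimate; the measured value for the toy pair varies by machine and may fall below 1.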
Implications and Prospects
Practically, this system offers a viable path for deploying LLMs in performance-critical production compilers by bounding the correctness risks that have previously impeded their adoption. The architecture is extensible to heterogeneous toolchains and can incorporate advances in test synthesis, fuzzing, and formal verification. Theoretically, agent orchestration at multiple abstraction levels suggests promising avenues for automated discovery of non-local program transformations.
This cooperative architecture could scale with advances in LLM reasoning capabilities, more sophisticated agent tool-use (Johnson et al., 16 Oct 2025), grammar-aligned decoding for syntactic and semantic control (Park et al., 2024), and RL-based optimization feedback (Gehring et al., 2024). Integration with continuous integration (CI) systems for automatic regression testing and with auto-tuning pipelines for hardware-specific optimizations is a logical next step. As LLMs improve in abstract program synthesis and verification, agentic frameworks could enable routine end-to-end automated performance engineering.
Conclusion
This work proposes agentic compiler-LLM cooperation as a paradigm for program optimization, combining the correctness of traditional compilers with the flexible program analysis capabilities of LLM agents operating across multiple code abstractions. By demonstrating nontrivial speedups while preserving program correctness on generated tests, the paper shows that agentic cooperation outperforms the compiler-only and single-level LLM baselines it was evaluated against. The results indicate significant potential for AI-guided compilers with distributed agent architectures, with implications for software performance engineering and automated system design.