AwareCompiler: Agentic Compiler Optimization
- AwareCompiler is an agentic compiler optimization framework that merges structured domain knowledge with large-scale empirical datasets to automate pass sequence synthesis while preserving correctness.
- It employs a hybrid training pipeline combining supervised fine-tuning and reinforcement learning with composite rewards to improve performance, efficiency, and validity.
- The framework effectively addresses semantic misalignment, interaction inefficiency, and reward sparsity, achieving significant code size reductions on industry benchmarks.
 
AwareCompiler is an agentic compiler optimization framework that synergistically combines structured compiler domain knowledge with large-scale empirical datasets to automate the synthesis of effective compiler optimization pass sequences while preserving correctness. The system’s design strategically addresses three key challenges in compiler optimization: semantic misalignment between program representations and pass interfaces, inefficiencies in agent–environment interactions due to large, sparse search spaces, and the reward sparsity that limits the effectiveness of sequential decision agents in this domain. Through its knowledge-driven adaptive pass generation and a hybrid supervised–reinforcement training pipeline, AwareCompiler achieves significant improvements over heuristic baselines and LLM-based software optimization agents in performance, validity, and efficiency of optimization sequences (Lin et al., 13 Oct 2025).
1. Framework Overview: Agentic, Knowledge- and Data-Driven Compiler Optimization
AwareCompiler’s architecture is characterized by the integration of a structured compiler knowledge base, a context-aware empirical dataset, and a context-sensitive optimization agent:
- Structured Knowledge Base: Comprises empirical (historical mappings from program features to pass sequences with positive optimization effects), symbolic (pass dependency and conflict graphs capturing formal relationships among transformation passes), and negative (sequences known to induce regressions) knowledge. This knowledge base encodes both successful and detrimental optimization strategies, facilitating robust agentic reasoning.
 - Context-Aware Dataset: Built from program feature representations (AutoPhase-extracted features), it pairs expert-annotated optimal pass sequences with measured optimization gains, serving as the foundation for both supervised pre-training and policy evaluation.
 - Adaptive Pass Generation Module: Employs feature extraction and knowledge-guided retrieval to synthesize pass sequences as a contextual sequential decision policy, enforced by dependency/conflict constraints for semantic validity.
 - Hybrid Training Pipeline: Utilizes a two-stage approach: supervised fine-tuning (SFT) to bootstrap precise reasoning/format, followed by reinforcement learning (RL) with a composite reward that encompasses formal correctness, output validation, and performance improvement (e.g., LLVM IR code size reduction).
 
This composite system formalizes pass sequence synthesis as a policy πθ mapping code features to optimization actions, leveraging knowledge retrieval fused with empirical performance data to achieve robust optimization.
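To make this formalization concrete, the following minimal Python sketch shows one way such a policy could be approximated: a program's feature vector is matched against empirical knowledge-base records by cosine similarity, and the best-performing historical pass sequence among the closest matches is proposed. The names (`KnowledgeRecord`, `propose_pass_sequence`), the feature values, and the retrieval heuristic are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass
from typing import List
import math

@dataclass
class KnowledgeRecord:
    """One empirical knowledge-base entry: features of a previously
    optimized program and the pass sequence that worked well for it."""
    features: List[float]        # AutoPhase-style static feature vector
    pass_sequence: List[str]     # e.g. ["-mem2reg", "-gvn", "-simplifycfg"]
    measured_gain: float         # observed IR instruction-count reduction (%)

def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def propose_pass_sequence(program_features: List[float],
                          knowledge_base: List[KnowledgeRecord],
                          top_k: int = 3) -> List[str]:
    """Toy stand-in for the policy: retrieve the k most similar historical
    programs and return the pass sequence of the one with the highest
    measured gain among them."""
    ranked = sorted(knowledge_base,
                    key=lambda r: cosine_similarity(program_features, r.features),
                    reverse=True)[:top_k]
    best = max(ranked, key=lambda r: r.measured_gain)
    return best.pass_sequence

# Illustrative usage with made-up features and records.
kb = [
    KnowledgeRecord([0.9, 0.1, 0.4], ["-mem2reg", "-instcombine"], 12.5),
    KnowledgeRecord([0.2, 0.8, 0.5], ["-gvn", "-simplifycfg"], 8.1),
]
print(propose_pass_sequence([0.85, 0.15, 0.35], kb))
```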
2. Addressed Challenges in Automated Compiler Optimization
AwareCompiler directly tackles several longstanding obstacles in agentic compiler optimization:
- Semantic Misalignment: Abstract program representations (e.g., high-level IR or feature vectors) are often not aligned with concrete optimization pass semantics. By encoding explicit symbolic relations (e.g., dependency: “pass A must precede pass B”; conflict: “pass C inhibits pass D”) and empirical pass mappings, the system minimizes the risk of synthesizing syntactically valid but semantically incorrect pass orderings.
 - Interaction Inefficiency: Traditional ML-based or brute-force approaches require extensive agent-environment interaction, suffering from inefficiency and invalid action generation. AwareCompiler leverages both explicit knowledge retrieval to filter non-viable candidates and a data-driven policy refinement process, increasing the density of valid optimization sequences in the candidate set.
 - Reward Sparsity: Given the combinatorial pass space and long optimization horizons, sparse or non-informative rewards slow or destabilize sequential agent learning. The use of a composite reward function—measuring output format correctness, pass execution validity, and quantitative code improvement (ΔIC, the change in LLVM IR instruction count)—addresses this issue and facilitates effective policy convergence.
 
3. Structured Knowledge Integration and Adaptive Pass Synthesis
The knowledge-driven adaptive pass generation is central to the system’s efficacy:
- Feature Extraction and Knowledge Fusion: Program x is mapped to a feature vector z = ℱ(x), where each zᵢ encodes a code property. Candidate pass sequences π retrieved from the knowledge base are ranked by a score that combines feature similarity to historical programs with the empirically measured relevance (optimization gain) of each candidate sequence.
- Dependency and Conflict Constraints: The search for the selected sequence π* is constrained so that every pass appears only after all passes it depends on, and no conflicting passes co-occur.
Thus, AwareCompiler’s generator can avoid pass-ordering regressions and invalid combinations, systematically selecting high-potential candidates.
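As a concrete illustration of how such constraints can be enforced, the sketch below checks a candidate sequence against dependency and conflict relations. The specific pass names and the relation sets are illustrative assumptions, not the paper's actual symbolic knowledge base.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical symbolic knowledge: DEPENDENCIES["-gvn"] = {"-mem2reg"} means
# -mem2reg must run before -gvn; a conflict pair must not co-occur at all.
DEPENDENCIES: Dict[str, Set[str]] = {
    "-gvn": {"-mem2reg"},              # illustrative dependency
}
CONFLICTS: Set[Tuple[str, str]] = {
    ("-loop-unroll", "-loop-rotate"),  # illustrative conflict
}

def is_valid_sequence(seq: List[str]) -> bool:
    """Return True if every dependency is ordered correctly and no
    conflicting pair of passes co-occurs in the sequence."""
    position = {p: i for i, p in enumerate(seq)}
    # Dependency check: each prerequisite must appear earlier in the sequence.
    for later, prereqs in DEPENDENCIES.items():
        if later in position:
            for pre in prereqs:
                if pre not in position or position[pre] > position[later]:
                    return False
    # Conflict check: no forbidden pair may co-occur.
    for a, b in CONFLICTS:
        if a in position and b in position:
            return False
    return True

print(is_valid_sequence(["-mem2reg", "-gvn"]))               # True
print(is_valid_sequence(["-gvn", "-mem2reg"]))               # False: dependency violated
print(is_valid_sequence(["-loop-unroll", "-loop-rotate"]))   # False: conflict
```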
4. Hybrid Training and Reward Design
Training proceeds via two phases:
- Supervised Fine-Tuning (SFT): The agent is exposed to expert-annotated (input, output reasoning, pass sequence, code performance) tuples, learning the correct output structure and inferential reasoning steps for pass synthesis.
- Reinforcement Learning (RL): With SFT providing initialization, RL fine-tuning optimizes a composite reward combining three components: a format term penalizing incorrectly structured output, an answer term scoring the validity of the compiled output, and a performance term measuring the percentage decrease in LLVM IR instruction count (ΔIC).
This design ensures that the agent learns not only to produce formally correct, well-structured outputs but also practically effective optimizations; a minimal sketch of such a composite reward follows.
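The sketch below assumes a simple additive combination and illustrative penalty/bonus values; the paper's actual weighting and component definitions may differ.

```python
def composite_reward(output_well_formed: bool,
                     compiles_and_validates: bool,
                     ic_before: int,
                     ic_after: int) -> float:
    """Toy composite reward combining format correctness, output validity,
    and code-size improvement (Delta IC, the change in LLVM IR instruction
    count). The additive form and constants are illustrative assumptions."""
    # Format term: penalize output that does not follow the expected structure.
    r_format = 0.0 if output_well_formed else -1.0
    # Answer term: reward pass sequences whose output compiles and validates.
    r_answer = 1.0 if compiles_and_validates else -1.0
    # Performance term: fractional decrease in IR instruction count.
    r_perf = (ic_before - ic_after) / ic_before if ic_before > 0 else 0.0
    return r_format + r_answer + r_perf

# Example: well-formed, valid output that shrinks the IR from 1000 to 700 instructions.
print(composite_reward(True, True, ic_before=1000, ic_after=700))  # 1.3
```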
5. Experimental Evaluation and Quantitative Outcomes
AwareCompiler was extensively evaluated using a suite of industry-relevant benchmarks, including blas, cbench, chstone, mibench, npb, opencv, and tensorflow. Noteworthy findings include:
- Substantial Code Size Reductions: AwareCompiler-1.5B reduced LLVM IR instruction count by 30.03% on average, matching or exceeding domain-expert optimizations and outperforming both heuristic presets (-O1/-O2/-O3/-Oz) and LLM-based baselines (e.g., GPT-5, Gemini-2.5) by a significant margin (a sketch of how such reductions can be measured follows this list).
- High Success Rate in Valid Sequence Generation: Context- and knowledge-driven constraints elevated the rate of semantically and syntactically valid pass sequences, particularly on benchmarks such as cbench and chstone.
- Ablation and Reward Impact: Both the knowledge-driven and data-driven components are indispensable; removing either led to sharp performance declines, as did eliminating reward terms (particularly the format and answer validation components).
 
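For context on how such instruction-count reductions can be measured, the sketch below runs a pass pipeline through LLVM's `opt` tool and counts IR instructions before and after using `llvmlite`. The file names, pass list, and use of llvmlite are assumptions for illustration; this is not the paper's evaluation harness.

```python
import subprocess
import llvmlite.binding as llvm

# Standard llvmlite initialization boilerplate.
llvm.initialize()
llvm.initialize_native_target()
llvm.initialize_native_asmprinter()

def count_instructions(ir_text: str) -> int:
    """Count instructions in a textual LLVM IR module."""
    module = llvm.parse_assembly(ir_text)
    return sum(1 for fn in module.functions
                 for block in fn.blocks
                 for _ in block.instructions)

def ic_reduction(input_ll: str, passes: str) -> float:
    """Run `opt` with the given pass pipeline (new pass manager syntax)
    and return the percentage reduction in instruction count."""
    before = count_instructions(open(input_ll).read())
    subprocess.run(["opt", "-S", f"-passes={passes}", input_ll, "-o", "out.ll"],
                   check=True)
    after = count_instructions(open("out.ll").read())
    return 100.0 * (before - after) / before if before else 0.0

# Hypothetical usage on a module 'example.ll' with a small pass pipeline:
# print(ic_reduction("example.ll", "mem2reg,instcombine,simplifycfg"))
```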
6. Practical Implications and Scalability
Several practical benefits emerge from AwareCompiler’s approach:
- Reduced Manual Tuning: By automatically generating valid and high-performing optimization pass sequences, AwareCompiler lifts much of the manual profiling and tuning burden from compiler end-users.
 - Enhanced Robustness and Portability: The agent’s decisions, being informed by both positive and negative historical outcomes, are less prone to induce performance regressions or semantic errors during optimization.
 - Scalability and Adaptability: The retrieval-augmented, knowledge-fused generation mechanism allows efficient scaling across code bases and rapid adaptation to new optimization goals—whether code size, execution speed, or other metrics.
 
7. Future Directions
Future extensions envisaged in the paper include:
- Broader Objective Functions: Extending the framework to optimize for runtime, energy, or other custom objectives beyond code size reduction.
 - Enriched Knowledge and Reasoning: Incorporating more advanced forms of symbolic knowledge, such as deductive or counterfactual reasoning modules, and expanding the empirical corpus.
 - Improved Reward Shaping: Research into finer-grained, context- and state-dependent reward functions to improve training efficiency and agent interpretability.
 - Co-scaled Model and Data Expansions: Scaling up model capacity and dataset diversity, as well as exploring multi-agent reasoning configurations for parallel optimization.
 - Integration with Emerging Compiler Stacks: Tighter coupling with rapidly evolving compiler infrastructures (e.g., LLVM advances, multi-level IR ecosystems) to ensure continued performance and adaptability.
 
In summary, AwareCompiler represents a systematic, agent-driven approach that merges domain knowledge and large-scale learning to automate compiler optimization pass sequence synthesis. Its context-aware, constraint-enforced decision pipeline, built on robust empirical evaluation, significantly outperforms existing baselines in both optimization effectiveness and the structural validity of generated transformations. This positions AwareCompiler as an influential advance in the domain of AI-empowered compiler optimization (Lin et al., 13 Oct 2025).