- The paper introduces A1gent, a framework that splits reasoning and execution to safely enforce operator intents in O-RAN with full auditability.
- It combines LLM-assisted intent translation with deterministic, near-real-time actuation under Infrastructure-as-Code guardrails to mitigate instability and safety gaps.
- Experimental evaluations show significant gains in throughput, SINR, fairness, and load balancing, particularly during emergency and recovery phases.
Agentic Open RAN: Deterministic and Auditable Framework for Intent-Driven Radio Control
Motivation and Context
The proliferation of AI-native wireless networks has widened the gap between operator-level intents and low-level deterministic actuation. The O-RAN architecture abstracts control through the A1 (policy), E2 (near-real-time control), and O1 (network management) interfaces, yet two challenges persist: (i) preserving deterministic guarantees in the presence of LLM-driven agentic reasoning and (ii) ensuring replayable and auditable control at scale. Prior agentic solutions such as AgentRAN and Dual-MCP demonstrate the feasibility of intent-level composition but suffer from instability and safety gaps when LLM-driven adaptation is tightly coupled with near-RT loops. The present work, A1gent, addresses these challenges with a structured reasoning–execution split: agentic orchestration is confined to the Non-RT domain, while safety-bounded deterministic loops run within the Near-RT RIC.
Figure 1: The agentic control loop delineates data-layer telemetry flow and policy routing across rApp, xApps, E2, and O1, enforcing phase-scoped actuation and deterministic execution.
Architectural Design
A1gent institutes a hierarchical control pipeline comprising a Non-RT orchestrator rApp and three Near-RT xApps specializing in QoE protection, load balancing, and energy saving. The rApp ingests operator intents (expressed in natural language), topological data, and telemetry summaries, and invokes an LLM-assisted compiler under Infrastructure-as-Code (IaC) constraints to output typed A1 policy instances. These instances are dispatched via channels to the xApps, which actuate through E2 (mobility and load) and O1 (energy) and are governed by immutable safety envelopes and fine-grained IaC guardrails.
Figure 2: The full-stack agentic architecture employs LLM for intent translation, hierarchical policy selection/objective blending, and IaC for policy enforcement and audit trails.
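As a minimal sketch of what checking a typed, versioned A1 policy instance could look like, the Python below validates a compiler-emitted instance against a registered schema before it is published. The policy type `load_balance`, its field names and ranges, and the helper `validate_policy` are illustrative assumptions, not the schemas used by A1gent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Type and allowed range for one policy field."""
    kind: type
    lo: float
    hi: float


# Hypothetical schema registry: policy type -> (schema version, field specs).
SCHEMAS = {
    "load_balance": (1, {
        "prb_gap_threshold": FieldSpec(float, 0.05, 0.50),
        "max_handovers_per_tick": FieldSpec(int, 0, 10),
    }),
}


def validate_policy(policy_type, version, values):
    """Return a list of validation errors; an empty list means 'accepted'."""
    if policy_type not in SCHEMAS:
        return [f"unknown policy type: {policy_type}"]
    schema_version, fields = SCHEMAS[policy_type]
    errors = []
    if version != schema_version:
        errors.append(f"version {version} != registered {schema_version}")
    for name in values:
        if name not in fields:
            errors.append(f"unexpected field: {name}")
    for name, spec in fields.items():
        if name not in values:
            errors.append(f"missing field: {name}")
            continue
        value = values[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, spec.kind):
            errors.append(f"{name}: expected {spec.kind.__name__}")
        elif not spec.lo <= value <= spec.hi:
            errors.append(f"{name}: {value} outside [{spec.lo}, {spec.hi}]")
    return errors
```

An instance that fails validation would simply not be published, so LLM output never reaches an xApp unchecked.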
Typed A1 contracts ensure every xApp registers policy schemas, with instances validated and versioned for full auditability. Conflict management is realized through a fixed-priority action merger (Energy → QoE → Load), deduplicating actions and applying clamps, cooldowns, and global budgets before interface dispatch.
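The merger described above can be sketched roughly as follows. This is an illustration under stated assumptions: the arrow order is read as Energy having the highest priority, the `Action` fields and the knob names are hypothetical, and cooldown handling is omitted for brevity.

```python
from dataclasses import dataclass

# Assumed reading of "Energy -> QoE -> Load": lower number wins conflicts.
PRIORITY = {"energy": 0, "qoe": 1, "load": 2}


@dataclass(frozen=True)
class Action:
    source: str   # proposing xApp: "energy", "qoe", or "load"
    target: str   # hypothetical cell or UE identifier
    knob: str     # hypothetical parameter name
    value: float


def merge_actions(actions, clamps, budget):
    """Merge proposed actions by fixed priority, dedupe, clamp, and cap."""
    # sorted() is stable, so proposals from one xApp keep their order.
    ordered = sorted(actions, key=lambda a: PRIORITY[a.source])
    seen = set()
    merged = []
    for action in ordered:
        if len(merged) >= budget:
            break  # global budget exhausted
        key = (action.target, action.knob)
        if key in seen:
            continue  # a higher-priority xApp already claimed this knob
        seen.add(key)
        lo, hi = clamps.get(action.knob, (float("-inf"), float("inf")))
        clamped = min(max(action.value, lo), hi)
        merged.append(Action(action.source, action.target, action.knob, clamped))
    return merged
```

Because the ordering and clamps are fixed, the same set of proposals always yields the same dispatched actions, which is what makes the merge replayable.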
Adaptive Policy Tuning and Safety Governance
A distinctive feature is the Adaptive Policy Tuner (APT), which operates training-free, proposing incremental edits to soft policy tunables based on rolling KPIs and logged action-outcome pairs. This module inspects cellular and UE-level distributions, adjusts parameters within rate limits and per-field clamps, and logs all changes with lineage and reason for traceability. Hard guardrails (parameter ranges, dwell times, active sector limits) are enforced both at policy publication and actuation, keeping adaptation predictable without retraining and preserving system explainability.
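A rate-limited, clamped edit of the kind attributed to the APT could look like the sketch below. The `Tunable` structure, the `propose_edit` helper, and the log record layout are assumptions made for illustration; the source does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tunable:
    """Hard envelope for one soft policy field."""
    lo: float
    hi: float
    max_step: float  # largest change allowed per edit (rate limit)


def propose_edit(name, current, desired, spec, reason, log):
    """Move `current` toward `desired` by at most one rate-limited step."""
    step = max(-spec.max_step, min(spec.max_step, desired - current))
    new_value = max(spec.lo, min(spec.hi, current + step))
    # Every edit is logged with its old value, new value, and reason.
    log.append({"field": name, "old": current, "new": new_value, "reason": reason})
    return new_value
```

For example, with a field limited to [-6, 6] and a step of 1, a request to jump from 2 to 5 yields 3, and a request to move from 5.5 to 9 is held at 6.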
Agentic Control Algorithms
The orchestrator runs on a 1 Hz tick, compiling rolling KPI views and invoking LLM prompts for phase selection and policy value instantiation. Fallback heuristics ensure continuity if the LLM times out. Each xApp runs short-horizon deterministic control, supplemented by non-blocking LLM prompts for context-aware actuation:
- QoE Firefighter: Evaluates PRB headroom and SINR, proposing E2-level HOs for distressed users under dwell-time and handover constraints.
- Load Balancer: Identifies PRB gaps, issues RC handover-trigger offset updates and targeted MHO, ensuring stability via priority ordering and controlled granularity.
- Energy Saver: Monitors idle and active states, orchestrates sleep/wake intents via O1 using PRB, scheduling, and activity thresholds.
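The timeout-with-fallback behavior of the orchestrator tick can be sketched as below. This is a hedged illustration: `select_phase`, `fallback_phase`, the `prb_util` KPI key, and the 0.85 threshold are hypothetical, and `llm_call` stands in for whatever LLM client is used.

```python
import concurrent.futures as cf

PHASES = ("normal", "emergency", "recovery")
_POOL = cf.ThreadPoolExecutor(max_workers=2)


def fallback_phase(kpis):
    """Hypothetical heuristic: flag an emergency when PRB utilization is high."""
    return "emergency" if kpis["prb_util"] > 0.85 else "normal"


def select_phase(kpis, llm_call, timeout_s=0.5):
    """Ask the LLM for a phase, but never block the tick past `timeout_s`."""
    future = _POOL.submit(llm_call, kpis)
    try:
        phase = future.result(timeout=timeout_s)
    except cf.TimeoutError:
        future.cancel()  # best effort; a call that is already running continues
        return fallback_phase(kpis), "fallback:timeout"
    except Exception:
        return fallback_phase(kpis), "fallback:error"
    if phase not in PHASES:
        # Reject anything outside the known phase set.
        return fallback_phase(kpis), "fallback:invalid"
    return phase, "llm"
```

The second element of the returned tuple records which path produced the decision, so it can be written to the audit log. For a 1 Hz tick the timeout would need to stay well below one second.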
Experimental Evaluation
Experiments on ns3-oran urban macrocell topology with heterogeneous UE mobility and multi-flow traffic (eMBB, URLLC, V2X, mMTC) assess policy efficacy under three phases: Normal, Emergency (surge), Recovery.
Emergency Phase: Tail Protection
A1gent demonstrates substantial improvements in tail robustness and fairness during traffic surges:
- p10 Downlink Throughput: Increases from 0.280 to 0.336 Mbps; p90 is compressed from 2.230 to 1.080 Mbps, indicating effective upper-tail compaction for resource reallocation.
- SINR p10: Improves from 0.34 to 0.84 dB under agentic control.
- Outage Fraction (<0.10 Mbps): Decreases by 1.5 percentage points compared to baseline.
- Emergency p90/p10 Ratio: Down by 59.7%, confirming fairness enhancement.
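The fairness figure can be cross-checked from the percentiles listed above. The short calculation below uses only those rounded values; it gives p90/p10 ratios of about 7.96 (baseline) and 3.21 (agentic), a reduction of roughly 59.6%, which agrees with the reported 59.7% to within rounding of the inputs.

```python
def tail_ratio(p90_mbps, p10_mbps):
    """p90/p10 throughput ratio; lower means a tighter, fairer distribution."""
    return p90_mbps / p10_mbps


baseline_ratio = tail_ratio(2.230, 0.280)
agentic_ratio = tail_ratio(1.080, 0.336)

# Relative reduction of the ratio, in percent.
reduction_pct = 100.0 * (1.0 - agentic_ratio / baseline_ratio)

# Relative gain of the 10th-percentile throughput, in percent.
p10_gain_pct = 100.0 * (0.336 / 0.280 - 1.0)

print(f"baseline p90/p10: {baseline_ratio:.2f}")
print(f"agentic  p90/p10: {agentic_ratio:.2f}")
print(f"ratio reduction:  {reduction_pct:.1f}%")
print(f"p10 gain:         {p10_gain_pct:.1f}%")
```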
Figure 3: Percentile throughput distributions during the emergency phase showcase lower outage probability and tighter tail fairness under agentic control.
Figure 4: Affected UE timeline reveals rapid throughput recovery and sustained improvements for edge and incident-zone UEs post-surge under adaptive orchestration.
Recovery Phase: Sustained QoE and Load Rebalance
During network recovery, A1gent achieves:
- 20th Percentile DL Throughput: 24% increase (from 0.298 to 0.370 Mbps).
- Incident-Zone Load Share: Rises from 17% to 44%, exemplifying spatial rebalancing.
- Extreme Low-Rate Tail: Further decreased by 3.4 percentage points.
Stability in uplink wideband interference and packet loss attests to consistent envelope enforcement.
Mobility and Stability
Agentic policies produce bidirectional mobility flows, in contrast to the concentrated events observed in the baseline. The emergency-phase 90th-percentile dwell time rises from 43 s to 53 s, indicating fewer ping-pong handovers; the 99th percentile remains stable post-recovery. The orchestrated algorithms ensure deterministic, reproducible mobility transitions without sacrificing real-time guarantees.
Figure 5: Capability radar encapsulates agentic control performance gains across throughput, fairness, spatial load rebalance, and energy savings relative to baseline.
Implications and Future Directions
A1gent establishes a formalized reasoning–execution split, strictly separating LLM-driven intent translation from deterministic, safety-constrained actuation. This paradigm enables reproducible, auditable control in mission-critical RAN environments, advancing explainability and verifiability absent in previous agentic systems. The practical implication is the deployment of self-governing radio access networks capable of operator-level intent fulfillment, bounded adaptation, and robust conflict management without the instability or opacity characteristic of unbounded LLM loops.
Theoretical extensions include expanding the typed A1 policy catalog, integrating hardware-in-the-loop for E2/O1 interfaces, and cross-plane agentic synthesis for additional O-RAN control domains (e.g., slicing, transport orchestration). Immediate practical avenues involve merging multi-phase conflict evaluation tools (PACIFISTA-style governance) with agentic autonomy and exploring cross-vendor interoperability leveraging standardized schema contracts.
Conclusion
A1gent introduces a deterministic and auditable agentic control stack for O-RAN, enabling safe, intent-driven orchestration without compromising real-time guarantees or explainability. By embedding LLM and IaC-based adaptation in the Non-RT domain and strictly policing actuation in the Near-RT loops, the framework achieves reproducible and traceable multi-agent coordination with strong numerical gains in throughput, fairness, and spatial load distribution. Future research will focus on hardware-level integration and policy set expansion to further advance scalable, AI-native radio management.