- The paper presents a neuro-symbolic framework combining agentic AI with a Network Digital Twin to enable intent-driven network change validation.
- It demonstrates high performance with 94–100% error detection and reduced validation latency across synthetic benchmarks and real-world ISP scenarios.
- The modular architecture integrates LLM-based agents and extensible verification tools, paving the way for scalable, automated network assurance.
Agentic AI and Digital Twin-Driven Network Change Validation: An Expert Analysis of Aether
Motivation and Problem Statement
Contemporary network operations rely on fragmented, error-prone, and predominantly manual processes for network change validation. Despite progress in formal network verification, practical deployments remain limited by tool heterogeneity, a lack of semantic validation, insufficient test coverage, and weak operational integration, all of which inhibit scalable, robust assurance. Aether proposes a neuro-symbolic architecture that combines generative agentic AI with a multi-functional Network Digital Twin (NDT) to streamline and automate network change validation end to end.
Figure 1: Network change validation process comprising intent analysis, impact assessment, peer review, testing, and approval.
Aether is motivated by three deficiencies in existing approaches:
- Fragmentation: Incompatibility across model-based, simulation-based, and emulation-based tools creates cognitive and operational friction for operators.
- Cognitive Barriers: Operators must bridge the gap between natural-language (NL) intent and the abstract specifications that verification tools expect, a task for which existing CI/CD pipelines and static test suites are inadequate.
- Agent Operational Requirements: LLM-based agents face context and reasoning constraints, so they need structured, queryable, semantic representations of network state to be effective.
Aether’s architectural principles thus emphasize compositional intent-aware orchestration, unified semantic network representation (the NDT and Knowledge Graph), and seamless integration with NetDevOps/CI workflows.
Aether Architecture
Aether orchestrates specialized AI agents over a unified NDT and is designed for extensibility and compositional tool invocation. Five NetOps-specialized LLM-based agents are mapped to canonical change management stages (impact assessment, test planning, test execution, etc.) and coordinated by a conversational Assistant Agent. The agents query a structured Knowledge Graph, the Network Digital Map (NDM), which is abstracted atop the NDT, and they compose heterogeneous formal verification, simulation, and differential analysis tools.
Figure 2: The Aether workflow coordinates ingestion, intent analysis, test generation, execution, and operator-in-the-loop approval.
Figure 3: Logical architecture depicts five collaborating agent roles and their interactions with the NDT and orchestration services.
Agents and Reasoning Processes
All agents follow the ReAct design pattern, supporting dynamic tool invocation, self-correction, and iterative refinement. Separation of concerns is central: each agent encapsulates domain-specific skills, memory, and interface logic. This results in high modularity and extensibility—for example, introducing new protocol-specific or troubleshooting skills is tractable at the per-agent level. The NDM Query agent is particularly critical, mapping natural language to graph queries and serving as a foundation for all higher-level orchestration.
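The ReAct pattern described above can be illustrated with a minimal sketch. Everything here is hypothetical: the `Step` record, the `react_loop` function, the tool names, and the scripted policy (which stands in for an LLM call) are assumptions for illustration, not Aether's actual implementation. The sketch shows the core loop of reasoning, tool invocation, observation, and self-correction after a failed tool call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    thought: str                 # the agent's reasoning for this turn
    action: Optional[str]        # tool name, or None when the agent is finished
    argument: str = ""           # input passed to the tool
    answer: str = ""             # final answer, used only when action is None


History = List[Tuple[Step, str]]          # (step, observation) pairs
Policy = Callable[[str, History], Step]   # stand-in for an LLM call (assumption)
Tool = Callable[[str], str]


def react_loop(task: str, policy: Policy, tools: Dict[str, Tool], max_steps: int = 5) -> str:
    """Alternate reasoning and tool calls until the policy returns a final answer."""
    history: History = []
    for _ in range(max_steps):
        step = policy(task, history)
        if step.action is None:
            return step.answer
        tool = tools.get(step.action)
        if tool is None:
            # Feed the failure back as an observation so the policy can self-correct.
            observation = f"error: unknown tool {step.action!r}"
        else:
            try:
                observation = tool(step.argument)
            except Exception as exc:
                observation = f"error: {exc}"
        history.append((step, observation))
    return "no answer within step budget"


def scripted_policy(task: str, history: History) -> Step:
    """Hypothetical policy: picks a wrong tool first, then corrects itself."""
    if not history:
        return Step("Try a ping.", "ping", "r1->r9")
    last_observation = history[-1][1]
    if last_observation.startswith("error"):
        return Step("That tool failed; use the reachability check.", "reachability", "r1->r9")
    return Step("Enough evidence.", None, answer=f"verdict based on: {last_observation}")


# Hypothetical tool registry; a real system would wrap verification backends here.
tools = {"reachability": lambda arg: f"{arg} reachable"}
result = react_loop("validate change", scripted_policy, tools)
print(result)
```

Keeping the policy, the tool registry, and the loop separate mirrors the separation of concerns noted above: a new skill is a new entry in the registry, with no change to the loop itself.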
Network Digital Twin (NDT) and NDM
Aether's NDT provides a temporal, multi-layered Knowledge Graph, built on OpenConfig with Aether-specific extensions, that represents snapshots of the network's multi-faceted state (physical topology, configuration, policies, performance metrics, and computed or inferred attributes) at per-device, per-layer granularity.
Figure 4: NDM Knowledge Graph structure with explicit layering and relationship semantics between device nodes and network properties.
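As a rough illustration of a layered, temporal graph of this kind, the sketch below tags every relationship with a layer and a snapshot identifier and filters on both at query time. The class names, relation names, layer names, and snapshot labels are all assumptions made for the example; the actual NDM schema is not described at this level of detail here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Edge:
    src: str        # source node, e.g. a device
    relation: str   # relationship semantics between the two nodes
    dst: str        # target node
    layer: str      # which layer of the graph the edge belongs to
    snapshot: str   # which point-in-time snapshot the edge belongs to


class LayeredGraph:
    """Toy in-memory graph; hypothetical stand-in for a graph database."""

    def __init__(self) -> None:
        self._edges: List[Edge] = []

    def add(self, src: str, relation: str, dst: str, layer: str, snapshot: str) -> None:
        self._edges.append(Edge(src, relation, dst, layer, snapshot))

    def query(self, snapshot: str, layer: Optional[str] = None,
              relation: Optional[str] = None) -> List[Edge]:
        """Return edges of one snapshot, optionally narrowed by layer and relation."""
        return [
            e for e in self._edges
            if e.snapshot == snapshot
            and (layer is None or e.layer == layer)
            and (relation is None or e.relation == relation)
        ]

    def neighbors(self, node: str, snapshot: str, layer: str) -> List[str]:
        """Sorted targets reachable from `node` in one layer of one snapshot."""
        return sorted(e.dst for e in self.query(snapshot, layer) if e.src == node)


ndm = LayeredGraph()
ndm.add("r1", "connected_to", "r2", layer="physical", snapshot="t0")
ndm.add("r1", "isis_adjacent", "r2", layer="routing", snapshot="t0")
ndm.add("r1", "connected_to", "r2", layer="physical", snapshot="t1")
ndm.add("r1", "connected_to", "r3", layer="physical", snapshot="t1")

before = ndm.neighbors("r1", snapshot="t0", layer="physical")
after = ndm.neighbors("r1", snapshot="t1", layer="physical")
print(before, after)
```

Scoping each query to one snapshot and one layer is what lets an agent ask narrow questions ("what changed at the physical layer for r1?") without loading the full network state into its context.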
The NDT exposes a “git-like” workflow for managing candidate change validation, supporting branching, rebasing, and differential validation via snapshotting. Verification tools (Batfish, Routenet, custom simulators, anomaly detectors) are integrated via the Model Context Protocol (MCP), which decouples LLM agent actions from the execution and interpretation of the underlying checks.
Figure 5: NDT workflow for snapshot-based differential analysis and candidate validation, analogous to Git operations.
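The Git analogy can be made concrete with a small sketch: a candidate change is applied on a branch copied from the baseline, and validation looks only at the difference between the two. The `SnapshotStore` class, its method names, and the configuration keys are hypothetical, and rebasing is deliberately left out to keep the example short.

```python
import copy
from typing import Any, Dict, List, Tuple

Config = Dict[str, Dict[str, Any]]          # device -> {setting: value}
Change = Tuple[str, str, Any, Any]          # (device, setting, old, new)


class SnapshotStore:
    """Toy branch-and-diff store; a hypothetical stand-in for NDT snapshotting."""

    def __init__(self, baseline: Config) -> None:
        self._branches: Dict[str, Config] = {"main": copy.deepcopy(baseline)}

    def branch(self, name: str, source: str = "main") -> None:
        # A branch is an independent copy, so edits never touch the source.
        self._branches[name] = copy.deepcopy(self._branches[source])

    def apply(self, branch: str, device: str, key: str, value: Any) -> None:
        self._branches[branch].setdefault(device, {})[key] = value

    def diff(self, base: str, candidate: str) -> List[Change]:
        """List every setting whose value differs between two branches."""
        a, b = self._branches[base], self._branches[candidate]
        changes: List[Change] = []
        for device in sorted(set(a) | set(b)):
            old, new = a.get(device, {}), b.get(device, {})
            for key in sorted(set(old) | set(new)):
                if old.get(key) != new.get(key):
                    changes.append((device, key, old.get(key), new.get(key)))
        return changes


store = SnapshotStore({"r1": {"isis_metric": 10}, "r2": {"isis_metric": 10}})
store.branch("candidate")
store.apply("candidate", "r1", "isis_metric", 50)

delta = store.diff("main", "candidate")
print(delta)
```

In a differential workflow, the resulting delta is what determines which checks need to run, rather than re-verifying the whole network for every candidate.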
Implementation
Aether uses modern agentic protocols (A2A, SLIM) for agent interoperability and secure, low-latency communication. The LLM-based agents (GPT-4o) are orchestrated via a common SDK (LlamaIndex) and are exposed to tool APIs and data schemas through explicit prompt scaffolding and tool description tokenization. The NDM uses ArangoDB for high-performance graph operations, with modular ingestion pipelines for integration with enterprise orchestrators (e.g., NSO).
Key extensions include Batfish enhancements (e.g., SRv6, complex IS-IS scenarios) and a plug-in framework for integrating GNN-based or simulation-based performance checking tools (such as Routenet), exposing high-level verification methods for agents.
Experimental Evaluation
Synthetic Benchmarks
Aether was evaluated on eight synthetic scenarios covering policy misconfigurations, protocol logic errors (e.g., route redistribution and summarization bugs), and latent path defects—selected for operational criticality and diagnostic diversity. Each scenario includes faulty and correct candidate configuration changes, multiple NL intent formulations, and both agent-level and end-to-end system evaluation using LLM-as-a-judge metrics (Correctness, Consistency, Robustness).
Figure 6: Single agent correctness across scenarios, quantifying output agreement with expert ground truth.
Empirically:
- Single-agent correctness consistently exceeds 0.7 across agents and tasks, with the NDM Query agent achieving the highest performance save for isolated schema-induced errors.
- End-to-end (E2E) error detection: Aether blocks 94% of broken candidates and identifies the main error in 100% of cases in most scenarios. Precision (correct classification) varies, reflecting conservative verdicts and the impact of data/model limitations.
- Coverage and efficiency: Agent-generated test plans match or exceed ground truth requirements, with 92–96% diagnostic coverage and minimal redundancy.
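To make the metrics in this list concrete, the sketch below shows one plausible way to compute a blocking rate, a precision, and a test plan coverage from labeled runs. The record fields, function names, and toy data are illustrative assumptions and are not the paper's results; precision is computed in the conventional sense (share of blocked changes that were truly faulty), which may differ from the exact definition used in the evaluation.

```python
from typing import Dict, Iterable, List


def blocking_rate(runs: List[Dict[str, bool]]) -> float:
    """Share of faulty candidates that the system blocked."""
    faulty = [r for r in runs if r["faulty"]]
    return sum(r["blocked"] for r in faulty) / len(faulty)


def precision(runs: List[Dict[str, bool]]) -> float:
    """Share of blocked candidates that were actually faulty (conventional precision)."""
    blocked = [r for r in runs if r["blocked"]]
    return sum(r["faulty"] for r in blocked) / len(blocked)


def coverage(generated: Iterable[str], required: Iterable[str]) -> float:
    """Share of required checks that appear in the generated test plan."""
    required_set = set(required)
    return len(set(generated) & required_set) / len(required_set)


# Toy data for illustration only (not taken from the evaluation).
runs = [
    {"faulty": True, "blocked": True},
    {"faulty": True, "blocked": True},
    {"faulty": True, "blocked": True},
    {"faulty": True, "blocked": False},   # missed fault
    {"faulty": False, "blocked": True},   # conservative false positive
    {"faulty": False, "blocked": False},
]

rate = blocking_rate(runs)
prec = precision(runs)
cov = coverage(
    generated=["reachability", "loop_free", "policy", "extra_check"],
    required=["reachability", "loop_free", "policy", "latency"],
)
print(rate, prec, cov)
```

The false-positive record shows how a conservative system can keep a high blocking rate while its precision drops, which is the trade-off the second bullet describes.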
Real-World Case Studies
Aether was deployed on a replica of a large-scale ISP network (25 routers, >30k production-equivalent configuration lines). Two historical change incidents, an IS-IS redistribution loop and an SRv6 prefix summarization blackhole, served as test cases. Both require complex, protocol-level validation that static checks and configuration linters cannot capture.
- Error detection: On both incidents, Aether blocked all faulty changes (100% detection), identified the main error with 100% precision, and achieved test plan coverage exceeding 91%.
- Workflow latency: Average validation time is 6–7 minutes, an order-of-magnitude reduction versus manual validation; more than 50% of that time is attributable to verification tool execution rather than agentic reasoning.
- Agent weaknesses: Errors predominantly arise from query generation or schema mapping in the NDM Query Agent under complex multi-layered data; improvements are possible via skill injection and per-agent schema enrichment.
Theoretical and Practical Implications
Aether demonstrates that neuro-symbolic, agentic AI can bridge the operational chasm between NL intent and compositional, rigorous network validation. By leveraging LLM-based agents for orchestration atop a modular Network Digital Twin, Aether achieves:
- Differential, intent-aware validation: Agents dynamically generate targeted tests and execute conditional checks based on NL-expressed goals, outperforming static CI/CD approaches in recognizing nuanced, business-critical faults.
- Extensible verification composition: The architecture allows rapid integration of new verification backends, protocol- or domain-specific skills, and customization by operators.
- Human-in-the-loop (HITL) compatibility: HITL enforcement and natural language workflows facilitate operator oversight and integration into existing NetDevOps environments.
- Automation with guardrails: The approach yields substantial improvements in operational agility, reducing change validation latency and error prevalence, while retaining transparency and traceability.
Limitations and Future Directions
Challenges remain in achieving full autonomy and eliminating residual false positives; both require enhanced schema understanding, richer agent skills, and tighter feedback loops from operator corrections and post-mortem failure analysis. Integration with production CI/CD pipelines and real-time anomaly detection are the next frontiers, as is the operationalization of relational specification languages such as Relational NetKAT for fine-grained, intent-aligned validation.
Advances in agent skill injection, multi-agent persistent memory, and the systematic curation of domain- and protocol-specific knowledge bases will make Aether and similar frameworks more robust and amenable to large-scale deployment.
Conclusion
Aether establishes a new paradigm for automated, intent-driven network change validation by uniting agentic AI orchestration with a robust, extensible Network Digital Twin. The empirical results—94–100% error detection, high coverage, significant reduction in manual effort and validation time—demonstrate practical viability for safe network automation. The approach is generalizable, supporting rapid introduction of new agents, protocol extensions, and operational policies, and lays groundwork for further research in multi-agent network autonomy, operator-AI co-piloting, and closed-loop assurance.