- The paper presents DTAR, a framework that combines NSGA-II domain partitioning with a GAT-enhanced, action-masked PPO agent for adaptive inter-domain routing in LEO satellite networks.
- It uses a graph attention (GAT) encoder to process real-time traffic, fault, and load data, with the aim of reducing delay, alleviating congestion, and increasing routing success.
- Experimental results indicate that DTAR improves routing success by up to 9.25 percentage points over the hop-based Dijkstra baseline under link faults, and keeps load imbalance lowest under traffic surges.
Traffic-Aware Domain Partitioning and Load-Balanced Inter-Domain Routing in LEO Satellite Networks
Introduction and Problem Context
The increasing scale and criticality of Low Earth Orbit (LEO) satellite networks, exemplified by deployments such as Starlink and OneWeb, have made efficient resource management across dynamically evolving, large-scale topologies a foundational challenge. Key performance bottlenecks arise from high node mobility, non-uniform and spatiotemporally fluctuating traffic patterns, and frequent stochastic link failures intrinsic to LEO constellations. Existing protocols, ranging from classical shortest-path routing to Q-learning and load-weighted heuristics, fall short in jointly optimizing load balance, adaptability to real-time link states, and robustness against faults in the inter-domain routing context. This work presents the DTAR framework, which integrates multi-objective evolutionary optimization and deep reinforcement learning to deliver adaptive, reliable, and load-balanced inter-domain routing in partitioned LEO satellite networks.
Figure 1: The LEO satellite inter-domain routing problem: load-balanced path selection across partitioned domains.
DTAR Framework: Design and Methodology
DTAR operates through a two-stage process decoupling global, offline domain partitioning from online, real-time adaptive routing. The first stage utilizes NSGA-II to construct traffic-aware, load-balanced domain partitions, while the second stage employs a GAT-based encoder and an action-masked PPO agent for real-time path selection on the inter-domain graph.
Figure 2: The proposed DTAR framework.
Offline Domain Partitioning via NSGA-II
Given a detailed traffic matrix and LEO topology, DTAR applies NSGA-II to optimize two objectives in domain partitioning:
- Maximize Intra-Domain Traffic Ratio (IDTR): Encourages more traffic to be routed within domains, localizing congestion and reducing cross-domain coordination overhead.
- Minimize Inter-Domain Load Deviation (σ_L): Ensures that traffic loads are evenly distributed across domains, structurally underpinning load balancing.
NSGA-II is adapted to the high-dimensional search space through a genetic encoding of domain assignments, diverse crossover and mutation operators, and multi-phase repair operations that enforce size and connectivity constraints. A partition is then selected from the resulting Pareto front to balance high IDTR against low σ_L.
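As an illustration of the final selection step, the sketch below filters candidate partitions down to the non-dominated set and then picks one compromise point. The dominance test is the standard one for a maximize/minimize objective pair. The selection rule (closest to the ideal point after min-max normalization) is an assumption made here for illustration, since the summary does not state how the final partition is chosen.

```python
import numpy as np

def pareto_front(points):
    """Indices of non-dominated candidates.

    points[k] = (idtr, sigma_l); higher idtr and lower sigma_l are better.
    """
    pts = np.asarray(points, dtype=float)
    front = []
    for i, (idtr_i, sig_i) in enumerate(pts):
        dominated = False
        for j, (idtr_j, sig_j) in enumerate(pts):
            if j == i:
                continue
            no_worse = idtr_j >= idtr_i and sig_j <= sig_i
            strictly_better = idtr_j > idtr_i or sig_j < sig_i
            if no_worse and strictly_better:
                dominated = True
                break
        if not dominated:
            front.append(i)
    return front

def pick_compromise(points, front):
    """Hypothetical rule: front member closest to the normalized ideal point."""
    pts = np.asarray(points, dtype=float)[front]
    lo = pts.min(axis=0)
    span = pts.max(axis=0) - lo
    span[span == 0] = 1.0          # avoid division by zero on flat objectives
    norm = (pts - lo) / span
    # Ideal point: normalized idtr = 1, normalized sigma_l = 0.
    dist = np.hypot(1.0 - norm[:, 0], norm[:, 1])
    return front[int(np.argmin(dist))]

candidates = [(0.6, 10.0), (0.7, 12.0), (0.5, 11.0), (0.8, 20.0)]
front = pareto_front(candidates)
chosen = pick_compromise(candidates, front)
```

Here the third candidate is dominated by the first (lower IDTR and higher deviation), so it drops out before the compromise is picked.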
Domain-Level Graph Representation and Real-Time State Encoding
Within each partitioned domain, intra-domain routing employs standard shortest-path logic and is orthogonal to this work’s focus. The inter-domain topology is modeled as a graph with dynamic node and edge attributes encoding real-time traffic, fault, and load information. For robust DRL policy learning, a two-layer GAT aggregates these features, mixing information from both node states and edge attributes to generate domain embeddings sensitive to link loads, failures, and congestion surges.
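The attention-based aggregation described above can be sketched in NumPy as a single-head layer that scores each neighbor from both node and edge features. This is one common way to fold edge attributes into GAT attention, not necessarily the formulation used in the paper. All parameter names (`W`, `W_e`, `a`) and the feature layout are assumptions, and the weights are random rather than trained.

```python
import numpy as np

def gat_layer(h, edge_feat, adj, W, W_e, a, slope=0.2):
    """One single-head attention layer over the domain-level graph.

    h         : (N, F) node features, e.g. domain load
    edge_feat : (N, N, E) edge features, e.g. link load and fault flag
    adj       : (N, N) adjacency, adj[i, j] nonzero if j neighbors i
    W (F, D), W_e (E, D), a (3 * D,) : layer parameters
    """
    n = h.shape[0]
    z = h @ W                       # projected node features, (N, D)
    ze = edge_feat @ W_e            # projected edge features, (N, N, D)
    d = z.shape[1]

    # score[i, j] = a1.z_i + a2.z_j + a3.ze_ij, then LeakyReLU.
    scores = (z @ a[:d])[:, None] + (z @ a[d:2 * d])[None, :] + ze @ a[2 * d:]
    scores = np.where(scores > 0, scores, slope * scores)

    # Attend only to neighbors and to the node itself.
    mask = np.asarray(adj).astype(bool) | np.eye(n, dtype=bool)
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha

rng = np.random.default_rng(0)
N, F, E, D = 3, 4, 2, 5
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])   # path graph 0-1-2
h = rng.normal(size=(N, F))
edge_feat = rng.normal(size=(N, N, E))
out, alpha = gat_layer(h, edge_feat, adj,
                       rng.normal(size=(F, D)),
                       rng.normal(size=(E, D)),
                       rng.normal(size=3 * D))
```

A two-layer encoder would apply a nonlinearity to `out` and pass it through a second layer with its own parameters, so each domain embedding reflects its two-hop neighborhood.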
Online Adaptive Routing with Action-Masked PPO
The core of DTAR’s online adaptation is a PPO agent operating over the domain-level topology, using GAT-produced embeddings as inputs. The action masking mechanism ensures only feasible (physically reachable and within hop constraints) neighbors are available at each decision step, preventing wasted updates on invalid transitions due to faults or budget violations. The composite reward function unifies incentives for path efficiency and success rate, with shaping for directionality, hop penalty, arrival rewards, and hard penalties for routing failures.
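Action masking is typically implemented by setting the logits of infeasible actions to negative infinity before the softmax, so they receive zero probability and contribute nothing to the policy gradient. The sketch below shows that mechanism. The feasibility rule in `feasible_neighbors` (link exists, link is up, hop budget not exhausted) is a simplified assumption based on the description above, and both function names are hypothetical.

```python
import numpy as np

def feasible_neighbors(current, adj, link_up, hops_used, hop_budget):
    """Boolean mask over domains that may be chosen as the next hop."""
    if hops_used >= hop_budget:
        return np.zeros(adj.shape[0], dtype=bool)
    return adj[current].astype(bool) & link_up[current].astype(bool)

def masked_policy(logits, feasible):
    """Softmax over logits with infeasible actions forced to probability 0."""
    logits = np.asarray(logits, dtype=float)
    feasible = np.asarray(feasible, dtype=bool)
    if not feasible.any():
        raise ValueError("no feasible next hop")
    masked = np.where(feasible, logits, -np.inf)
    masked = masked - masked.max()      # numerical stability
    probs = np.exp(masked)
    return probs / probs.sum()

adj = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [1, 0, 1, 0]])
link_up = np.ones_like(adj)
link_up[0, 2] = link_up[2, 0] = 0       # link between domains 0 and 2 has failed
mask = feasible_neighbors(0, adj, link_up, hops_used=1, hop_budget=6)
probs = masked_policy([0.5, 1.0, 2.0, 3.0], mask)
```

In a PPO update, the log-probability of the sampled action would be taken from this masked distribution in both the old and new policy, so the probability ratio stays well defined.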
Experimental Results and Analysis
Simulation Setup
A simulator emulates a 288-satellite Walker constellation under three dynamic scenarios: normal operation, localized traffic surge, and random link failures. Four baseline methods (Dijkstra, ELB, QRLSN, and CDPAR) are used for comparative evaluation, representing varied strategies for LEO routing.
Main Results: Performance Across Scenarios
DTAR demonstrates statistically significant superiority over strong baselines across core metrics, including inter-domain link load CV, end-to-end delay (measured as inter-domain hops × average propagation delay), packet loss rate, and routing success rate.
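Two of these metrics are simple enough to write down directly. The sketch below computes the coefficient of variation (CV) of inter-domain link loads as standard deviation over mean, and the delay proxy as hop count times average propagation delay, as stated above. The use of the population standard deviation and the millisecond unit are assumptions.

```python
import numpy as np

def load_cv(link_loads):
    """Coefficient of variation of link loads: std / mean (0 if no load)."""
    loads = np.asarray(link_loads, dtype=float)
    mean = loads.mean()
    return loads.std() / mean if mean > 0 else 0.0

def end_to_end_delay(inter_domain_hops, avg_propagation_delay_ms):
    """Delay proxy: inter-domain hops times average per-hop propagation delay."""
    return inter_domain_hops * avg_propagation_delay_ms

balanced = load_cv([10, 10, 10, 10])
skewed = load_cv([0, 20])
delay_ms = end_to_end_delay(4, 12.5)
```

A lower CV means traffic is spread more evenly across inter-domain links, so a perfectly balanced network scores 0.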




Figure 3: Performance comparison of routing algorithms under normal, surge, and fault scenarios.
- Normal Scenario: DTAR achieves markedly lower CV and end-to-end delay compared to ELB, QRLSN, and CDPAR. Notably, approaches leveraging load-spreading heuristics without explicit domain or real-time awareness often underperform even simple shortest-path on CV.
- Traffic Surge: DTAR effectively redistributes load, maintaining the lowest CV during 5× traffic bursts. The GAT’s surge hotspot feature enables preemptive detouring before congestion cascades.
- Link Faults: DTAR raises success rates by up to 9.25 percentage points over the hop-based Dijkstra baseline due to action masking and fault-aware edge encoding, with corresponding reductions in packet loss. Fault-agnostic baselines (ELB, QRLSN) drop below 80% success.
Component Analysis via Ablation
Ablation studies isolate the contributions of the NSGA-II partitioner and the GAT encoder. Replacing either component results in pronounced CV degradation under all three test scenarios.
Figure 4: Ablation study: CV comparison across three scenarios.
- The NSGA-II partitioning is especially critical under fault scenarios, anchoring network resilience structurally.
- The GAT encoder’s benefits are pronounced under surge, equipping the PPO agent for precise, traffic-state-aware rerouting.
Training Dynamics
DTAR achieves faster and more stable convergence in RL training compared to CDPAR and QRLSN, attributed to the interplay of graph-structured input, action masking, and partition-induced traffic localization.
Figure 5: Training convergence comparison.
Implications and Future Directions
DTAR introduces a scalable, traffic- and topology-aware paradigm for inter-domain routing in LEO satellite networks, combining Pareto-optimal partitioning with graph-enhanced deep RL and action feasibility constraints. By lowering load imbalance, minimizing delay, and boosting reliability under dynamic load and topology disturbances, DTAR substantiates the value of hybrid evolutionary-DRL pipelines for future space-based networks.
The research also highlights several theoretical implications:
- Explicit domain-based graph abstraction, when aligned with traffic profiles, significantly improves RL-driven load balancing.
- Real-time embedding of link load and fault information via GNNs provides a decisive advantage over plain fully connected (FC) architectures.
- Action masking is essential for policy learning in non-stationary, partially observable networking environments.
Potential avenues for future work include integration of intra-domain RL optimization, generalization beyond Walker-type constellations, and extension to multi-layer (hybrid terrestrial–space) and cross-provider constellations.
Conclusion
DTAR offers a comprehensive solution to the persistent challenge of load-balanced, resilient inter-domain routing in LEO satellite networks by integrating offline evolutionary partitioning and online GAT-enhanced RL routing with strict action feasibility constraints. Numerical results substantiate its superiority in diverse scenarios. The framework’s modularity and adaptivity lay a foundation for extending RL-based optimization to even larger, more heterogeneous networked systems in space and beyond.