Routing with Generated Data (RGD)
- Routing with Generated Data (RGD) is a framework that uses synthetic data to simulate queries and traffic patterns, improving routing optimization in networking, circuit design, and AI model selection.
- It generates annotated traffic matrices and synthetic instances to emulate complex network conditions for robust policy training and benchmarking.
- RGD employs techniques like DRL and GNN-based architectures to achieve higher routing efficiency and scalability compared to traditional heuristics.
Routing with Generated Data (RGD) encompasses a family of routing algorithms and learning frameworks in which synthetic, annotated, or derived data is explicitly generated and leveraged to improve model selection, policy learning, or network optimization. RGD subsumes approaches in networking, circuit design, and AI model selection where the availability of labeled real-world data is limited or insufficiently diverse for effective learning, necessitating the use of routing data or synthetic problem distributions to facilitate training, benchmarking, or operation.
1. Formal Definitions and Foundational Principles
The RGD setting is characterized by the dependence on generated queries, problem instances, or traffic matrices for routing or expert selection. In general, the RGD workflow can be formalized as follows:
- Model Pool: $\mathcal{M} = \{M_1, \dots, M_K\}$, encoding candidate routing agents or AI models.
- Data Generator: $G$ is a generator (often an LLM or synthetic sampler), taking high-level descriptions $c$ (e.g., network specs, task types) and producing synthetic pairs or traffic instances.
- Generated Training Data: $D_{\mathrm{gen}} = \{(q_i, a_i)\}_{i=1}^{N} \sim G(c)$.
- Router/Policy: $R$, mapping a query or routing request $q$ to a decision (a model in $\mathcal{M}$, a subset of $\mathcal{M}$, or a policy output).
The canonical RGD objective is maximizing expected accuracy, efficiency, or utility on downstream real (often out-of-distribution) tasks using only generated data:
$$\max_{R} \; \mathbb{E}_{q \sim \mathcal{D}_{\mathrm{real}}}\!\left[\mathbb{1}\{R(q) = M^{*}(q)\}\right],$$
where $M^{*}(q)$ denotes the true best model or route for $q$ according to ground truth or aggregate performance (Niu et al., 14 Jan 2026).
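The workflow above can be sketched end to end. This is a minimal toy illustration, not the method of any cited paper: the generator, the model pool, and the lookup-table "router" are all hypothetical stand-ins for the abstract components $G$, $\mathcal{M}$, and $R$.

```python
import random

def generate_queries(n, seed=0):
    """Hypothetical synthetic generator G: emits (query, reference_answer) pairs."""
    rng = random.Random(seed)
    return [(f"q{i}", f"a{i}") for i in range(n)]  # placeholder pairs

def train_router(model_pool, data):
    """Score each model against the generated reference answers and record,
    per query, the best-performing model (here the router is a lookup table)."""
    router = {}
    for q, a in data:
        scores = [1.0 if m(q) == a else 0.0 for m in model_pool]
        router[q] = max(range(len(model_pool)), key=lambda i: scores[i])
    return router

# Toy pool: model 0 answers even-indexed queries correctly, model 1 the odd ones.
pool = [lambda q: "a" + q[1:] if int(q[1:]) % 2 == 0 else "?",
        lambda q: "a" + q[1:] if int(q[1:]) % 2 == 1 else "?"]
data = generate_queries(4)
router = train_router(pool, data)
print(router)  # even queries -> model 0, odd queries -> model 1
```

The key RGD property is visible even in this toy: the router is fit entirely on generator-produced labels, and its value is then judged on how well those preferences transfer to real queries.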
2. Synthetic Data Generation in Routing and Traffic Engineering
Synthetic data generation is foundational for RGD in systems and networking contexts. For example, in intradomain traffic engineering, traffic demands are generated as sequences of demand matrices with temporal and statistical regularities—often constructed as “cyclical” sequences or bimodal mixtures to induce realistic, variable congestion (e.g., “elephant flows”) (Hope et al., 2021). In global routing for circuit design, generators parameterize grid dimensions, number of nets, pin distributions, edge capacities, and congestion injection policies to yield a diverse instance space for reinforcement learning and benchmarking (Liao et al., 2019).
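A bimodal, cyclical demand-matrix generator of the kind described above can be sketched as follows. The mixture parameters (mouse/elephant means and variances, elephant probability, period) are illustrative assumptions, not values from the cited work.

```python
import numpy as np

def bimodal_demand_matrix(n_nodes, p_elephant=0.1, mouse=(1.0, 0.25),
                          elephant=(10.0, 2.0), rng=None):
    """Sample an n x n demand matrix as a bimodal mixture: most flows are
    small 'mice', a minority are large 'elephant' flows (assumed parameters)."""
    rng = rng or np.random.default_rng(0)
    is_elephant = rng.random((n_nodes, n_nodes)) < p_elephant
    mice = rng.normal(*mouse, (n_nodes, n_nodes))
    elephants = rng.normal(*elephant, (n_nodes, n_nodes))
    dm = np.where(is_elephant, elephants, mice).clip(min=0.0)
    np.fill_diagonal(dm, 0.0)  # no self-demand
    return dm

def cyclical_sequence(n_nodes, period=4, length=8, **kw):
    """A 'cyclical' sequence repeats a short list of matrices, giving the
    temporal regularity that a routing policy can learn to anticipate."""
    base = [bimodal_demand_matrix(n_nodes, rng=np.random.default_rng(t), **kw)
            for t in range(period)]
    return [base[t % period] for t in range(length)]

seq = cyclical_sequence(5)
assert np.allclose(seq[0], seq[4])  # period-4 repetition
```

The mixture induces occasional heavily loaded links ("elephant flows") while the cyclical repetition gives the demand sequence learnable temporal structure.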
Table 1 summarizes exemplary RGD generation regimes:
| Domain | Generation Modality | Data Instances |
|---|---|---|
| Traffic Eng | Bimodal/cyclical demand sequences, normal distributions | Sequences of demand matrices |
| Circuit | Grid/topology, pins, congestion parameters | Routable grid instances with edge capacities |
| LLM Routing | Synthetic queries and answers | Query–answer pairs with model response vectors |
The rationale for synthetic data is threefold: enabling learning on structurally diverse or rare problem types, providing scenarios not easily captured in logs, and supporting continual testing for generalization to unseen networks or query types.
3. RGD in Traffic Engineering and Networking
RGD enables advanced data-driven policies surpassing classical heuristics in dynamic, resource-constrained routing:
- Graph Neural Network-Based Data-Driven Routing (GDDR): Traffic demand matrices generated as described above are input to a DRL framework where node features summarize local traffic and GNNs parameterize an encode–process–decode architecture to output edge routing weights (Hope et al., 2021). Routing decisions are realized via softmin over edge weights post-DAG-pruning to ensure loop freedom, with Proximal Policy Optimization used for closed-loop learning.
- TEARD Algorithm: In bandwidth-guaranteed routing, TEARD exploits historical routing logs to generate ingress–egress request probabilities and link usage frequencies. These statistics, along with path-set criticality and the instantaneous network state, are fused into link weights for shortest-path calculation, integrating generative statistics with real-time adaptation (Thanh et al., 2014).
- Performance Gains: GNN-based DRL policies generalize seamlessly to unseen topologies, consistently matching or improving on MLP-based baselines in link utilization and generalization metrics, with lower retraining costs and improved scalability. The integration of generated usage frequencies and path criticalities in TEARD yields 2–4% higher acceptance ratios over state-of-the-art alternatives across several topologies, at substantially reduced per-demand runtime (Thanh et al., 2014, Hope et al., 2021).
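The softmin routing step mentioned for GDDR can be illustrated concretely: given the learned weights on a node's outgoing edges (after DAG pruning), traffic is split so that lower-weight edges carry larger shares. This is a generic softmin sketch; the sharpness parameter `gamma` is an assumption, not a value from the paper.

```python
import numpy as np

def softmin_split(edge_weights, gamma=2.0):
    """Split a node's outgoing traffic across candidate edges with a softmin:
    lower learned weight -> larger traffic share (gamma controls sharpness)."""
    w = np.asarray(edge_weights, dtype=float)
    logits = -gamma * w
    logits -= logits.max()            # numerical stability before exponentiation
    p = np.exp(logits)
    return p / p.sum()

shares = softmin_split([0.5, 1.0, 3.0])
assert abs(shares.sum() - 1.0) < 1e-9
assert shares[0] > shares[1] > shares[2]  # cheaper edge carries more traffic
```

Because the split is a smooth function of the weights, the policy gradient can flow through it during closed-loop DRL training, which is why softmin is preferred over a hard shortest-path choice here.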
4. RGD for LLM Skill Routing and Expert Selection
RGD extends to LLM and expert selection, where generated queries and answers (with or without ground truth annotations) facilitate model skill estimation, router training, and expert selection:
- Routing with Generated Data Framework: Routers are trained solely on generated (query, answer) pairs and the associated model response vectors, where all data are generator-produced. Routers may be query-answer (using both synthetic questions and answers) or query-only (using only synthetic questions and model outputs) (Niu et al., 14 Jan 2026).
- Generator Quality Criteria: Effective training requires generator self-consistency (the generator must answer its own queries accurately) and performance differentiation (the generator’s questions must produce significant variance among model pool responses). Empirically, generator choice and subsequent filtering (by consensus and question difficulty) are critical—only queries whose self-consistency and differentiation both exceed chosen thresholds are retained, securing performance close to that obtained with real labels (Niu et al., 14 Jan 2026).
- CASCAL Router: The consensus-based, query-only CASCAL algorithm uses model vote agreement weighted by normalized likelihoods (Z-scores) for correctness proxies, and employs k-means-based clustering on query embeddings to discover model skill niches, enabling robust per-query ensemble selection even with low-quality generators. CASCAL outperforms the best query-answer routers by 4.6% absolute accuracy on weakly-labeled Exaone data, with only 2–3pt degradation as generator quality drops (compared to 8–10pt for QA routers) (Niu et al., 14 Jan 2026).
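The two generator-quality filters can be sketched as below. The threshold values `tau` and `delta`, and the use of the standard deviation of per-model correctness as the differentiation proxy, are illustrative assumptions rather than the paper's exact criteria.

```python
import numpy as np

def filter_generated_queries(queries, self_scores, pool_correctness,
                             tau=0.8, delta=0.2):
    """Keep only generated queries passing both quality checks:
      - self-consistency: fraction of consistent generator re-answers >= tau;
      - differentiation: spread of per-model correctness >= delta."""
    kept = []
    for q, sc, pc in zip(queries, self_scores, pool_correctness):
        self_consistency = float(np.mean(sc))   # generator agrees with itself
        differentiation = float(np.std(pc))     # models disagree on this query
        if self_consistency >= tau and differentiation >= delta:
            kept.append(q)
    return kept

queries = ["q1", "q2", "q3"]
self_scores = [[1, 1, 1], [1, 0, 0], [1, 1, 1]]      # re-answer agreement per query
pool_correct = [[1, 0, 1], [1, 1, 1], [1, 1, 1]]     # per-model correctness per query
print(filter_generated_queries(queries, self_scores, pool_correct))
```

In this toy run, `q2` fails self-consistency (the generator contradicts itself) and `q3` fails differentiation (every model answers it correctly, so it reveals nothing about skill niches), leaving only `q1` for router training.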
5. Fusion and Routing with Large Model Ecosystems
Routing with generated data underpins systematic fusion methodologies for heterogeneous LLM ecosystems, leveraging logs from diverse model routings to improve aggregate performance:
- FusionBench and FusionFactory: Large routing logs (over 100M tokens of queries, responses, and judge scores across 20 models and 14 benchmarks) provide rich “routing data”—synthesized during practical query routing by backend selectors (Feng et al., 14 Jul 2025). Query-level, thought-level (retrieval-augmented templating), and model-level (distillation) fusion protocols iteratively use performance-labeled and judge-scored logs to optimize model routing, template libraries, and student model fine-tuning.
- Performance Differentials: All routing/fusion levels beat the best single LLM across benchmarks, with thought-level fusion (retrieval-augmented prompt completion using top model “thought templates”) yielding the strongest relative improvement (+4–12%). Query-level GraphRouter methods offer the best cost–performance tradeoff, and all systems benefit from multi-pattern logs capturing diverse reasoning routes and emergence of latent model strengths (Feng et al., 14 Jul 2025).
- Best Practices: Multi-pattern log collection, calibration by LLM judges, cost-aware router training, and careful template/distillation selection are essential for RGD effectiveness in LLM routing contexts.
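A minimal sketch of cost-aware query-level routing over logged judge scores, in the spirit of the cost–performance tradeoff noted above: pick the model maximizing predicted quality minus a cost penalty. The utility form and the weight `lam` are assumptions for illustration, not the GraphRouter objective.

```python
def route_cost_aware(query_scores, costs, lam=0.1):
    """Query-level routing from logged judge scores: choose the model that
    maximizes predicted score minus lam * per-call cost (lam is assumed)."""
    utilities = {m: s - lam * costs[m] for m, s in query_scores.items()}
    return max(utilities, key=utilities.get)

# Illustrative per-query judge scores and per-call costs for three models.
scores = {"small": 0.70, "medium": 0.82, "large": 0.88}
costs = {"small": 0.2, "medium": 1.0, "large": 4.0}
print(route_cost_aware(scores, costs, lam=0.05))
```

Varying `lam` sweeps the router along the cost–performance frontier: at `lam=0` it always picks the strongest model, while large `lam` pushes traffic to cheaper models.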
6. RGD in Deep Reinforcement Learning for Routing and Path Planning
Synthetic instance generation underpins DRL-based routing in hard combinatorial domains:
- DRL for Circuit Global Routing: Parameterized instance generators produce grid networks, capacity constraints, reduced edges, and pin placements (Liao et al., 2019). The DRL agent, trained with these synthetic problems, simultaneously optimizes future congestion and wirelength, outperforming A* on sparse and congested benchmarks.
- Policy Robustness and Generalization: Training on diverse synthetic datasets facilitates policy transfer to larger or novel topologies; burn-in memory experiences from A* or randomly generated instances accelerate agent convergence and enable superior solutions under resource constraints (Liao et al., 2019).
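A parameterized instance generator of the kind used for DRL global routing can be sketched as follows. This is a simplified toy (uniform capacities, two-pin nets only); the cited generators also vary pin counts, reduced edges, and congestion injection.

```python
import random

def generate_grid_instance(width, height, n_nets, capacity, seed=0):
    """Hypothetical parameterized generator: a routing grid with uniform
    edge capacities and randomly placed two-pin nets."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(width) for y in range(height)]
    nets = []
    for _ in range(n_nets):
        src, dst = rng.sample(cells, 2)   # two distinct pin locations
        nets.append((src, dst))
    edges = {}                            # grid edges with routing capacity
    for x in range(width):
        for y in range(height):
            if x + 1 < width:
                edges[((x, y), (x + 1, y))] = capacity
            if y + 1 < height:
                edges[((x, y), (x, y + 1))] = capacity
    return {"nets": nets, "edges": edges}

inst = generate_grid_instance(4, 4, n_nets=5, capacity=3)
assert len(inst["nets"]) == 5
assert all(c == 3 for c in inst["edges"].values())
```

Sweeping the parameters (grid size, net count, capacity, seed) yields the diverse instance distribution that lets a trained agent transfer to larger or previously unseen problems.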
7. Limitations, Challenges, and Future Directions
RGD approaches are subject to several challenges:
- Generator Dependence: Router and policy performance are critically dependent on generator quality—poorly calibrated or non-differentiating queries limit the discoverability of model or policy skill niches. Filtering on self-consistency and differentiation is necessary for robust downstream results (Niu et al., 14 Jan 2026).
- Online Adaptation and Scalability: Techniques such as incremental statistics for TEARD, scalable summarization for LLM logs, and continual topology adaptation for GNN-based policies mitigate the cost of frequent topology or demand changes, but real-time adaptation in high-churn environments remains nontrivial (Thanh et al., 2014, Hope et al., 2021).
- Extensibility: Extending RGD frameworks to multi-constraint QoS, segment routing, multi-task model fusion, and real-time domain transfers presents ongoing research questions. In systems applications, the availability and reproducibility of generation and evaluation code bases accelerate iterative improvement and cross-domain applicability (Hope et al., 2021, Feng et al., 14 Jul 2025).
A plausible implication is that as RGD methods mature, the integration of high-quality, adaptive generators and robust query-only route selection protocols (such as CASCAL) will underpin state-of-the-art performance in both network routing and LLM system orchestration, especially in low-annotation or novel deployment conditions.