Hierarchical and Grouped Routing

Updated 5 December 2025

Hierarchical and grouped routing is a strategy that organizes network nodes into multi-level and group-based clusters to improve scalability and efficiency.
It enables adaptive routing decisions by combining local and global context through specialized leader nodes or expert models for resource allocation.
Applications span wireless sensor networks, vehicular logistics, modular neural architectures, and physical routing, achieving reduced latency and enhanced performance.

Hierarchical and grouped routing refers to a class of routing strategies and algorithms that leverage multi-level (hierarchical) or intrinsic group (cluster or expert) structure to decompose, organize, or optimize routing decisions in large-scale, dynamic, or heterogeneous networks and systems. Hierarchical decompositions provide scalability, modularity, and opportunities to aggregate or specialize routing functionality, while grouping allows for resource-efficient, parallel, and context-aware routing. These approaches are pervasive in domains including packet-switched networks, wireless sensor networks, vehicular and ride-sharing logistics, photonic circuit layout, distributed reinforcement learning, and modular neural architectures such as Mixture-of-Experts models.

1. Fundamental Principles and Formal Models

Hierarchical routing divides a network into recursively organized levels, clusters, or groups, with routing decisions delegated across these layers. In a standard communication network, nodes are partitioned into clusters at multiple levels (from local to global), typically with each cluster containing a designated leader or coordinator. A hierarchical route request is propagated through this nested decomposition: lower-level leaders handle local or short-range decisions, while higher-level leaders are responsible for inter-cluster routing or aggregation. The network state is represented as tuples or vectors reflecting both the node's local group and its ancestors. Each leader operates as a stateless routing agent, specializing in local link decisions, with the global path composed from individual, per-level choices (Ali et al., 2019).
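To make the per-level composition concrete, the following is a minimal two-level sketch, not the method of the cited work. It assumes a top-level leader that sees only the graph of clusters, per-cluster leaders that see only their own intra-cluster links, and a hypothetical `gateways` table naming the exit and entry node for each pair of adjacent clusters. Plain breadth-first search stands in for whatever decision rule each leader actually uses.

```python
from collections import deque

def bfs_path(adj, src, dst):
    """Fewest-hop path in an adjacency dict, or None if dst is unreachable."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj.get(u, ()):
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return None

def hierarchical_route(cluster_of, local_adj, cluster_adj, gateways, src, dst):
    """Compose an end-to-end path from per-level choices.

    cluster_of:  node -> cluster id
    local_adj:   cluster id -> adjacency dict of intra-cluster links
    cluster_adj: adjacency dict over cluster ids (top-level leader's view)
    gateways:    (cluster_a, cluster_b) -> (exit node in a, entry node in b)
    """
    # Top-level leader: choose the sequence of clusters to traverse.
    cluster_path = bfs_path(cluster_adj, cluster_of[src], cluster_of[dst])
    if cluster_path is None:
        return None
    route, current = [src], src
    # Each cluster leader: route locally from the entry point to its exit gateway.
    for here, nxt in zip(cluster_path, cluster_path[1:]):
        exit_node, entry_node = gateways[(here, nxt)]
        segment = bfs_path(local_adj[here], current, exit_node)
        if segment is None:
            return None
        route.extend(segment[1:])
        route.append(entry_node)
        current = entry_node
    # Final cluster: route from the entry point to the destination.
    last = bfs_path(local_adj[cluster_path[-1]], current, dst)
    if last is None:
        return None
    route.extend(last[1:])
    return route

# Illustrative topology: cluster A (a1-a2-a3) joined to cluster B (b1-b2) via a3-b1.
cluster_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
local_adj = {
    "A": {"a1": ["a2"], "a2": ["a1", "a3"], "a3": ["a2"]},
    "B": {"b1": ["b2"], "b2": ["b1"]},
}
cluster_adj = {"A": ["B"], "B": ["A"]}
gateways = {("A", "B"): ("a3", "b1"), ("B", "A"): ("b1", "a3")}
print(hierarchical_route(cluster_of, local_adj, cluster_adj, gateways, "a1", "b2"))
```

No single party in this sketch holds the full link set: the top-level search touches only cluster ids, and each local search touches only one cluster's links.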

Grouped routing, while overlapping with hierarchical principles, emphasizes partitioning nodes, requests, or computational units into sets (groups, clusters, expert pools) wherein routing or allocation is optimized within or across these sets. In modern neural Mixture-of-Experts systems, grouped routing implies restricting expert selection to a task-specific or context-derived subset, with further token-level, context-aware refinement among candidates (Liang et al., 20 May 2025, Huang et al., 26 Jul 2024).

General mathematical formulation: Let $G = (N, E)$ be a directed or undirected graph. The set of nodes $N$ is divided into cluster/group sets $\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_H$, each possibly with a leader $l_{\mathcal{C}_h}$ at hierarchy level $h$. Hierarchical routing involves recursive route-discovery or decision-making across increasing $h$, while grouped routing utilizes selection, allocation, or aggregation within a group or context-dependent subset of $N$.
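One way to make the recursion over $h$ explicit is sketched below. The notation is introduced here for illustration and is not taken from the cited works: $c_h(v)$ denotes the level-$h$ cluster containing node $v$, the clusters are assumed to be nested, and the top level is assumed to be a single cluster so that the minimum exists.

$$
a(v) = \bigl(c_H(v),\, c_{H-1}(v),\, \ldots,\, c_1(v),\, v\bigr),
\qquad
c_h(u) = c_h(v) \;\Rightarrow\; c_{h+1}(u) = c_{h+1}(v),
$$

$$
h^{*}(s, t) = \min\bigl\{\, h \in \{1, \ldots, H\} : c_h(s) = c_h(t) \,\bigr\}.
$$

Under this reading, $a(v)$ is the hierarchical address of $v$ (its local group followed by its ancestors), and a request from $s$ to $t$ never needs to be escalated above the leader of the level-$h^{*}(s,t)$ cluster that contains both endpoints.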

2. Network Routing: Cluster and Hierarchy Decomposition

Hierarchical cluster-based routing has been a central paradigm in communication and sensor networks. Nodes are physically or logically grouped into clusters at multiple hierarchy levels. Data and control traffic are aggregated at intermediate cluster heads, which reduces the burden on individual nodes, balances energy consumption, and allows for scalable path computation. Clustering strategies such as dynamic, residual-energy-based rotation of the cluster head role mitigate premature depletion and bottlenecks, as in protocols for wireless sensor networks and MANETs (Diop et al., 2013, Latif et al., 2012, Sensarma et al., 2013).
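Residual-energy-based rotation can be illustrated with a small simulation. This is a generic sketch rather than any of the cited protocols: the per-round energy costs (`head_cost`, `member_cost`) and the tie-breaking rule (lowest node id) are assumptions chosen for the example.

```python
def elect_cluster_heads(clusters, energy):
    """Per cluster, pick the member with the most residual energy.

    Ties go to the lowest node id, because max() keeps the first maximum
    it sees in the sorted member list.
    """
    return {
        cid: max(sorted(members), key=lambda n: energy[n])
        for cid, members in clusters.items()
    }

def simulate_rotation(clusters, energy, rounds, head_cost=3, member_cost=1):
    """Re-elect heads every round; heads spend more energy than members.

    Returns the list of head assignments, one dict per round.
    The cost values are hypothetical and only serve to trigger rotation.
    """
    energy = dict(energy)  # work on a copy
    history = []
    for _ in range(rounds):
        heads = elect_cluster_heads(clusters, energy)
        history.append(heads)
        for cid, members in clusters.items():
            for node in members:
                energy[node] -= head_cost if node == heads[cid] else member_cost
    return history

clusters = {"c0": ["n1", "n2", "n3"]}
energy = {"n1": 10, "n2": 9, "n3": 8}
for rnd, heads in enumerate(simulate_rotation(clusters, energy, rounds=4), start=1):
    print(rnd, heads)
```

Because the head pays the larger cost, the role moves away from a node as soon as a neighbor has more energy left, which spreads depletion across the cluster instead of exhausting one node.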

In SDN and complex network environments, hierarchical routing provides scalable, end-to-end path computation that outperforms both fully centralized and fully distributed approaches. For instance, in Hierarchical Deep Double Q-Routing, nodes are grouped into clusters at $H$ levels; source nodes and leaders cooperate recursively to compute a path, and each leader applies a deep Double Q-network (DDQN) to select local next-hops based on both global (end-to-end QoS, deferred reward) and local (link utilization, delay, queueing, fairness) metrics. The source eventually assembles the end-to-end path recursively from per-level choices (Ali et al., 2019).
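The learning rule at each leader can be sketched in tabular form. The code below uses tabular Double Q-learning with two value tables as a simplified stand-in for the deep Double Q-network described above; the weighted-sum `composite_reward`, its weights, and the state and action encodings are assumptions made for illustration, not the reward design of the cited paper.

```python
import random

def composite_reward(local_cost, global_cost, w_local=0.5, w_global=0.5):
    """Fuse a local signal (e.g. link delay) with a deferred end-to-end signal.

    The linear form and the weights are hypothetical.
    """
    return -(w_local * local_cost + w_global * global_cost)

def double_q_update(q_a, q_b, state, action, reward, next_state, next_actions,
                    alpha=0.1, gamma=0.9, rng=random):
    """One tabular Double Q-learning step.

    One table selects the greedy next action and the other evaluates it,
    which reduces the over-estimation bias of a single max over noisy values.
    """
    # Randomly choose which table learns on this step.
    if rng.random() < 0.5:
        learn, evaluate = q_a, q_b
    else:
        learn, evaluate = q_b, q_a
    if next_actions:
        best = max(next_actions, key=lambda a: learn.get((next_state, a), 0.0))
        target = reward + gamma * evaluate.get((next_state, best), 0.0)
    else:
        target = reward  # terminal transition: no bootstrap term
    old = learn.get((state, action), 0.0)
    learn[(state, action)] = old + alpha * (target - old)

def greedy_next_hop(q_a, q_b, state, actions):
    """Act on the sum of both tables, as is common for Double Q-learning."""
    return max(actions, key=lambda a: q_a.get((state, a), 0.0) + q_b.get((state, a), 0.0))

# A leader at node "u" repeatedly observes that hop "v" is cheaper than hop "w".
q_a, q_b = {}, {}
rng = random.Random(0)
for _ in range(200):
    double_q_update(q_a, q_b, "u", "v", composite_reward(1.0, 2.0), "dst", [], rng=rng)
    double_q_update(q_a, q_b, "u", "w", composite_reward(4.0, 6.0), "dst", [], rng=rng)
print(greedy_next_hop(q_a, q_b, "u", ["v", "w"]))
```

In a deep variant the two tables would be replaced by an online network and a target network; the select-with-one, evaluate-with-the-other structure is what the sketch is meant to show.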

Simulation results demonstrate that such approaches yield faster convergence, improved compliance with utilization thresholds, and enhanced scalability (per-leader memory/CPU is $O(|E_{\mathrm{local}}|)$ instead of $O(|E_{\mathrm{total}}|)$). The composite hierarchical reward design, fusing global and local signals, is crucial for performance.

3. Grouped and Hierarchical Routing in Modular and Expert-based Architectures

Contemporary neural architectures have adapted hierarchical and grouped routing for computational efficiency and specialization in Mixture-of-Experts (MoE), LoRA-adapter ensembles, and Transformer decompositions.

In THOR-MoE for neural machine translation, routing is separated into two stages: first, a hierarchical, task-level expert pre-selection narrows the candidate set using a soft domain/language classifier; second, a context-aware, token-level router refines this selection by combining token representations with global context and applying a Top-$k$ or Top-$p$ gate within the task-restricted expert set. This structure ensures activation efficiency (as few as 22% of experts per token) and an improved BLEU score, with additional auxiliary losses to ensure both load balance and routing diversity (Liang et al., 20 May 2025). Similarly, DLG-MoE for code-switching ASR builds language groups with explicit LID-based dispatch, then applies an unsupervised intra-group router per frame over accent, domain, or other learned factors, yielding improved recognition error rates and compositional flexibility (Huang et al., 26 Jul 2024).
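The two-stage pattern can be sketched in a few lines of plain Python. This is a simplified illustration, not the THOR-MoE implementation: it keeps the top `n_tasks_kept` task groups outright instead of mixing them softly, it takes the task and token scores as given numbers where a real system would compute them from learned projections, and all function and parameter names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def two_stage_route(task_logits, expert_groups, token_logits, n_tasks_kept=1, k=2):
    """Stage 1 restricts the expert pool by task; stage 2 applies a Top-k gate.

    task_logits:   one score per task/domain (sequence level)
    expert_groups: task index -> list of expert ids belonging to that task
    token_logits:  per-expert router scores for the current token
    Returns {expert id: gate weight}, with weights summing to 1.
    """
    # Stage 1: task-level pre-selection of candidate experts.
    task_probs = softmax(task_logits)
    kept = sorted(range(len(task_probs)), key=lambda t: -task_probs[t])[:n_tasks_kept]
    candidates = sorted({e for t in kept for e in expert_groups[t]})
    # Stage 2: token-level Top-k restricted to the candidate set.
    top = sorted(candidates, key=lambda e: -token_logits[e])[:k]
    weights = softmax([token_logits[e] for e in top])
    return dict(zip(top, weights))

expert_groups = {0: [0, 1, 2], 1: [3, 4, 5]}
token_logits = [0.3, 1.2, -0.5, 3.0, 0.0, 0.1]
gates = two_stage_route([2.0, 0.1], expert_groups, token_logits)
print(gates)
```

In this example expert 3 has the highest token-level score but is never considered, because the task-level stage has already excluded its group; only two of six experts are activated for the token.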

In HiLoRA, LoRA modules are grouped and routed over two stages: a sequence-level, Gaussian-likelihood-based LoRA subset and ROC (rank-one component) allocation, followed by token-level ROC selection based on relevance to the current input. Theoretical guarantees bound the probability of failing to select the relevant adaptation module, and experiments confirm strong domain generalization and inference efficiency (Han et al., 14 Oct 2025).
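As a rough illustration of the sequence-level stage only, the sketch below scores each adapter by the likelihood of an input embedding under a Gaussian attached to that adapter and keeps the best `m`. The isotropic covariance, the `(mean, variance)` representation, and the adapter names are all assumptions made for the example; the token-level ROC selection is not covered, and the cited work should be consulted for the actual procedure.

```python
import math

def gaussian_log_likelihood(x, mean, var):
    """Log-density of x under an isotropic Gaussian N(mean, var * I)."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)

def select_adapters(x, adapters, m=2):
    """Return the m adapter names whose Gaussians best explain the input.

    adapters: name -> (mean vector, scalar variance), a hypothetical summary
    of the data each adapter was trained on.
    """
    ranked = sorted(
        adapters,
        key=lambda name: -gaussian_log_likelihood(x, *adapters[name]),
    )
    return ranked[:m]

adapters = {
    "math": ([1.0, 0.1], 1.0),
    "code": ([0.0, 1.0], 1.0),
    "law": ([-3.0, 2.0], 1.0),
}
print(select_adapters([1.0, 0.0], adapters, m=2))
```

Only the adapters that survive this stage would need to be considered by a finer, token-level selector, which is where the inference savings of a two-stage scheme come from.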

The Union-of-Experts Transformer decomposes every block into equitant tensor-parallel experts and implements hierarchical (patch-wise) and grouped (sample-wise) routing paradigms, applying top-$k$ routing to patches or samples per block and yielding significant gains in both accuracy and hardware efficiency at scale (Yang et al., 4 Mar 2025).

4. Hierarchical/Grouped Routing in Combinatorial and Physical Routing Problems

Hierarchical and grouped routing underpins efficient solutions to complex combinatorial and design problems such as vehicle routing, dial-a-ride, and physical circuit layout.

For multi-depot vehicle routing and location-routing, hierarchical decomposition is enacted by first assigning customers to depots (high level), then solving the corresponding capacitated VRP for each depot (low level). Modern methods use surrogate models, such as a neural cost predictor for the inner VRP subproblems, to evaluate candidate high-level groupings efficiently, yielding near-optimal solutions with significant computational speedup (Sobhanan et al., 2023).
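The two-level structure can be shown with deliberately simple heuristics at both levels. In this sketch the high level assigns each customer to its nearest depot and the low level builds capacity-respecting routes with a nearest-neighbor rule; both are stand-ins chosen for brevity, not the surrogate-guided method of the cited work, and the coordinates, demands, and capacity are made up.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def assign_to_depots(depots, customers):
    """High level: group customers by nearest depot."""
    groups = {d: [] for d in depots}
    for c, pos in customers.items():
        nearest = min(depots, key=lambda d: dist(depots[d], pos))
        groups[nearest].append(c)
    return groups

def greedy_routes(depot_pos, members, customers, demand, capacity):
    """Low level: nearest-neighbor routes for one depot under a capacity limit."""
    unvisited = set(members)
    routes = []
    while unvisited:
        load, pos, route = 0, depot_pos, []
        while True:
            feasible = [c for c in unvisited if load + demand[c] <= capacity]
            if not feasible:
                break
            nxt = min(sorted(feasible), key=lambda c: dist(pos, customers[c]))
            route.append(nxt)
            load += demand[nxt]
            pos = customers[nxt]
            unvisited.remove(nxt)
        if not route:
            raise ValueError("a customer's demand exceeds the vehicle capacity")
        routes.append(route)
    return routes

def solve(depots, customers, demand, capacity):
    """Compose the two levels: one independent subproblem per depot."""
    groups = assign_to_depots(depots, customers)
    return {
        d: greedy_routes(depots[d], groups[d], customers, demand, capacity)
        for d in depots
    }

depots = {"D1": (0, 0), "D2": (10, 0)}
customers = {"a": (1, 0), "b": (2, 0), "c": (9, 0), "d": (8, 1)}
demand = {"a": 1, "b": 1, "c": 1, "d": 1}
print(solve(depots, customers, demand, capacity=2))
```

Once the assignment is fixed, the per-depot subproblems are independent and could be solved in parallel. A surrogate cost predictor would replace the inner call when many candidate assignments have to be compared.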

In ride-sharing (Multi-Vehicle DaRP), hierarchical grouping is used to partition requests into compact, capacity-respecting blocks with low minimum spanning tree cost, then to assign groups to vehicles; this achieves $O(\sqrt\lambda \log n)$-approximate solutions, which are substantially better than greedy baselines in both total distance and ride latency (Luo et al., 2022).

LiDAR 2.0 adapts hierarchical routing for photonic integrated circuits, first routing internal waveguides inside repeated modules (bottom-up), then composing at higher levels, and combining this with explicit port-grouping and dynamic crossing-conflict strategies. Hierarchical routing yields a 7× speedup and 9–16% reduction in insertion loss over monolithic approaches (Zhou et al., 22 May 2025).

5. Hierarchical and Grouped Routing in Dynamic and Complex Networks

In dynamic complex networks, such as Internet-level communication graphs, hierarchical and grouped routing can be realized by installing controllers or RL agents at a select set of topologically central nodes (by betweenness centrality or similar metrics). These nodes manage local or regional bypasses, precomputed to span a spectrum of network “marginality.” Each RL agent optimizes a local bypass parameter $\beta$, affecting the preference for core/periphery traversal, and cooperatively balances traffic to avoid congestion. Empirically, with only 1% of nodes privileged as RL agents, system transport capacity can increase by an order of magnitude compared to purely shortest-path routing (Hu et al., 2022).

Analysis of real-world paths in disparate networks shows that operational routing frequently, but not exclusively, prefers hierarchy-conformant paths (i.e., those whose node centrality rises monotonically and then falls) and follows a "prefer downstream" rule (avoiding upstream, core-seeking detours). Synthetic routing policies capturing these preferences reproduce observed path stretch and load concentration much more accurately than pure shortest-path routing, which points to the ubiquity and necessity of hierarchy-conformant policies in real systems (Csoma et al., 2017).
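The rise-then-fall condition is easy to state as a check on a single path. The sketch below assumes a precomputed centrality score per node (any measure could be plugged in) and treats equal consecutive values as permitted at any point, which is an interpretation chosen here, not a detail taken from the cited study.

```python
def is_hierarchy_conformant(path, centrality):
    """True if centrality along the path never rises again after it has fallen.

    path:       ordered list of node ids
    centrality: node id -> numeric centrality score (assumed precomputed)
    """
    values = [centrality[n] for n in path]
    descending = False
    for prev, cur in zip(values, values[1:]):
        if cur < prev:
            descending = True          # the path has started heading downstream
        elif cur > prev and descending:
            return False               # a "valley": down, then up again
    return True

centrality = {"a": 1, "b": 3, "c": 5, "d": 2}
print(is_hierarchy_conformant(["a", "b", "c", "d"], centrality))  # up, up, down
print(is_hierarchy_conformant(["c", "a", "b"], centrality))       # down, then up
```

A routing policy in this spirit would prefer, among candidate paths of similar length, those for which the check succeeds.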

6. Algorithmic Patterns and Empirical Performance

A unifying structure in hierarchical/grouped routing across domains consists of:

Hierarchical Decomposition: Partitioning entities (nodes, requests, weights) recursively or by context; bottom-up or top-down routing/refinement across levels.
Group-wise Resource Allocation: Specialized allocation or routing within groups, often with adaptive or learned expert selection at finer granularity.
Task or Context Response: Leveraging contextual predictions (task/domain/label) to restrict or prioritize routing subsets for efficiency and generalization.
Auxiliary Objectives: Load balancing, diversity, and entropy control at both hierarchical and group scales to mitigate routing collapse and parameter under-utilization.
Deferred or Composite Reward/Fitness: In RL, leveraging both local “micro” and end-to-end “macro” signals stabilizes learning and aligns agent incentives across scales.
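The auxiliary-objective pattern in the list above can be illustrated with one widely used load-balancing term: the number of experts times the sum, over experts, of the fraction of tokens dispatched to each expert multiplied by its mean router probability. This particular form is an assumption chosen for the example; the works cited in this article may define their balance and diversity losses differently.

```python
def load_balance_loss(router_probs):
    """Auxiliary load-balancing term for top-1 routing.

    router_probs: list of per-token probability lists over experts.
    The value is about 1.0 when tokens and probability mass are spread
    evenly, and grows as routing collapses onto a few experts.
    """
    n_tokens = len(router_probs)
    n_experts = len(router_probs[0])
    # f_i: fraction of tokens whose top-1 choice is expert i.
    counts = [0] * n_experts
    for probs in router_probs:
        counts[max(range(n_experts), key=lambda i: probs[i])] += 1
    frac = [c / n_tokens for c in counts]
    # P_i: mean router probability assigned to expert i.
    mean_prob = [sum(p[i] for p in router_probs) / n_tokens for i in range(n_experts)]
    return n_experts * sum(f * m for f, m in zip(frac, mean_prob))

balanced = load_balance_loss([[0.6, 0.4], [0.4, 0.6]])
collapsed = load_balance_loss([[0.9, 0.1], [0.9, 0.1]])
print(balanced, collapsed)
```

Adding a small multiple of such a term to the task loss penalizes the collapsed configuration relative to the balanced one, which is the mechanism by which routing collapse and expert under-utilization are discouraged.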

Results reported in the cited works show gains in end-to-end performance, utilization efficiency, and computational cost. For instance, LiDAR 2.0 achieves up to 7.69× faster routing and 9–16% lower insertion loss; THOR-MoE yields up to +1 BLEU with activation of only 22% of parameters; UoE Transformers halve FLOPs while improving accuracy by 0.68% or more compared to all baselines in the Long Range Arena benchmark (Zhou et al., 22 May 2025, Liang et al., 20 May 2025, Yang et al., 4 Mar 2025).

7. Implications, Extensions, and Best Practices

Hierarchical and grouped routing frameworks enable significant scalability, resource utilization efficiency, and improved quality-of-service in large and heterogeneous networks, architectures, and physical systems. Best practices include:

Designing clustering/grouping mechanisms that exploit domain structure (energy, centrality, label, context).
Employing adaptive or learned routing at both hierarchy and group levels, and fusing local and global objectives.
Including auxiliary diversity and load-balancing losses.
Leveraging surrogate models for hierarchically nested optimization problems to handle large scales.
Ensuring robust fallback and re-routing strategies under failures or mobility, especially in wireless and physical networks.

In summary, hierarchical and grouped routing systematically exploit multi-level or group structure to tame complexity, balance load, and enable context- or domain-specialized operation in systems spanning sensors, transportation, neural architectures, and complex networks. Contemporary research unifies these approaches through algorithmic decompositions, gated/learned resource allocation, and context-driven refinement, providing both theoretical and practical advances across domains (Ali et al., 2019, Liang et al., 20 May 2025, Zhou et al., 22 May 2025, Han et al., 14 Oct 2025, Huang et al., 26 Jul 2024, Csoma et al., 2017).