Lyapunov-Based Token Routing Framework

Updated 14 December 2025

Lyapunov-based token routing is a framework that leverages mathematical Lyapunov functions to ensure stability in dynamic, stochastic networks.
It employs quadratic, piecewise, and composite Lyapunov functions to model queue dynamics and optimize routing decisions across various network architectures.
Practical applications include enhanced throughput and efficient resource allocation in parallel, wireless, and distributed training environments.

A Lyapunov-based token routing framework provides a rigorous mathematical approach for designing, analyzing, and optimizing dynamic routing policies in stochastic networks, leveraging Lyapunov functions to guarantee system stability—even under adaptive or learning-based control. This methodology is foundational in modern queueing systems, parallel server routing, multi-hop wireless networks, and distributed model training, anchoring the practical implementation of stable, high-throughput, resource-aware routing protocols.

1. Conceptual Foundations

At its core, the Lyapunov approach introduces a scalar function of the system’s state vector (typically quadratic, exponential, or piecewise-defined) that captures the aggregate “congestion” or “energy” of the network. Stability is established by showing that, under the routing policy of interest, the expected one-step (or infinitesimal) drift of this Lyapunov function is negative outside a compact region. This negative drift ensures that queue lengths (or analogous backlogs) do not diverge and, under standard irreducibility conditions, that the network is positive Harris recurrent with bounded mean traffic or token state.

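The Foster–Lyapunov theorem (Meyn–Tweedie) serves as the cornerstone: if a nonnegative function $V(x)$ satisfies $\mathbb{E}[V(x_{k+1})-V(x_k)\mid x_k=x]\leq -\alpha\|x\|+B$ for all states $x$, with constants $\alpha>0$ and $B<\infty$, then the drift is negative whenever $\|x\|>B/\alpha$, and, provided $\mathbb{E}[V(x_0)]<\infty$, the process is stable in the sense that the time-averaged expected backlog is bounded: $\limsup_{K\to\infty}\frac{1}{K}\sum_{k=0}^{K-1}\mathbb{E}\|x_k\|\le B/\alpha$ (Wu et al., 27 Aug 2024, Wu et al., 19 Mar 2025, 0908.1273, Shi et al., 7 Dec 2025).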
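The step from a drift inequality of this form to a backlog bound is a telescoping argument. The sketch below assumes the inequality holds for every state and that $\mathbb{E}[V(x_0)]$ is finite; both are assumptions made here for illustration rather than conditions quoted from the cited papers.

```latex
\begin{align*}
\mathbb{E}[V(x_{k+1})] - \mathbb{E}[V(x_k)]
  &\le -\alpha\,\mathbb{E}\|x_k\| + B
  && \text{(take expectations of the drift bound)} \\
\mathbb{E}[V(x_K)] - \mathbb{E}[V(x_0)]
  &\le -\alpha \sum_{k=0}^{K-1} \mathbb{E}\|x_k\| + KB
  && \text{(sum over } k = 0,\dots,K-1\text{)} \\
\frac{1}{K}\sum_{k=0}^{K-1} \mathbb{E}\|x_k\|
  &\le \frac{B}{\alpha} + \frac{\mathbb{E}[V(x_0)]}{\alpha K}
  && \text{(use } V \ge 0 \text{ and divide by } \alpha K\text{)}
\end{align*}
```

Letting $K \to \infty$ removes the initial-condition term, so the long-run average backlog is at most $B/\alpha$.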

2. Model Classes and State Dynamics

The Lyapunov-based framework has been developed for multiple network classes:

Parallel and multi-hop queueing networks: State is represented as the vector of queue backlogs at nodes/servers. Routing actions may be immediate (parallel servers) or staged (multi-hop) (Wu et al., 19 Mar 2025, 0908.1273).
Flow-controlled single-origin-single-destination (SOSD) networks: Jobs are assigned to paths at arrival, and the key routing decision is path selection based on backlogs along candidate paths (Wu et al., 27 Aug 2024).
Distributed mixture-of-experts (MoE) training: Tokens (data batches) must be routed to resource-heterogeneous edge servers for computational processing, with additional constraints on energy and consistency (Shi et al., 7 Dec 2025).

In these frameworks, the system evolves as a Markov process, where transition dynamics depend both on exogenous arrivals (Poisson or batch) and endogenous routing decisions, possibly coupled to resource allocation variables (e.g., processor frequency, energy budget).
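To make the state dynamics concrete, the following is a minimal sketch of a discrete-time parallel-server model. The slot structure, the Bernoulli arrival (a discrete-time stand-in for Poisson arrivals), the service probabilities, and the function names `step`, `join_shortest_queue`, and `simulate` are all assumptions chosen for illustration, not the models used in the cited papers.

```python
import random

def step(queues, route, arrival_prob, service_probs, rng):
    """Advance a toy parallel-queue Markov chain by one time slot."""
    q = list(queues)
    # Exogenous arrival: at most one token per slot.
    if rng.random() < arrival_prob:
        q[route(q)] += 1  # endogenous routing decision
    # Each nonempty server completes one token with its own probability.
    for n, mu in enumerate(service_probs):
        if q[n] > 0 and rng.random() < mu:
            q[n] -= 1
    return q

def join_shortest_queue(q):
    """Route to the server with the smallest backlog (ties go to the lowest index)."""
    return min(range(len(q)), key=lambda n: q[n])

def simulate(steps=20000, seed=0):
    """Return the time-averaged total backlog over a sample path."""
    rng = random.Random(seed)
    q = [0, 0, 0]
    total = 0
    for _ in range(steps):
        q = step(q, join_shortest_queue, 0.8, [0.4, 0.3, 0.3], rng)
        total += sum(q)
    return total / steps

if __name__ == "__main__":
    print("average total backlog:", simulate())
```

The next state depends only on the current backlog vector, the routing rule, and fresh randomness, which is what makes the process Markov. Swapping `join_shortest_queue` for another rule changes the endogenous part of the dynamics without touching the arrival or service model.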

3. Lyapunov Function Construction and Drift Analysis

The choice of Lyapunov function is informed by network structure and desired performance tradeoffs:

Quadratic Lyapunov (e.g., $V(x) = \sum_n x_n^2$) in parallel systems provides algebraic tractability and directly penalizes large queues, yielding strong negative drift under balanced routing (Wu et al., 19 Mar 2025).
Piecewise-linear or piecewise-quadratic Lyapunov functions encode bottleneck-aware path costs or rank-based priority (referred to here as Lyapunov-driven bottleneck metrics). For example, in SOSD networks, path costs $Q_p(x)$ are defined as maxima over linear segmentations of queue summaries, tailored to capture per-path congestion (Wu et al., 27 Aug 2024, 0908.1273).
Composite Lyapunov functions for systems with multiple resource constraints (e.g., queue and energy) take the form $L(t) = (1/2)\sum_j [Q_j(t)^2 + Z_j(t)^2]$ (Shi et al., 7 Dec 2025).
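The three families above can be written down in a few lines. This is a minimal sketch: the quadratic and composite forms follow the formulas in the text, while `bottleneck_path_cost` is a simplified stand-in (a weighted maximum over the queues on a path) for the piecewise-linear path costs, whose exact definition is not reproduced here. The function names and the `weights` argument are hypothetical.

```python
def quadratic_lyapunov(x):
    """V(x) = sum_n x_n^2 for a vector of queue backlogs."""
    return sum(v * v for v in x)

def bottleneck_path_cost(x, path, weights):
    """Simplified piecewise-linear path cost: the largest weighted backlog on the path.

    Assumption: a max over per-node linear terms stands in for Q_p(x).
    """
    return max(weights[n] * x[n] for n in path)

def piecewise_lyapunov(x, paths, weights):
    """Sum of squared path costs, sum_p Q_p(x)^2, using the simplified cost above."""
    return sum(bottleneck_path_cost(x, p, weights) ** 2 for p in paths)

def composite_lyapunov(Q, Z):
    """L = (1/2) * sum_j (Q_j^2 + Z_j^2) over token queues Q and energy queues Z."""
    return 0.5 * sum(q * q + z * z for q, z in zip(Q, Z))

if __name__ == "__main__":
    x = [3, 1, 2]
    print(quadratic_lyapunov(x))
    print(piecewise_lyapunov(x, paths=[(0, 1), (2,)], weights=[1.0, 1.0, 1.0]))
    print(composite_lyapunov(Q=[2.0], Z=[4.0]))
```

All three return a single nonnegative number that grows with congestion, which is the only property the drift analysis relies on.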

The essential analytic step is to verify that the conditional drift, $\Delta V(x)$ in discrete time or $\mathcal{L}V(x)$ in continuous time, is negative outside a bounded set under the routing policy parametrized by the function itself. For instance, for the semi-gradient SARSA algorithm, $\Delta V(x) \le -\alpha \sum_n x_n + B$ is guaranteed under any sufficiently balanced policy and a proper step-size schedule (Wu et al., 19 Mar 2025).
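For a small model the conditional drift can be computed exactly by enumerating outcomes. The sketch below does this for the quadratic function $V(x)=\sum_n x_n^2$ under join-the-shortest-queue routing, assuming a toy slot model (one Bernoulli arrival, then independent Bernoulli service at each server). The model and the name `expected_drift` are illustrative assumptions, not the setting analyzed in the cited work.

```python
from itertools import product

def expected_drift(x, lam, mus):
    """Exact E[V(x') - V(x) | x] for V(x) = sum_n x_n^2 in a toy slot model.

    lam: probability that one token arrives in the slot.
    mus: per-server probability of completing one token in the slot.
    Routing: the arrival joins the shortest queue (ties go to the lowest index).
    """
    def V(state):
        return sum(v * v for v in state)

    drift = 0.0
    # Enumerate every combination of (arrival?, service attempt at each server).
    for arrive, *served in product([0, 1], repeat=1 + len(mus)):
        prob = lam if arrive else 1.0 - lam
        y = list(x)
        if arrive:
            y[min(range(len(y)), key=y.__getitem__)] += 1
        for n, (s, mu) in enumerate(zip(served, mus)):
            prob *= mu if s else 1.0 - mu
            if s and y[n] > 0:
                y[n] -= 1
        drift += prob * (V(y) - V(x))
    return drift

if __name__ == "__main__":
    for state in ([0, 0], [2, 2], [10, 10]):
        print(state, expected_drift(state, lam=0.5, mus=[0.4, 0.4]))
```

With these parameters the total service rate exceeds the arrival rate, so the drift is positive at the empty state and becomes negative once the backlogs are large, matching the "negative outside a bounded set" condition.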

4. Routing Policy Synthesis and Learning Mechanisms

Policy synthesis proceeds by translating the Lyapunov structure into an actionable routing rule:

Priority and backpressure methods: Routing actions select next hops or service resources to maximize the decrease in $V(x)$, generalizing classical backpressure policies. The gradient $\nabla V(x)$ indicates which node or path should be prioritized (0908.1273).
Generalized shortest-path (GSP) policies: In SOSD networks, path selection uses a bottleneck-aware max-linear function weighted by service rates, $\lambda_p(x) = \bar\lambda\,1\{p \in \arg\min_q \gamma_q(B)Q_q(x)\}$ (Wu et al., 27 Aug 2024).
Online learning/policy iteration (PI): Parameters of the piecewise-linear or affine Lyapunov proxies ($\beta, \gamma$) are iteratively tuned via simulated sample paths and constrained least squares, with each policy update constrained to remain within the Foster–Lyapunov stable region (Wu et al., 27 Aug 2024).
SARSA and stochastic approximation: Approximate value functions $Q_w(x,a)$ are updated online using semi-gradient methods; stability and convergence of both traffic state and weights are jointly guaranteed via Lyapunov and ODE analysis (Wu et al., 19 Mar 2025).
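The rules above can be sketched as small functions. This is an illustrative sketch under stated assumptions: `greedy_lyapunov_route` picks the queue whose increment raises $V$ the least, `gsp_route` applies the $\arg\min_q \gamma_q Q_q(x)$ selection to precomputed path costs, and `sarsa_update` is the textbook semi-gradient SARSA step for a linear approximation $Q_w(x,a)$. The feature map in `features` and all function names are hypothetical and are not taken from the cited papers.

```python
def greedy_lyapunov_route(x, V):
    """Route the arriving token to the queue that increases V the least."""
    def value_after(n):
        y = list(x)
        y[n] += 1
        return V(y)
    return min(range(len(x)), key=value_after)

def gsp_route(path_costs, gammas):
    """Generalized shortest path: pick the path minimizing gamma_p * Q_p(x)."""
    scores = [g * c for g, c in zip(gammas, path_costs)]
    return min(range(len(scores)), key=scores.__getitem__)

def features(x, a):
    """Hypothetical feature map: a bias term and the squared post-routing backlog."""
    return [1.0, float((x[a] + 1) ** 2)]

def q_value(w, x, a):
    """Linear approximation Q_w(x, a) = w . features(x, a)."""
    return sum(wi * fi for wi, fi in zip(w, features(x, a)))

def sarsa_update(w, x, a, cost, x_next, a_next, step_size=0.1, discount=0.9):
    """One semi-gradient SARSA step on the weights of a linear cost-to-go estimate."""
    td_error = cost + discount * q_value(w, x_next, a_next) - q_value(w, x, a)
    grad = features(x, a)  # gradient of a linear Q_w with respect to w
    return [wi + step_size * td_error * gi for wi, gi in zip(w, grad)]

if __name__ == "__main__":
    quad = lambda s: sum(v * v for v in s)
    print(greedy_lyapunov_route([3, 1, 2], quad))
    print(gsp_route(path_costs=[4.0, 2.0], gammas=[1.0, 3.0]))
    print(sarsa_update([0.0, 0.0], [1, 0], 1, 2.0, [1, 1], 0))
```

For the quadratic function, the greedy rule reduces to join-the-shortest-queue, since adding one token to queue $n$ raises $V$ by $2x_n + 1$. The SARSA step is "semi-gradient" because the bootstrapped target is treated as a constant when differentiating.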

5. Application to Resource-Constrained and Heterogeneous Settings

The Lyapunov-based token routing paradigm adapts naturally to distributed resource settings:

Stable-MoE for Distributed Mixture-of-Experts: A unified Lyapunov function on both token and energy queues enables joint optimization of routing decisions and processor frequencies, converting an infinite-horizon stochastic program into a per-slot tractable subproblem, e.g.:

$\max_{x_{i,j}, f_j} V \left[\sum_{j=1}^J\log(1+d_j^{\mathrm{com}}) + \mu \sum_{i, j} g_{i,j}x_{i,j}\right] - \sum_j Q_j[d_j^{\mathrm{rou}}-d_j^{\mathrm{com}}] - \sum_j Z_j[E_j^{\mathrm{com}}-E_j^{\mathrm{avg}}]$

The solution, via mixed-integer programming, yields both throughput-competitive and stability-guaranteed routing in highly heterogeneous edge networks, as confirmed by the queue-statistics and utility-optimality bounds (Shi et al., 7 Dec 2025).
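A toy version of the per-slot subproblem can be solved by exhaustive search, which makes the structure of the objective easy to inspect. The sketch below is not the mixed-integer formulation of the cited work. It assumes each token is assigned to exactly one server, that computed tokens scale as $d_j^{\mathrm{com}} = f_j / \texttt{cycles\_per\_token}$, that energy follows a cubic model $E_j^{\mathrm{com}} = \kappa f_j^3$, and that frequencies come from a small discrete grid. These modeling choices, the standard virtual-queue updates in `update_queues`, and all function and parameter names are assumptions for illustration.

```python
import math
from itertools import product

def per_slot_decision(g, Q, Z, freqs, V, mu, e_avg,
                      cycles_per_token=1.0, kappa=1.0):
    """Brute-force maximizer of the per-slot drift-plus-penalty objective.

    g[i][j]: gating score of token i on server j; Q, Z: token and energy queues;
    freqs: candidate processor frequencies; V, mu: trade-off weights.
    """
    n_tokens, n_servers = len(g), len(Q)
    best_score, best_assign, best_freq = -math.inf, None, None
    for assign in product(range(n_servers), repeat=n_tokens):
        d_rou = [assign.count(j) for j in range(n_servers)]  # tokens routed to j
        gate = sum(g[i][j] for i, j in enumerate(assign))    # sum of g_ij * x_ij
        for f in product(freqs, repeat=n_servers):
            d_com = [fj / cycles_per_token for fj in f]      # assumed compute model
            e_com = [kappa * fj ** 3 for fj in f]            # assumed energy model
            utility = sum(math.log1p(d) for d in d_com) + mu * gate
            queue_term = sum(Q[j] * (d_rou[j] - d_com[j]) for j in range(n_servers))
            energy_term = sum(Z[j] * (e_com[j] - e_avg[j]) for j in range(n_servers))
            score = V * utility - queue_term - energy_term
            if score > best_score:
                best_score, best_assign, best_freq = score, assign, f
    return best_assign, best_freq

def update_queues(Q, Z, d_rou, d_com, e_com, e_avg):
    """Standard queue updates: add the slot's net inflow and clip at zero."""
    Q_next = [max(q + r - c, 0.0) for q, r, c in zip(Q, d_rou, d_com)]
    Z_next = [max(z + e - a, 0.0) for z, e, a in zip(Z, e_com, e_avg)]
    return Q_next, Z_next

if __name__ == "__main__":
    g = [[0.5, 0.5], [0.5, 0.5]]
    print(per_slot_decision(g, Q=[10.0, 0.0], Z=[0.0, 0.0],
                            freqs=[0.0, 1.0], V=1.0, mu=1.0, e_avg=[1.0, 1.0]))
```

In the example call, the gating scores are equal, so the large token backlog on server 0 pushes both tokens to server 1. Raising the energy queues `Z` instead pushes the chosen frequencies down. Exhaustive search is only viable for tiny instances, which is why a solver-based approach is needed at realistic scale.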

Wireless and Multi-hop Networks: Piecewise-quadratic Lyapunov constructions enable “distance-aware” routing, where packets are steered towards short paths by dominating Lyapunov cones, but shift to alternative routes as congestion warrants; this framework recovers classical backpressure and proves throughput-optimality for opportunistic/congestion-diversity policies (0908.1273).

6. Stability Theorems and Performance Metrics

The defining strength of the Lyapunov-based approach is the formal guarantee of strong stability:

Positive Harris recurrence: The process is stable for all load parameters interior to the min-cut capacity region; expected queue length (token/energy) is uniformly bounded (Wu et al., 27 Aug 2024, Wu et al., 19 Mar 2025, 0908.1273, Shi et al., 7 Dec 2025).
Utility-backlog tradeoff: The gap to the best achievable utility is $O(1/V)$, while average queue size grows as $O(V)$, where $V$ here denotes the drift-plus-penalty weight rather than the Lyapunov function, as in the classic drift-plus-penalty result for Lyapunov optimization (Shi et al., 7 Dec 2025).
Empirical efficiency: In practical simulations, Lyapunov-driven PI and SARSA-type methods exhibit order-of-magnitude gains in convergence speed and computational cost over deep neural RL baselines, with only a marginal gap in actual cost or throughput (Wu et al., 27 Aug 2024, Wu et al., 19 Mar 2025, Shi et al., 7 Dec 2025).
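The $O(1/V)$ and $O(V)$ scalings in the tradeoff above follow from a standard telescoping argument. The sketch below is a simplified version under assumptions introduced here: $\Delta(t)$ is the conditional drift of the composite function $L(t)$, $U(t)$ is the per-slot utility with $U(t) \le U_{\max}$, $U^*$ is the benchmark utility, and a single inequality with constants $B$ and $\epsilon > 0$ holds in every slot. It is not a restatement of the theorem in the cited work.

```latex
\begin{align*}
&\text{Assume, for all } t:\quad
  \Delta(t) - V\,\mathbb{E}[U(t)\mid Q(t),Z(t)]
  \;\le\; B - V U^* - \epsilon \sum_j Q_j(t). \\[4pt]
&\text{Taking expectations and summing over } t = 0,\dots,T-1: \\
&\qquad \mathbb{E}[L(T)] - \mathbb{E}[L(0)] - V\sum_{t=0}^{T-1}\mathbb{E}[U(t)]
  \;\le\; TB - TVU^* - \epsilon\sum_{t=0}^{T-1}\sum_j \mathbb{E}[Q_j(t)]. \\[4pt]
&\text{Utility (drop the } \epsilon \text{ term, use } L(T)\ge 0\text{):}\quad
  \frac{1}{T}\sum_{t=0}^{T-1}\mathbb{E}[U(t)]
  \;\ge\; U^* - \frac{B}{V} - \frac{\mathbb{E}[L(0)]}{VT}. \\[4pt]
&\text{Backlog (use } U(t)\le U_{\max},\ L(T)\ge 0\text{):}\quad
  \frac{1}{T}\sum_{t=0}^{T-1}\sum_j \mathbb{E}[Q_j(t)]
  \;\le\; \frac{B + V\,(U_{\max}-U^*)}{\epsilon} + \frac{\mathbb{E}[L(0)]}{\epsilon T}.
\end{align*}
```

As $T \to \infty$ the initial-condition terms vanish, leaving a utility gap of at most $B/V$ and an average backlog that grows linearly in $V$.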

7. Comparative Summary and Contextual Impact

A selection of representative frameworks is summarized below:

System/Setting	Lyapunov Function	Routing/Control Rule
SOSD Network (Wu et al., 27 Aug 2024)	$\sum_p [Q_p(x)]^2$ (PL costs)	GSP with bottleneck-score minimization; PI over $(\beta,\gamma)$
Parallel Servers (Wu et al., 19 Mar 2025)	$\sum_n x_n^2$	Semi-gradient SARSA, linear value approximation
Multi-hop Wireless (0908.1273)	Piecewise-quad. $V(\mathbf{Q})=L^*_f(\mathbf{Q})$	Priority via $\nabla V$; recovers backpressure, ORCD
Distributed MoE (Shi et al., 7 Dec 2025)	$L(t) = \frac12 \sum_j(Q_j^2 + Z_j^2)$	Drift-plus-penalty; per-slot MIP routing + resource allocation

The common contribution of these frameworks is the ability to synthesize routing and control policies that are provably robust to stochastic disturbances, learn near-optimal performance online, and accommodate complex constraints (e.g., energy, path diversity, resource heterogeneity) within a principled, computationally efficient paradigm. The Lyapunov-based token routing framework thus underpins a broad class of modern networked systems, from wireless data planes to distributed training architectures, enabling stability-centric optimization that is both theoretically sound and practically implementable.