Sinkhorn MPC: Real-Time Transport Control
- Sinkhorn MPC is a real-time control strategy that combines receding-horizon MPC with entropic optimal transport to steer a finite population of linear agents from an initial to a target distribution.
- It employs iterative Sinkhorn updates to efficiently solve large-scale assignment problems by relaxing hard permutation constraints into smooth, tractable couplings.
- The framework ensures convergence and stability by utilizing warm-starting, tailored horizon lengths, and careful tuning of regularization and Sinkhorn iteration counts.
Sinkhorn Model Predictive Control (Sinkhorn MPC) is a real-time dynamical transport algorithm for steering a finite population of agents, each governed by linear dynamics, from an initial empirical distribution to a desired target distribution. It synthesizes model predictive control (MPC) for receding-horizon trajectory optimization with entropic-regularized optimal transport, specifically using the Sinkhorn algorithm for tractable real-time assignment. The central innovation is the simultaneous integration of transport planning (target assignment) and control, enabling cost-efficient, scalable population steering under system and input constraints (Ito et al., 2023, Ito et al., 2022).
1. Mathematical Formulation and Problem Structure
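Consider $N$ agents, each with continuous-time or discrete-time linear time-invariant dynamics, $\dot{x}_i(t) = A x_i(t) + B u_i(t)$ or $x_i(k+1) = A x_i(k) + B u_i(k)$, with state $x_i \in \mathbb{R}^n$ and control $u_i \in \mathbb{R}^m$. The control objective is to assign each agent $i$ to a distinct target $x^d_{\sigma(i)}$ (with $\sigma$ a permutation of $\{1, \dots, N\}$), minimizing the infinite-horizon sum of per-agent costs, written here in discrete time: $\min_{\sigma, \{u_i\}} \sum_{i=1}^{N} \sum_{k=0}^{\infty} \ell(x_i(k), u_i(k); x^d_{\sigma(i)})$. Here $\ell$ is a nonnegative, equilibrium-detecting running cost. This direct problem is computationally intractable because the assignment space has $N!$ elements and every assignment is coupled to infinite-horizon optimal control subproblems [(Ito et al., 2023), Eqns. (3)-(8)].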
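To make the factorial growth concrete, the sketch below solves the hard assignment by exhaustive search over permutations. It assumes the per-pair steering costs are already collected in a small matrix; the names `C` and `best_assignment` are hypothetical, and the numbers are made up for illustration. It shows the combinatorial problem that Sinkhorn MPC avoids, not a step of the method itself.

```python
import itertools
import math


def best_assignment(C):
    """Exhaustively search all N! permutations; only viable for tiny N."""
    n = len(C)
    best_cost, best_perm = math.inf, None
    for perm in itertools.permutations(range(n)):
        # Total cost when agent i is sent to target perm[i].
        cost = sum(C[i][perm[i]] for i in range(n))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_perm, best_cost


# Illustrative 3x3 cost matrix (rows: agents, columns: targets).
C = [[4.0, 1.0, 3.0],
     [2.0, 0.0, 5.0],
     [3.0, 2.0, 2.0]]
perm, cost = best_assignment(C)
print(perm, cost)             # (1, 0, 2) 5.0
print(math.factorial(20))     # number of candidates for only 20 agents
```

Three agents need 6 evaluations; twenty agents would already need more than $2 \times 10^{18}$, which is why a relaxation is used instead.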
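Sinkhorn MPC replaces the infinite-horizon problem with a finite-horizon receding surrogate. At each MPC time step, a finite-horizon optimal control problem is solved for every agent-target pair $(i, j)$, giving the cost $C_{ij}$ of steering agent $i$ from its current state to target $x^d_j$ (over horizon $T_h$ with terminal constraint $x_i(T_h) = x^d_j$), and only the first input of the resulting plan is applied. Assignment is relaxed to a soft coupling $P \in \mathbb{R}_{\ge 0}^{N \times N}$ that minimizes the entropy-regularized assignment cost subject to marginal constraints: $\min_{P \in \mathcal{T}} \sum_{i,j} C_{ij} P_{ij} - \varepsilon H(P)$, with $H(P) = -\sum_{i,j} P_{ij} \log P_{ij}$ the entropy and $\mathcal{T} = \{P \ge 0 : P \mathbf{1}_N = P^\top \mathbf{1}_N = \mathbf{1}_N / N\}$ the polytope of couplings with uniform marginals. The entropic regularization parameter $\varepsilon > 0$ controls the smoothness of $P$: large $\varepsilon$ spreads each agent's mass over many targets, while $\varepsilon \to 0$ approaches a hard permutation assignment [(Ito et al., 2023), Eq. (22)].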
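As an illustration of how a finite-horizon cost matrix with a terminal constraint can be formed, the sketch below uses the simplest case: discrete-time dynamics and a pure input-energy cost $\sum_k \|u(k)\|^2$, for which the minimizer is the minimum-norm solution of the terminal constraint. The energy-only cost and the names `reach_matrix`, `min_energy_costs`, `X`, `Xd`, and `T` are assumptions made for this example; the cited papers allow more general running costs.

```python
import numpy as np


def reach_matrix(A, B, T):
    """Stack [A^(T-1) B, ..., A B, B] so that x(T) = A^T x(0) + R @ U."""
    blocks = [np.linalg.matrix_power(A, T - 1 - k) @ B for k in range(T)]
    return np.hstack(blocks)


def min_energy_costs(A, B, T, X, Xd):
    """Cost matrix C[i, j]: least input energy to move X[i] to Xd[j] in T steps.

    X and Xd hold one state per row. Assumes (A, B) can reach the targets
    within T steps, so the terminal constraint is feasible.
    """
    R = reach_matrix(A, B, T)
    A_T = np.linalg.matrix_power(A, T)
    R_pinv = np.linalg.pinv(R)
    # gap[i, j] = Xd[j] - A^T X[i]: what the inputs still have to contribute.
    gap = Xd[None, :, :] - (X @ A_T.T)[:, None, :]
    U = gap @ R_pinv.T              # minimum-norm stacked input sequences
    return np.sum(U**2, axis=-1)


# Scalar single integrator x(k+1) = x(k) + u(k), horizon of 4 steps.
A = np.array([[1.0]])
B = np.array([[1.0]])
X = np.array([[0.0], [1.0]])        # current agent states
Xd = np.array([[2.0], [1.0]])       # target states
C = min_energy_costs(A, B, 4, X, Xd)
print(C)                            # [[1.   0.25] [0.25 0.  ]]
```

For the single integrator the result reduces to $C_{ij} = \|x^d_j - x_i\|^2 / T$, which is easy to check by hand.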
2. Sinkhorn Iterations and Numerical Realization
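The entropic-regularized optimal transport problem is solved efficiently by Sinkhorn's matrix-scaling algorithm. The key steps are computing the Gibbs kernel $K_{ij} = \exp(-C_{ij}/\varepsilon)$ from the cost matrix $C$ and the regularization parameter $\varepsilon$, and then iteratively updating nonnegative scaling vectors $\alpha, \beta \in \mathbb{R}^N$ to satisfy the row and column marginals: $\alpha \leftarrow (\mathbf{1}_N / N) \oslash (K \beta)$, $\beta \leftarrow (\mathbf{1}_N / N) \oslash (K^\top \alpha)$. Here “$\oslash$” denotes element-wise division. The soft coupling is $P = \mathrm{diag}(\alpha)\, K\, \mathrm{diag}(\beta)$. In practical MPC integration, only a fixed (often small) number of Sinkhorn updates is executed per time step, making the approach compatible with real-time constraints and large agent populations [(Ito et al., 2022), (9); (Ito et al., 2023), Eqns. (SMPC_sink) and Algorithm 1].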
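The matrix-scaling updates described above can be sketched in a few lines of NumPy. This is a minimal illustration with uniform marginals, not the authors' implementation; the function name `sinkhorn`, its arguments, and the random cost matrix are assumptions for the example. The optional `beta` argument lets a caller pass in a previous scaling vector as a warm start.

```python
import numpy as np


def sinkhorn(C, eps, n_iter, beta=None):
    """Run n_iter Sinkhorn updates for uniform marginals 1/N.

    Returns the coupling P = diag(alpha) K diag(beta) and both scaling vectors.
    """
    if n_iter < 1:
        raise ValueError("n_iter must be at least 1")
    N = C.shape[0]
    mu = np.full(N, 1.0 / N)            # uniform row and column marginals
    K = np.exp(-C / eps)                # Gibbs kernel
    beta = np.ones(N) if beta is None else beta
    for _ in range(n_iter):
        alpha = mu / (K @ beta)         # enforce row marginals
        beta = mu / (K.T @ alpha)       # enforce column marginals
    P = alpha[:, None] * K * beta[None, :]
    return P, alpha, beta


rng = np.random.default_rng(0)
C = rng.uniform(0.0, 1.0, size=(5, 5))  # illustrative costs
P, alpha, beta = sinkhorn(C, eps=0.5, n_iter=200)
print(P.sum(axis=0))                    # column sums, each close to 0.2
print(P.sum(axis=1))                    # row sums, each close to 0.2
```

Because the column update comes last, the column marginals hold exactly after every iteration, while the row marginals are matched only in the limit. For very small `eps` the kernel entries can underflow; log-domain variants are commonly used in that regime.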
3. Receding-Horizon Sinkhorn MPC Closed-Loop Dynamics
At each discrete time $k$:
- States $x_i(k)$ observed; finite-horizon costs $C_{ij}(k)$ formed by solving one QP per agent-target pair (parallelizable).
- Gibbs kernel $K(k)$ computed.
- A fixed number of Sinkhorn iterations update the scaling vectors; coupling $P(k)$ approximated as above.
- Each agent forms a barycentric target via
$x_i^{\mathrm{tmp}}(k) = N \sum_{j=1}^{N} P_{ij}(k)\, x^d_j$
- Individual MPCs steer each agent towards $x_i^{\mathrm{tmp}}(k)$; plant updated to $x_i(k+1)$.
A distinguishing implementation feature is warm-starting: the Sinkhorn scaling vector from the previous time step initializes the current one, which significantly accelerates Sinkhorn convergence after the first time step [(Ito et al., 2023), Algorithm 1; Fig. 5].
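The closed-loop steps above can be sketched for the simplest possible plant. The example assumes single-integrator agents $x(k+1) = x(k) + u(k)$ with an input-energy cost, so that the pairwise cost is the squared distance divided by the horizon and the first MPC input is the remaining gap divided by the horizon. These modelling choices, the name `sinkhorn_mpc_step`, and all numerical values are assumptions for illustration; they are not taken from the cited papers.

```python
import numpy as np


def sinkhorn_mpc_step(X, Xd, beta, eps, T_h, n_sinkhorn=1):
    """One closed-loop step: costs, kernel, Sinkhorn, barycentric target, input."""
    N = X.shape[0]
    mu = np.full(N, 1.0 / N)
    # Minimum-energy cost of reaching each target in T_h steps (single integrator).
    diff = X[:, None, :] - Xd[None, :, :]
    C = np.sum(diff**2, axis=-1) / T_h
    K = np.exp(-C / eps)                      # Gibbs kernel
    for _ in range(n_sinkhorn):               # warm-started from the incoming beta
        alpha = mu / (K @ beta)
        beta = mu / (K.T @ alpha)
    P = alpha[:, None] * K * beta[None, :]
    X_tmp = N * P @ Xd                        # barycentric (temporary) targets
    U = (X_tmp - X) / T_h                     # first input of the min-energy plan
    return X + U, beta


Xd = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])    # targets
X = np.array([[3.5, 0.5], [0.5, 3.0], [-0.5, 0.3]])    # initial agent states
beta = np.ones(3)                                      # cold start at k = 0
for k in range(60):
    X, beta = sinkhorn_mpc_step(X, Xd, beta, eps=0.5, T_h=5)
print(np.round(X, 2))
```

Only one Sinkhorn update is run per time step; the scaling vector `beta` returned by one call is fed into the next, which is the warm start. With these well-separated targets each agent settles close to a distinct target, with a small offset toward the other targets caused by the entropic smoothing.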
4. Theoretical Analysis: Convergence, Stability, and Boundedness
4.1. Existence and Nature of Equilibria
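Under continuity and differentiability assumptions on the cost and navigator functions, a fixed point $x^e$ exists at which each of the $N$ agents coincides with its own barycentric target, $x^e_i = N \sum_{j} P^e_{ij}\, x^d_j$, where $x^d_j$ are the target states and $P^e$ is the equilibrium coupling [(Ito et al., 2023), Proposition 1].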
4.2. Global Convergence under Exact Coupling
If the exact optimal coupling is computed at each step (i.e., enough Sinkhorn iterations to convergence), continuous-time Sinkhorn MPC converges to the equilibrium set under strict positivity of the running cost and technical regularity conditions. The proof uses the entropic-OT cost as a Lyapunov function and applies LaSalle’s principle [(Ito et al., 2023), Theorem 1].
4.3. Ultimate Boundedness and Discrete Realizability
In the discrete-time setting, even with a single Sinkhorn update per step, the closed-loop states are ultimately bounded: after a finite transient, every agent's state enters and remains in a bounded set. The explicit bound, stated in the cited propositions, depends on the closed-loop MPC matrix, whose spectral radius is less than one, and on a constant that bounds the navigator range [(Ito et al., 2023), Prop. 3; (Ito et al., 2022), Prop. 4.1].
4.4. Local Asymptotic Stability
For quadratic running cost and unconstrained/weakly-constrained systems, all isolated equilibria of the discrete Sinkhorn MPC scheme are locally asymptotically stable, both in the "over-smoothed" ($\varepsilon$ large) and "hard-assignment" ($\varepsilon$ small) regimes [(Ito et al., 2023), Theorem 2; (Ito et al., 2022), 4.2]. The proof constructs a composite Lyapunov function that combines the state error with the Hilbert projective metric on the Sinkhorn scaling vectors.
5. Computational Complexity, Algorithmic Steps, and Scalability
The primary computational costs per step are:
- Formulation of the $N \times N$ cost matrix $C$ (parallelizable over agents): $N^2$ entries, one per agent-target pair.
- Sinkhorn iterations: two matrix-vector products per iteration, $O(N^2)$.
- $N$ independent MPC optimizations (closed-form in linear-quadratic settings, a QP or LP per agent otherwise).
For the population size considered, a single Sinkhorn iteration costs about 0.006 ms, compared to about 0.1 s for the Hungarian method. Across three population sizes, Sinkhorn takes 0.006 ms, 0.03 ms, and 0.08 ms, while a full LP takes 0.12 s, 6.4 s, and 66 s, respectively, illustrating the scalability of Sinkhorn MPC. Warm-starting further decreases the per-step Sinkhorn iteration count (from about 520 to about 100 after the first step) [(Ito et al., 2023), Table 1, Figs. 3–5; (Ito et al., 2022), section 6].
6. Influence of Parameters and Practical Guidelines
- Horizon $T_h$: Should exceed the mixing time required for individual agents' MPC trajectories to nearly reach their (temporary) targets.
- Entropic regularization $\varepsilon$: Large values favor smooth, distributed assignments and improve transient response, but increase steady-state bias. Small $\varepsilon$ sharpens the assignment towards combinatorial (permutation) solutions, at the cost of more Sinkhorn iterations and possible transient artifacts.
- Sinkhorn iterations per time step: A single iteration yields higher control energy and lower-quality transients. A larger number of iterations more closely approximates the assignment solution.
- Cost shaping: A quadratic running cost $\ell$ enables closed-form MPC updates; non-quadratic and input-constrained cases remain tractable but require an additional QP or LP solution per agent [(Ito et al., 2023), Section 5; (Ito et al., 2022), Section 6].
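The effect of the regularization parameter on the assignment can be checked numerically. The sketch below runs Sinkhorn to near convergence for three values of the regularization parameter and reports how close each agent's normalized weights are to a hard assignment. The cost matrix, the parameter values, and the names `entropic_coupling` and `sharpness` are assumptions chosen for illustration.

```python
import numpy as np


def entropic_coupling(C, eps, n_iter=500):
    """Approximate the entropic-OT coupling with uniform marginals via Sinkhorn."""
    N = C.shape[0]
    mu = np.full(N, 1.0 / N)
    K = np.exp(-C / eps)
    beta = np.ones(N)
    for _ in range(n_iter):
        alpha = mu / (K @ beta)
        beta = mu / (K.T @ alpha)
    return alpha[:, None] * K * beta[None, :]


def sharpness(P):
    """Smallest, over agents, of the largest normalized weight N * P[i, j].

    A value of 1 means every agent is matched to a single target;
    1/N means the weights are spread uniformly.
    """
    N = P.shape[0]
    return float(np.min(np.max(N * P, axis=1)))


# Illustrative costs whose best hard assignment is the identity permutation.
C = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
eps_values = (10.0, 1.0, 0.05)
scores = [sharpness(entropic_coupling(C, eps)) for eps in eps_values]
for eps, s in zip(eps_values, scores):
    print(f"eps = {eps:5.2f}  sharpness = {s:.3f}")
```

The sharpness grows as the regularization shrinks: a large value gives nearly uniform weights, and a small value gives weights that are almost a permutation. This matches the trade-off described in the list above between smooth transients and steady-state bias.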
7. Numerical Examples and Benchmark Results
- Continuous linear agents (quadratic cost): For the reported parameter settings, the method achieves rapid convergence with smooth trajectories, with per-step Sinkhorn costs as noted above.
- Effect of the Sinkhorn iteration count and $\varepsilon$: A single iteration per step increases transient energy; a larger iteration count nearly matches ideal assignment performance; lower $\varepsilon$ sharpens the steady-state assignment but slows convergence.
- Non-quadratic costs & input constraints: With a non-quadratic running cost and constrained inputs, Sinkhorn MPC executes successfully for the reported settings, exhibiting sparse control and an accurate terminal distribution [(Ito et al., 2023), Figs. 6–8].
References:
- "Entropic Model Predictive Optimal Transport over Dynamical Systems" (Ito et al., 2023).
- "Sinkhorn MPC: Model predictive optimal transport over dynamical systems" (Ito et al., 2022).