
Adaptive Graphs via Quadratic OT

Updated 26 March 2026
  • Adaptive graphs via quadratic optimal transport are methods that construct networks by minimizing quadratic transport costs to optimize edge weights and topology.
  • They leverage differentiable optimization strategies—including dual Newton methods and Sinkhorn iterations—to dynamically adjust graph structures.
  • These approaches find applications in graph signal processing, mesh generation, and network alignment, offering robust, data-driven insights into complex systems.

Adaptive graphs via quadratic optimal transport (QOT) are a class of graph construction and refinement methodologies in which the underlying topology and/or edge weights of a graph are adaptively determined according to the solution of a quadratic-cost optimal transport problem. Unlike traditional nearest neighbor or kernel-based methods, QOT-based frameworks yield data-adaptive, typically sparse, and globally optimal graphs that encode geometric, probabilistic, or dynamical structure. This entry details core mathematical formulations, algorithmic strategies, notable theoretical properties, and key applications, focusing on contemporary frameworks and algorithms grounded in the quadratic OT paradigm.

1. Quadratic Optimal Transport on Graphs: Mathematical Foundations

Quadratic optimal transport provides a metric for comparing probability distributions by minimizing a quadratic (typically Euclidean) transport cost. On graphs, the quadratic OT problem is formulated as transporting mass between nodes under quadratic cost constraints, sometimes involving additional regularization or flow-based formulations.

One influential framework considers the distance between zero-mean Gaussian random vectors whose covariances encode the graph's topology via the pseudoinverse graph Laplacian. Specifically, given two graphs with Laplacians $L_1$, $L_2$, the 2-Wasserstein distance between their associated Gaussian signal distributions is available in closed form:

$$W_2^2(\nu_1^g, \nu_2^g) = \operatorname{Tr}(L_1^\dagger) + \operatorname{Tr}(L_2^\dagger) - 2\,\operatorname{Tr}\!\left[ \left(L_1^{\dagger 1/2} L_2^\dagger L_1^{\dagger 1/2}\right)^{1/2} \right]$$

where $\dagger$ denotes the Moore–Penrose pseudoinverse. This metric is both sensitive to global structure (unlike purely local metrics) and differentiable with respect to the Laplacian entries, enabling gradient-based graph adaptation (Maretic et al., 2019).
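This closed form is straightforward to evaluate numerically. The sketch below, using two hypothetical 3-node toy graphs (a path and a triangle), computes the Bures-type distance from the Laplacian pseudoinverses; the helper `psd_sqrt` and the graph choices are illustrative, not from the cited work.

```python
import numpy as np

def psd_sqrt(A):
    """Symmetric square root of a PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(A)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def got_distance_sq(L1, L2):
    """Squared 2-Wasserstein distance between N(0, L1^+) and N(0, L2^+)."""
    A = np.linalg.pinv(L1)
    B = np.linalg.pinv(L2)
    sA = psd_sqrt(A)
    cross = psd_sqrt(sA @ B @ sA)
    return np.trace(A) + np.trace(B) - 2.0 * np.trace(cross)

# Laplacians of a 3-node path and a 3-node triangle (toy examples)
L_path = np.array([[1., -1., 0.], [-1., 2., -1.], [0., -1., 1.]])
L_tri  = np.array([[2., -1., -1.], [-1., 2., -1.], [-1., -1., 2.]])

print(got_distance_sq(L_path, L_path))  # ≈ 0 for identical graphs
print(got_distance_sq(L_path, L_tri))   # > 0 for structurally different graphs
```

Because the distance is a smooth function of the Laplacian entries, the same computation can sit inside an autodiff framework for the gradient-based adaptation described above.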

Alternatively, QOT on graphs is realized via minimization over edge-flows:

$$\min_{F \geq 0}\; c^T F + \frac{\alpha}{2}\|F\|_2^2 \quad \text{subject to} \quad D F = b$$

with $D$ the incidence matrix, $c$ the edge cost vector, $b$ the supply/demand vector, and $\alpha > 0$ the quadratic regularization parameter. The dual problem admits smooth structure, and its optimal solution supports adaptive manipulation of the edge costs and active flows (Essid et al., 2017).
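The smoothness of the dual can be illustrated directly: at node potentials $y$, the primal minimizer has the closed form $F^*(y) = \max(0, (D^T y - c)/\alpha)$, and the dual gradient is the constraint residual $b - D F^*(y)$. The sketch below solves a single-edge-path instance by plain dual gradient ascent rather than the Newton method of the cited work; the graph, costs, and step size are illustrative assumptions.

```python
import numpy as np

# Node-edge incidence matrix of a 3-node path: edge 0: v0->v1, edge 1: v1->v2
# (entry +1 where the edge enters a node, -1 where it leaves)
D = np.array([[-1.,  0.],
              [ 1., -1.],
              [ 0.,  1.]])
c = np.array([1., 1.])        # edge costs
b = np.array([-1., 0., 1.])   # send one unit of mass from v0 to v2
alpha = 0.5                   # quadratic regularization strength

def primal_from_dual(y):
    """Flow minimizing the Lagrangian at potentials y (closed form)."""
    return np.maximum(0.0, (D.T @ y - c) / alpha)

# Dual gradient ascent: the dual gradient is the constraint residual b - D F(y)
y = np.zeros(3)
step = 0.05  # small relative to ||D||^2 / alpha, so the ascent is stable
for _ in range(20000):
    F = primal_from_dual(y)
    y += step * (b - D @ F)

F = primal_from_dual(y)
print(F)          # optimal edge flows; here feasibility forces [1, 1]
print(D @ F - b)  # constraint residual, ≈ 0 at convergence
```

In practice the block-diagonal (active-subgraph Laplacian) Hessian makes Newton steps far faster than this first-order iteration, but the fixed-point structure is the same.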

2. Adaptive Graph Construction Mechanisms

Quadratic OT provides several mechanisms to construct or update graphs adaptively:

  • Data-Driven Edge Weight Adaptation: Given empirical covariances or local data geometry, edge weights $w$ can be optimized to minimize a quadratic OT loss between observed signal distributions and those predicted by the graph, e.g.,

$$\min_{w \geq 0}\; W_2^2\big(\mathcal{N}(0, \Sigma_\text{data}),\, \mathcal{N}(0, L(w)^\dagger)\big) + \lambda R(w)$$

Regularization $R(w)$ encodes sparsity or degree constraints. Differentiability allows direct gradient flow for adaptive weight tuning (Maretic et al., 2019).

  • Topology Learning via Flow Sensitivity: The dual variables in QOT, viewed as node potentials, identify which edges are active in optimal flows. Adaptive strategies prune or augment edges by evaluating reduced costs $r_e = c_e - D_e^T y$ and re-solving the QOT as demands change (Essid et al., 2017).
  • Graph Refinement from OT Distances: Continuous-flow approaches define a Riemannian OT distance on the probability simplex over vertices. Discrepancies between continuous and graph-based OT distances guide edge reweighting, local refinement, and vertex insertion to better represent geometry or data manifold structure (Solomon et al., 2016).
  • r-adaptive Graphs via Monge–Ampère: Mapping a reference mesh to a target via the Monge–Ampère equation, with a scalar density monitor, induces an implicit Riemannian metric $M(x)$, equidistributing resolution and anisotropically stretching elements/edges according to geometric features. The resulting mapping and triangulation constitute an adaptive graph with local metric-aligned connectivity (Budd et al., 2014).
  • Quadratic-Regularized Neighborhood Graphs: Adaptive, sparse neighborhood graphs arise from solving

$$\min_{P \in \Pi}\; \langle C, P \rangle + \frac{\varepsilon}{2}\|P\|_F^2$$

under doubly-stochastic and symmetry constraints. The regularization parameter $\varepsilon$ tunes sparsity, and the solution admits a closed form. Edges are declared where $P_{ij} > 0$ (Matsumoto et al., 2022).
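The sparsity mechanism of the quadratic regularizer can be seen even in a simplified relaxation. The full doubly-stochastic symmetric problem requires the semi-smooth Newton machinery of the cited work; the sketch below drops the column-sum and symmetry constraints, so each row subproblem $\min_{p \in \Delta} \langle c_i, p \rangle + \frac{\varepsilon}{2}\|p\|^2$ reduces to a Euclidean projection of $-c_i/\varepsilon$ onto the simplex. All names, the data, and $\varepsilon$ here are illustrative assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def qot_neighbors(X, eps=0.05):
    """Row-wise quadratically regularized plan: sparse, adaptive neighborhoods.
    Simplified relaxation -- ignores the column-sum and symmetry constraints."""
    C = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise sq. dists
    np.fill_diagonal(C, 1e12)  # effectively forbid self-edges
    P = np.zeros_like(C)
    for i in range(len(X)):
        P[i] = project_simplex(-C[i] / eps)  # closed-form row subproblem
    return P

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
P = qot_neighbors(X)
print((P > 0).sum(axis=1))  # adaptive neighborhood sizes, far fewer than n - 1
```

Unlike entropic regularization, whose plans are strictly positive everywhere, the quadratic penalty thresholds most entries to exactly zero, which is what makes the support directly usable as a graph.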

| Approach | Adaptivity Signal | Computational Form |
| --- | --- | --- |
| Laplacian-based 2-Wasserstein (GOT) | Signal covariances | Matrix square roots |
| Edge-flow QOT (dual flows) | Node potentials | Newton, Laplacian solves |
| Continuous-flow OT | Riemannian structure | SOCP, ADMM |
| Monge–Ampère mesh | Density monitor $\rho(x)$ | Scalar PDE, SVD |
| QOT neighborhood graphs | Local data geometry | Semi-smooth Newton |

3. Algorithmic Strategies and Computation

Several efficient optimization strategies have been developed for quadratic OT-based adaptive graphs:

  • Stochastic Variational Optimization: For graph alignment under unknown node orderings, the quadratic OT cost in permutation space is relaxed through a differentiable Sinkhorn operator, with Bayesian exploration (randomized reparameterization) used to escape local minima. Convergence is ensured under Robbins–Monro conditions. Bottleneck operations include matrix square roots and SVDs, with per-iteration complexity ranging from $O(N^3)$ to $O(N^2 K)$ after optimizations (Maretic et al., 2019).
  • Dual Newton Methods: For quadratically regularized network flows, the dual problem is solved via Newton's method. The Hessian is block-diagonal, corresponding to the Laplacian of the active subgraph, and per-iteration cost is dominated by sparse Laplacian linear solves, often $O(m \log n)$ (Essid et al., 2017).
  • SOCP and Proximal Splitting: Continuous-flow OT and time-discretized geodesic computation reduce to large SOCPs or can be made tractable via operator splitting (e.g., ADMM). This supports scaling to larger graphs (Solomon et al., 2016).
  • Semi-smooth Newton and Active Set Methods: For QOT-based neighborhood graphs, explicit active-set strategies in conjunction with semi-smooth Newton updates enforce doubly-stochasticity and symmetry, achieving fast convergence (few tens of Newton iterations) and handling sparsity automatically (Matsumoto et al., 2022).
  • Metric Learning Loops: In adaptive ground metric learning, entropic regularization and Sinkhorn iteration are embedded into gradient-based outer loops with specialized fast kernel diffusion for scalability, and reverse-mode automatic differentiation for learning edge weights (Heitz et al., 2019).
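The Sinkhorn operator that appears in both the alignment relaxation and the metric-learning loops above is a short alternating-scaling iteration. A minimal sketch with uniform marginals, a random cost matrix, and an illustrative regularization strength (all assumptions, not parameters from the cited papers):

```python
import numpy as np

def sinkhorn(C, reg=0.1, n_iter=2000):
    """Sinkhorn iterations: scale exp(-C/reg) toward prescribed marginals
    by alternating row and column normalizations."""
    K = np.exp(-C / reg)
    r = np.ones(C.shape[0]) / C.shape[0]   # uniform row marginals
    c = np.ones(C.shape[1]) / C.shape[1]   # uniform column marginals
    u = np.ones(C.shape[0])
    v = np.ones(C.shape[1])
    for _ in range(n_iter):
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]    # the entropic transport plan

rng = np.random.default_rng(1)
C = rng.random((5, 5))
P = sinkhorn(C)
print(P.sum(axis=0))  # ≈ 1/5 per column
print(P.sum(axis=1))  # ≈ 1/5 per row
```

Every operation is differentiable, which is why the iteration can be unrolled and backpropagated through in the stochastic variational and metric-learning schemes.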

4. Theoretical Properties and Guarantees

Quadratic OT adaptive graph methods exhibit several notable theoretical properties:

  • Strong Convexity and Uniqueness: The quadratic regularization renders the transport problem strongly convex, with unique minimizers for the transport plan and associated graphs or weights (Matsumoto et al., 2022, Essid et al., 2017).
  • Riemannian Structure and Metric Validity: Continuous-flow formulations endow the simplex of distributions with a true Riemannian metric, satisfying nonnegativity, symmetry, and the triangle inequality. Graph Dirac distances are upper-bounded by graph shortest-path distances (Solomon et al., 2016).
  • Support Sparsity and Localization: QOT solutions naturally yield sparse support: active sets are typically $O(N)$ rather than $O(N^2)$, and as regularization or perturbation horizons increase, optimal flows localize on critical graph substructures. This informs both graph refinement and aggressive graph pruning (Essid et al., 2017, Grover et al., 2016).
  • Differentiability for Outer Optimization: The quadratic forms ensure differentiability with respect to edge weights, Laplacians, or cost vectors, facilitating end-to-end learning and adaptation via gradient flow (Maretic et al., 2019, Heitz et al., 2019).
  • Complexity Scaling: While cubic scaling is common for naive algorithms (e.g., all-pairs matrix operations), sparsity exploitation, Newton-type updates, and second-order cone relaxations enable substantial acceleration. Notably, for QOT neighborhood graphs, empirical convergence is rapid and solution support is "local" even for modest regularization (Matsumoto et al., 2022).

5. Applications Across Domains

Quadratic OT adaptive graph frameworks are deployed for a broad spectrum of applications:

  • Graph Alignment and De-anonymization: The GOT framework computes graph alignment via quadratic OT permutation minimization, showing efficacy in aligning social, biological, and sensor networks (Maretic et al., 2019).
  • Graph Signal Processing and Learning: QOT distances between Laplacian-induced Gaussian priors serve as kernels for graph classification, clustering, and graph signal prediction (Maretic et al., 2019).
  • Neighborhood Graphs for Machine Learning: Adaptive QOT-based neighborhood graphs outperform $k$-NN and entropic OT graphs in manifold learning (eigenspace recovery), semi-supervised label propagation, and single-cell RNA-seq denoising. Robustness to density heterogeneity and global adaptivity are empirically established (Matsumoto et al., 2022).
  • Mesh Generation and Finite Element Methods: OT-based mesh adaptation via the Monge–Ampère PDE yields rr-adaptive meshes (and thus graphs) with analytically characterized anisotropy and equidistribution, verified in steady-state and dynamic PDEs (Budd et al., 2014).
  • Control, Mixing, and System Perturbation: Convex, globally optimal graph-based QOT strategies underpin optimal control of nonlinear dynamical systems, finite-time mixing with minimal energy, and characterization of system transport barriers (Grover et al., 2016, Elamvazhuthi et al., 2016).
  • Learning Adaptive Metrics for OT: Given dynamic histogram data, adaptive learning of graph ground metrics via quadratic OT produces geodesic interpolations that align with observed data and outperform baselines in color palette interpolation for video synthesis and in synthetic geometric setups (Heitz et al., 2019).

6. Limitations and Practical Considerations

Despite the expressive power of QOT-based adaptive graphs, key limitations remain:

  • Scalability: Per-iteration complexity can be prohibitive for very large graphs ($O(N^3)$ for naive approaches), although sparsity and efficient solvers mitigate this to an extent (Maretic et al., 2019, Essid et al., 2017, Matsumoto et al., 2022).
  • Nonconvexity in Alignment and Metric Learning: Alignment over permutations is fundamentally nonconvex: results depend on initialization, annealing strategies, and regularization. Likewise, learning edge weights for adaptive metrics is nonconvex and may yield only stationary points (Maretic et al., 2019, Heitz et al., 2019).
  • Graph Size and Topology Requirements: Classical quadratic OT formulations typically require graphs of the same size; extensions to cardinality-mismatched graphs are nontrivial and remain an active research direction (Maretic et al., 2019).
  • Dependence on Laplacian Invertibility: The reliance on Laplacian pseudoinverses imposes technical burdens for disconnected or nearly singular graphs, necessitating regularization or small diagonal shifts (Maretic et al., 2019).
  • Parameter Selection: While QOT offers improved robustness versus $k$-NN in neighborhood graphs, tuning of the regularization parameter $\varepsilon$ is still necessary, though grid search is tractable and universal across multiple tasks (Matsumoto et al., 2022).

7. Connections and Outlook

Adaptive graphs via quadratic optimal transport unify and extend concepts from algebraic graph theory, geometric learning, dynamical systems, and optimal control. The differentiable, convex, and Riemannian structures inherent in QOT-based formulations permit principled adaptation of both weights and topology, yielding sparse, data- or dynamics-adaptive graphs naturally aligning with complex underlying structure. The interplay of probabilistic signal modeling, transport-based comparison, metric learning, and global optimization constitutes a modern synthesis for robust graph construction and inference (Maretic et al., 2019, Matsumoto et al., 2022, Essid et al., 2017, Solomon et al., 2016, Budd et al., 2014, Grover et al., 2016, Heitz et al., 2019).

Future research will likely address scaling limits, nonconvex optimization guarantees, extensions to dynamic and non-square graphs, and further integration into end-to-end machine learning and control pipelines, building on the foundational mathematical and empirical properties surveyed herein.
