- The paper establishes the weak convergence of empirical EOT kernel functionals, enabling uniform inference via Gaussian process limits.
- It leverages Hadamard differentiability and bootstrap techniques to provide uniform confidence bands for colocalization curves in high-dimensional data.
- Numerical experiments confirm robust performance in simulation and super-resolution microscopy, demonstrating scalability and statistical accuracy.
Distributional Convergence of Empirical Entropic Optimal Transport: Theory, Inference, and Statistical Applications
Introduction
This paper establishes the weak convergence theory for empirical Entropic Optimal Transport (EOT) kernel functionals, with particular emphasis on EOT-based colocalization curves. The analysis covers both the limiting distribution and the construction of asymptotic uniform confidence bands, with the theoretical framework rooted in Hadamard differentiability and the extended delta method. The work extends existing literature by covering rich classes of ground spaces and kernels, providing uniform—in contrast to merely pointwise—statistical inference. Core applications are demonstrated in both simulation paradigms and the high-throughput analysis of mitochondrial protein colocalization in super-resolution microscopy images.
Theoretical Framework
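Formally, given a Polish space $X$, probability measures $\mu, \nu$ on $X$, a measurable cost $c: X \times X \to \mathbb{R}_{\ge 0}$, and a regularization parameter $\lambda > 0$, the EOT plan $\pi^\lambda_{\mu,\nu}$ is the unique minimizer of
$$\mathrm{EOT}_c^\lambda(\mu,\nu) = \min_{\pi \in \Pi(\mu,\nu)} \left( \int c(x,y)\, d\pi(x,y) + \lambda\, \mathrm{KL}(\pi \mid \mu \otimes \nu) \right),$$
where $\Pi(\mu,\nu)$ is the set of couplings of $\mu$ and $\nu$ and KL denotes the Kullback–Leibler divergence. The solution can be characterized via the dual entropic Kantorovich potentials $f^\lambda_{\mu,\nu}, g^\lambda_{\mu,\nu}$, as in standard entropic OT theory.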
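For discrete measures, the minimization above is commonly solved with Sinkhorn's algorithm, which alternately rescales a Gibbs kernel until both marginals match. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `sinkhorn_plan`, the fixed iteration count, and the toy point clouds are all hypothetical choices made for illustration.

```python
import numpy as np

def sinkhorn_plan(mu, nu, C, lam, n_iter=500):
    """Approximate the EOT plan between discrete measures mu and nu.

    mu, nu : 1-D weight vectors, each summing to one
    C      : cost matrix with C[i, j] = c(x_i, y_j)
    lam    : regularization parameter (lambda > 0)
    """
    K = np.exp(-C / lam)              # Gibbs kernel exp(-c / lambda)
    a = np.ones_like(mu)
    b = np.ones_like(nu)
    for _ in range(n_iter):           # alternate the two marginal corrections
        a = mu / (K @ b)
        b = nu / (K.T @ a)
    return a[:, None] * K * b[None, :]

# Toy example (hypothetical data): two uniform point clouds in the unit square.
rng = np.random.default_rng(0)
x = rng.uniform(size=(30, 2))
y = rng.uniform(size=(40, 2))
C = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)  # Euclidean cost
mu = np.full(30, 1 / 30)
nu = np.full(40, 1 / 40)
plan = sinkhorn_plan(mu, nu, C, lam=0.5)
```

This plain-domain iteration is only suitable when the ratio of cost to $\lambda$ stays moderate; for small $\lambda$ the kernel underflows and a log-domain variant would be needed.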
The main statistical object is the EOT kernel functional
$$\Phi(\mu,\nu)(u) = \int u(x,y)\, d\pi^\lambda_{\mu,\nu}(x,y),$$
with empirical instantiations constructed from independent samples by plugging in empirical measures and computing the corresponding EOT plan. The key case is the colocalization curve $t \mapsto \Phi(\mu,\nu)(\mathbf{1}\{c \le t\})$, which integrates the EOT plan over the sublevel sets $\{(x,y) : c(x,y) \le t\}$ of the cost.
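In the discrete case, evaluating the colocalization curve reduces to summing the plan mass over all pairs whose cost does not exceed a threshold. The sketch below illustrates this with a small hand-written coupling; the function name `colocalization_curve` and the toy matrices are hypothetical, and the coupling is an arbitrary example rather than an actual EOT plan.

```python
import numpy as np

def colocalization_curve(plan, C, thresholds):
    """Mass that the plan assigns to pairs with cost <= t, for each t."""
    return np.array([plan[C <= t].sum() for t in thresholds])

# Toy coupling (hypothetical): rows sum to (0.4, 0.6), columns to (0.5, 0.5).
plan = np.array([[0.3, 0.1],
                 [0.2, 0.4]])
C = np.array([[0.0, 2.0],
              [1.0, 0.5]])
ts = np.array([0.0, 0.5, 1.0, 2.0])

curve = colocalization_curve(plan, C, ts)
print(curve)  # approximately [0.3, 0.7, 0.9, 1.0]
```

The curve is nondecreasing in the threshold and reaches the total mass of the plan once the threshold exceeds the largest cost.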
The authors provide a rigorous treatment of the weak convergence of the empirical EOT kernel functional (and in particular, of the empirical colocalization curve), suitably normalized, toward Gaussian processes indexed by the relevant kernels. The proof strategy leverages Hadamard differentiability for functionals over measures and empirical process machinery, capturing the interacting sampling variation from both marginals.
A central contribution is the theoretical justification of uniform confidence bands, as opposed to merely pointwise intervals, for functionals of the EOT plan. Specifically, for the colocalization curve, bands obtained by shifting the empirical curve up and down by a suitably rescaled critical value are shown to be asymptotically valid at the nominal level, where the critical value is the corresponding quantile of the Euclidean norm of the Hadamard derivative applied to a suitably coupled Brownian bridge process.
Estimating these quantiles analytically is generally intractable, so the authors establish bootstrap consistency for the empirical EOT process. The bootstrap replicates are generated via resampling from the empirical measures, computing the plug-in EOT plans, and forming the empirical distribution of the supremum norm of the difference between bootstrap and original EOT kernel functionals. This provides a fully practical pipeline for EOT-based statistical inference with guaranteed asymptotic validity.
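The resampling steps described above can be sketched as follows. This is an illustrative outline under several assumptions, not the authors' procedure: it uses an n-out-of-n bootstrap with uniform weights, a plain Sinkhorn loop, a small number of replicates, and takes the band half-width directly as the empirical quantile of the supremum deviation. All function names and parameter values are hypothetical.

```python
import numpy as np

def sinkhorn_plan(mu, nu, C, lam, n_iter=200):
    """Plain Sinkhorn iteration returning an approximate EOT plan."""
    K = np.exp(-C / lam)
    a, b = np.ones_like(mu), np.ones_like(nu)
    for _ in range(n_iter):
        a = mu / (K @ b)
        b = nu / (K.T @ a)
    return a[:, None] * K * b[None, :]

def eot_curve(x, y, lam, ts):
    """Plug-in colocalization curve from two samples with uniform weights."""
    C = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    mu = np.full(len(x), 1 / len(x))
    nu = np.full(len(y), 1 / len(y))
    plan = sinkhorn_plan(mu, nu, C, lam)
    return np.array([plan[C <= t].sum() for t in ts])

def bootstrap_band(x, y, lam, ts, n_boot=50, alpha=0.05, seed=0):
    """Uniform band from the bootstrap distribution of the sup deviation."""
    rng = np.random.default_rng(seed)
    curve = eot_curve(x, y, lam, ts)
    sup_dev = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each sample with replacement (assumed n-out-of-n scheme).
        xb = x[rng.integers(0, len(x), size=len(x))]
        yb = y[rng.integers(0, len(y), size=len(y))]
        sup_dev[i] = np.max(np.abs(eot_curve(xb, yb, lam, ts) - curve))
    q = np.quantile(sup_dev, 1 - alpha)   # same half-width at every threshold
    return curve, np.clip(curve - q, 0.0, 1.0), np.clip(curve + q, 0.0, 1.0)

# Toy data (hypothetical): two uniform samples in the unit square.
rng = np.random.default_rng(1)
x = rng.uniform(size=(40, 2))
y = rng.uniform(size=(40, 2))
ts = np.linspace(0.0, 1.5, 16)
curve, lower, upper = bootstrap_band(x, y, lam=0.5, ts=ts)
```

Because a single quantile of the supremum deviation sets the width at every threshold, the resulting band is uniform in the threshold rather than pointwise. In practice many more bootstrap replicates would be used than the 50 chosen here for speed.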
Figure 1: Bootstrap Q-Q plots validate the limit law for the empirical EOT curve, showing strong agreement for varying sample sizes.
Figure 2: Uniform bootstrap confidence bands for EOT colocalization curves demonstrate accurate coverage for multimodal transport problems.
Numerical Experiments in Controlled Settings
The methodology is substantiated through thorough simulation studies. In a Gaussian-to-Gaussian-mixture transport problem (Scenario I), the EOT colocalization curve exhibits clear multiscale structure reflecting target mixture components at different transport costs, with confidence bands appropriately concentrating as sample size increases. Quantitative assessment via Q-Q plots and band coverage verifies both the bootstrap validity and the informativeness of the uniform bands.
The extension to the sphere is explored by transporting a source von Mises–Fisher distribution to a mixture distribution (Scenario II), with geodesic cost. Again, the empirical bands reliably capture the principal modes of transport at different angular thresholds, and the statistical accuracy of the uniform bands is confirmed visually and numerically.
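On a unit sphere, the geodesic cost between two points is the angle between them, which is why the thresholds in this scenario are angular. A minimal sketch of such a cost matrix follows, assuming unit-norm points embedded in three dimensions; the function name `geodesic_cost` and the example points are hypothetical.

```python
import numpy as np

def geodesic_cost(x, y):
    """Great-circle distances between rows of x and rows of y (unit vectors)."""
    cosines = np.clip(x @ y.T, -1.0, 1.0)  # guard against rounding outside [-1, 1]
    return np.arccos(cosines)

north = np.array([[0.0, 0.0, 1.0]])
others = np.array([[0.0, 0.0, 1.0],    # same point
                   [1.0, 0.0, 0.0],    # on the equator
                   [0.0, 0.0, -1.0]])  # antipodal point

C = geodesic_cost(north, others)
print(C)  # [[0, pi/2, pi]]
```

The resulting matrix can be passed to any discrete EOT solver in place of a Euclidean cost matrix.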
Figure 3: Visualization of point clouds from von Mises–Fisher distributions and mixtures, highlighting structured transport.
Figure 4: Q-Q plots for bootstrap versus true EOT process errors on the sphere reveal accurate calibration.
Figure 5: Empirical EOT colocalization curves with bootstrap bands reliably capture all main transport scales for spherical mixture transport.
Colocalization in Super-Resolution Microscopy
A further section of the study applies the theory to the quantification of protein proximity in STED super-resolution images from the Optimal Transport Colocalization (OTC) dataset. In this context, fluorophore intensity images are treated as discrete measures over the pixel grid, and transport cost is given by pixel-wise Euclidean distance.
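Treating an intensity image as a discrete measure amounts to normalizing the pixel intensities to sum to one and attaching each weight to its pixel coordinate. The sketch below shows one way to do this. It assumes that zero-intensity pixels are dropped and that distances are measured in pixel units; the function names and the tiny example images are hypothetical and unrelated to the OTC dataset.

```python
import numpy as np

def image_to_measure(img):
    """Return coordinates and normalized weights of the nonzero pixels."""
    img = np.asarray(img, dtype=float)
    rows, cols = np.nonzero(img)
    coords = np.stack([rows, cols], axis=1).astype(float)
    weights = img[rows, cols] / img.sum()
    return coords, weights

def pixel_cost(coords_a, coords_b):
    """Pairwise Euclidean distances between pixel coordinates."""
    diff = coords_a[:, None, :] - coords_b[None, :, :]
    return np.linalg.norm(diff, axis=2)

# Two tiny toy "channels" (hypothetical intensities).
img_a = np.array([[0, 2, 0],
                  [0, 0, 0],
                  [1, 0, 1]])
img_b = np.array([[0, 0, 3],
                  [0, 1, 0],
                  [0, 0, 0]])

coords_a, w_a = image_to_measure(img_a)
coords_b, w_b = image_to_measure(img_b)
C = pixel_cost(coords_a, coords_b)   # shape (3, 2)
```

Multiplying the cost matrix by the physical pixel size would express the thresholds of the colocalization curve in physical units instead of pixels.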
The EOT colocalization curve is employed to measure spatial matching between protein assemblies (Tom20 and Mic60). The authors address both computational and inferential bottlenecks intrinsic to high-dimensional imaging data. Efficient implementation is achieved via GPU-accelerated Sinkhorn solvers and subsampling strategies, while validation shows that, for suitable regularization parameters, EOT-based bands cover even the unregularized optimal colocalization curves.

Figure 6: STED microscopy section images illustrate the input data for EOT-based colocalization, with confidence bands displayed below.
Figure 7: Demonstration of the impact of the regularization parameter $\lambda$: as $\lambda$ decreases, EOT curves converge toward OT, and bands consistently cover the unregularized solution.
Figure 8: Full-size STED nanoscopy images for both proteins are depicted, with zoom-in boxes corresponding to analysis regions.
Figure 9: The empirical EOT colocalization curve (with uniform bands) computed from subsampled pixel intensities remains stable and accurate with respect to the full-resolution image.
Computational Scalability and Statistical Guarantees
The methodology remains computationally tractable even in high-dimensional regimes due to two main features: (1) the computational efficiency of entropic regularization and Sinkhorn's algorithm; (2) the ability to rely on resampling (subsampling) both for computational savings and for valid inference. The theoretical framework carefully justifies the stability and accuracy of the inference under such subsampled schemes.
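One simple way to realize the subsampling idea is to draw a fixed number of locations from the full discrete measure, with probability proportional to the weights, and to run the EOT computation on the resulting smaller empirical measure. The sketch below assumes i.i.d. sampling with replacement; the exact scheme used in the paper may differ, and the function name `subsample_measure` and the toy measure are hypothetical.

```python
import numpy as np

def subsample_measure(coords, weights, n_points, rng):
    """Draw n_points locations i.i.d. from the discrete measure (coords, weights)."""
    idx = rng.choice(len(weights), size=n_points, replace=True, p=weights)
    # The subsample is an empirical measure with equal weight on each draw.
    return coords[idx], np.full(n_points, 1.0 / n_points)

rng = np.random.default_rng(0)
coords = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
weights = np.array([0.4, 0.3, 0.2, 0.1])

sub_coords, sub_weights = subsample_measure(coords, weights, n_points=1000, rng=rng)
```

The cost of a Sinkhorn iteration scales with the product of the two support sizes, so replacing a full pixel grid by a few thousand sampled locations is what yields the large savings in computation time.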
The results show that, with appropriately tuned regularization, the empirical colocalization curves and their confidence bands remain robust even when computation time is reduced by orders of magnitude, and the coverage probability of the uniform bands remains close to nominal.
Conclusion
This work rigorously characterizes the weak convergence of empirical EOT kernel functionals, enabling the first construction of uniform confidence bands for EOT-based statistics such as colocalization curves. The paper demonstrates the efficacy, robustness, and practical feasibility of these results in simulation and in imaging applications, leveraging both theoretical advances (Hadamard differentiability, empirical process techniques) and computational methodologies (GPU-accelerated Sinkhorn, subsampling, bootstrap). This framework opens the door to routine, statistically sound, and computationally efficient EOT-based inference for a variety of structured data analysis problems, particularly in the presence of complex high-dimensional distributions.