
On Partial Optimal Transport: Revising the Infeasibility of Sinkhorn and Efficient Gradient Methods (2312.13970v2)

Published 21 Dec 2023 in cs.LG, cs.AI, and math.OC

Abstract: This paper studies the Partial Optimal Transport (POT) problem between two unbalanced measures with at most $n$ supports and its applications in various AI tasks such as color transfer or domain adaptation. The increasingly large problem sizes in these applications call for fast approximation algorithms for POT. We first investigate, theoretically and experimentally, the infeasibility of the state-of-the-art Sinkhorn algorithm for POT, which stems from its incompatible rounding procedure and consequently degrades its qualitative performance in real-world applications such as point-cloud registration. To this end, we propose a novel rounding algorithm for POT, and then provide a feasible Sinkhorn procedure with a revised computational complexity of $\mathcal{\widetilde{O}}(n^2/\varepsilon^4)$. Our rounding algorithm also permits the development of two first-order methods to approximate the POT problem. The first, Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD), finds an $\varepsilon$-approximate solution to the POT problem in $\mathcal{\widetilde{O}}(n^{2.5}/\varepsilon)$, which has a better dependence on $\varepsilon$ than the revised Sinkhorn. The second, Dual Extrapolation, achieves a computational complexity of $\mathcal{\widetilde{O}}(n^2/\varepsilon)$, the best in the literature. We further demonstrate the flexibility of POT compared to standard OT as well as the practicality of our algorithms on real applications where the two marginal distributions are unbalanced.
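
For concreteness, the POT problem referred to above can be written as a linear program: given a cost matrix $C \in \mathbb{R}^{n \times n}$, unbalanced marginals $r, c \in \mathbb{R}^n_{\ge 0}$, and a prescribed transported mass $s \le \min(\|r\|_1, \|c\|_1)$, one seeks $\min_{X \ge 0} \langle C, X \rangle$ subject to $X\mathbf{1} \le r$, $X^\top\mathbf{1} \le c$, and $\mathbf{1}^\top X \mathbf{1} = s$. The sketch below merely illustrates this formulation with an off-the-shelf LP solver; it is not the paper's revised Sinkhorn, APDAGD, or Dual Extrapolation method, and the problem size, random data, and choice of $s$ are assumptions for illustration.

```python
# Minimal POT sketch: solve the partial optimal transport LP directly with SciPy.
# Illustrative only -- not the paper's Sinkhorn/APDAGD/Dual Extrapolation methods.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 5                                   # supports per measure
r = rng.random(n)                       # unbalanced source masses
c = rng.random(n)                       # unbalanced target masses
s = 0.5 * min(r.sum(), c.sum())         # transported mass, s <= min(|r|_1, |c|_1)
C = rng.random((n, n))                  # ground cost matrix

# Constraints on the flattened plan vec(X) (row-major order):
row_sum = np.kron(np.eye(n), np.ones((1, n)))   # maps vec(X) to row sums  X 1
col_sum = np.kron(np.ones((1, n)), np.eye(n))   # maps vec(X) to col sums  X^T 1

res = linprog(
    c=C.ravel(),                                # objective <C, X>
    A_ub=np.vstack([row_sum, col_sum]),         # X 1 <= r,  X^T 1 <= c
    b_ub=np.concatenate([r, c]),
    A_eq=np.ones((1, n * n)),                   # total transported mass = s
    b_eq=[s],
    bounds=(0, None),                           # X >= 0
    method="highs",
)
X = res.x.reshape(n, n)                         # optimal partial transport plan
print("POT cost:", res.fun)
```

An exact LP solve like this scales poorly with $n$, which is precisely the motivation stated in the abstract for the approximate Sinkhorn-type and first-order methods with $\mathcal{\widetilde{O}}(n^2/\varepsilon^4)$, $\mathcal{\widetilde{O}}(n^{2.5}/\varepsilon)$, and $\mathcal{\widetilde{O}}(n^2/\varepsilon)$ complexities.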
