Consistent Optimal Transport with Empirical Conditional Measures (2305.15901v6)
Abstract: Given samples from two joint distributions, we consider the problem of Optimal Transportation (OT) between them when conditioned on a common variable. We focus on the general setting where the conditioned variable may be continuous, and the marginals of this variable in the two joint distributions may not be the same. In such settings, standard OT variants cannot be employed, and novel estimation techniques are necessary. Since the main challenge is that the conditional distributions are not explicitly available, the key idea in our OT formulation is to employ kernelized-least-squares terms computed over the joint samples, which implicitly match the transport plan's marginals with the empirical conditionals. Under mild conditions, we prove that our estimated transport plans, as a function of the conditioned variable, are asymptotically optimal. For finite samples, we show that the deviation in terms of our regularized objective is bounded by $O(m^{-1/4})$, where $m$ is the number of samples. We also discuss how the conditional transport plan could be modelled using explicit probabilistic models as well as using implicit generative ones. We empirically verify the consistency of our estimator on synthetic datasets, where the optimal plan is analytically known. When employed in applications like prompt learning for few-shot classification and conditional generation in the context of predicting cell responses to treatment, our methodology improves upon state-of-the-art methods.
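The kernelized-least-squares construction sketched in the abstract admits a compact illustration. Below is a minimal NumPy sketch, not the paper's implementation: the RBF kernel, the ridge parameter `lam`, the squared-Euclidean cost, and the helper names (`rbf`, `cme_weights`, `marginal_match`, `regularized_objective`) are all assumptions chosen for illustration. It evaluates, at a fixed value of the conditioned variable `x`, the transport cost of samples from a candidate conditional plan plus two kernelized-least-squares terms that match the plan's conditional marginals against empirical conditional mean embeddings estimated from the joint samples.

```python
# A minimal NumPy sketch of the regularized conditional-OT objective
# described in the abstract. Kernels, the ridge parameter `lam`, and the
# cost function are illustrative assumptions, not the paper's exact choices.
import numpy as np

def rbf(A, B, ls=1.0):
    """RBF Gram matrix k(a, b) = exp(-||a - b||^2 / (2 ls^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def cme_weights(X, x, lam=1e-2, ls=1.0):
    """Kernel-ridge weights alpha(x) such that the empirical conditional
    mean embedding of y | x is sum_i alpha_i * k(Y[i], .)."""
    K = rbf(X, X, ls)
    kx = rbf(X, x[None, :], ls)[:, 0]
    return np.linalg.solve(K + lam * np.eye(len(X)), kx)

def marginal_match(y_plan, Y, alpha, ls=1.0):
    """Squared RKHS distance between the plan's conditional marginal
    (uniform over samples y_plan) and the empirical conditional embedding
    sum_i alpha_i * k(Y[i], .): the kernelized-least-squares term."""
    m = len(y_plan)
    t1 = rbf(y_plan, y_plan, ls).sum() / m ** 2
    t2 = (rbf(y_plan, Y, ls) @ alpha).sum() / m
    t3 = alpha @ rbf(Y, Y, ls) @ alpha
    return t1 - 2 * t2 + t3

def regularized_objective(y_s, y_t, x, Xs, Ys, Xt, Yt, lam_match=10.0):
    """Transport cost of paired plan samples (y_s[l], y_t[l]) drawn from a
    candidate conditional plan at x, plus marginal-matching terms against
    the source and target empirical conditionals."""
    cost = ((y_s - y_t) ** 2).sum(-1).mean()   # squared-Euclidean cost
    a_s = cme_weights(Xs, x)                   # embeds p(y | x) from joint samples
    a_t = cme_weights(Xt, x)                   # embeds q(y' | x) from joint samples
    return (cost
            + lam_match * marginal_match(y_s, Ys, a_s)
            + lam_match * marginal_match(y_t, Yt, a_t))
```

In practice, the plan samples `y_s`, `y_t` would be produced by the explicit probabilistic model or implicit generative model mentioned above, conditioned on `x`, and this objective would be minimized over the model's parameters (e.g., via automatic differentiation); the sketch only evaluates the regularized objective for fixed samples.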