Cross-modality Matching and Prediction of Perturbation Responses with Labeled Gromov-Wasserstein Optimal Transport (2405.00838v3)
Abstract: It is now possible to conduct large-scale perturbation screens with complex readout modalities, such as different molecular profiles or high-content cell images. While these screens open the way for systematic dissection of causal cell circuits, integrating such data across screens to maximize our ability to predict circuits poses substantial computational challenges that have not yet been addressed. Here, we extend two Gromov-Wasserstein Optimal Transport methods to incorporate the perturbation label for cross-modality alignment. The resulting alignment is then used to train a predictive model that estimates cellular responses to perturbations observed with only one measurement modality. We validate our method on the tasks of cross-modality alignment and cross-modality prediction in a recent multi-modal single-cell perturbation dataset. Our approach opens the way to unified causal models of cell biology.