
Graph Integration for Diffusion-Based Manifold Alignment (2410.22978v1)

Published 30 Oct 2024 in stat.ML and cs.LG

Abstract: Data from individual observations can originate from various sources or modalities but are often intrinsically linked. Multimodal data integration can enrich information content compared to single-source data. Manifold alignment is a form of data integration that seeks a shared, underlying low-dimensional representation of multiple data sources that emphasizes similarities between alternative representations of the same entities. Semi-supervised manifold alignment relies on partially known correspondences between domains, either through shared features or through other known associations. In this paper, we introduce two semi-supervised manifold alignment methods. The first method, Shortest Paths on the Union of Domains (SPUD), forms a unified graph structure using known correspondences to establish graph edges. By learning inter-domain geodesic distances, SPUD creates a global, multi-domain structure. The second method, MASH (Manifold Alignment via Stochastic Hopping), learns local geometry within each domain and forms a joint diffusion operator using known correspondences to iteratively learn new inter-domain correspondences through a random-walk approach. Through the diffusion process, MASH forms a coupling matrix that links heterogeneous domains into a unified structure. We compare SPUD and MASH with existing semi-supervised manifold alignment methods and show that they outperform competing methods in aligning true correspondences and cross-domain classification. In addition, we show how these methods can be applied to transfer label information between domains.
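The unified-graph idea behind SPUD can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each domain is turned into a k-nearest-neighbor graph with Euclidean edge weights, that known correspondences (anchors) become cross-domain edges, and that inter-domain geodesic distances are read off as shortest paths. The function names, the `k` parameter, and the `anchor_weight` parameter are hypothetical choices made for illustration, and the sketch omits any distance normalization between domains.

```python
import heapq
import math


def knn_edges(points, k):
    """Return k-nearest-neighbor edges (i, j, distance) within one domain."""
    edges = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        edges.extend((i, j, d) for d, j in dists[:k])
    return edges


def union_graph(domain_a, domain_b, anchors, k=2, anchor_weight=0.0):
    """Build one undirected graph over both domains.

    Nodes 0..len(domain_a)-1 belong to domain A; domain B nodes are offset
    by len(domain_a). `anchors` is a list of (index_in_a, index_in_b) pairs
    of known correspondences. `anchor_weight` is an assumed edge cost.
    """
    n_a = len(domain_a)
    graph = {v: {} for v in range(n_a + len(domain_b))}

    def add_edge(u, v, w):
        # Keep the smaller weight if the edge already exists.
        w = min(w, graph[u].get(v, math.inf))
        graph[u][v] = w
        graph[v][u] = w

    for i, j, d in knn_edges(domain_a, k):
        add_edge(i, j, d)
    for i, j, d in knn_edges(domain_b, k):
        add_edge(n_a + i, n_a + j, d)
    for i, j in anchors:
        add_edge(i, n_a + j, anchor_weight)
    return graph


def shortest_paths(graph, source):
    """Dijkstra's algorithm: geodesic distance from `source` to every node."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Two toy domains with different feature spaces and one known correspondence.
domain_a = [(0.0,), (1.0,), (2.0,), (3.0,)]
domain_b = [(0.0, 0.0), (0.0, 10.0), (0.0, 20.0), (0.0, 30.0)]
graph = union_graph(domain_a, domain_b, anchors=[(0, 0)], k=2)
cross_domain = shortest_paths(graph, source=3)  # distances from the last A point
print(cross_domain[len(domain_a) + 3])  # distance to the last B point
```

Because the two domains never share a feature space in this sketch, every cross-domain distance has to pass through an anchor edge, which is why the number and placement of known correspondences matter for the resulting joint structure.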
