Structure-Preserving Multi-View Embedding Using Gromov-Wasserstein Optimal Transport

Published 3 Apr 2026 in stat.ML and cs.LG | (2604.02610v1)

Abstract: Multi-view data analysis seeks to integrate multiple representations of the same samples in order to recover a coherent low-dimensional structure. Classical approaches often rely on feature concatenation or explicit alignment assumptions, which become restrictive under heterogeneous geometries or nonlinear distortions. In this work, we propose two geometry-aware multi-view embedding strategies grounded in Gromov-Wasserstein (GW) optimal transport. The first, termed Mean-GWMDS, aggregates view-specific relational information by averaging distance matrices and applying GW-based multidimensional scaling to obtain a representative embedding. The second strategy, referred to as Multi-GWMDS, adopts a selection-based paradigm in which multiple geometry-consistent candidate embeddings are generated via GW-based alignment and a representative embedding is selected. Experiments on synthetic manifolds and real-world datasets show that the proposed methods effectively preserve intrinsic relational structure across views. These results highlight GW-based approaches as a flexible and principled framework for multi-view representation learning.

Authors (3)

Summary

The paper introduces two GW optimal transport-based methods—Mean-GWMDS and Multi-GWMDS—for geometric multi-view dimensionality reduction.
The methods use intrinsic pairwise distance matrices to align heterogeneous views and preserve nonlinear manifold structure, outperforming baseline methods.
Empirical results on synthetic manifolds and electricity load data support their robustness to nonlinearity and noise.

Introduction

The paper "Structure-Preserving Multi-View Embedding Using Gromov-Wasserstein Optimal Transport" (2604.02610) presents a geometric approach to the multi-view dimensionality reduction (DR) problem. The core motivation is the recovery of a faithful, low-dimensional embedding of data sampled from multiple heterogeneous views, with a focus on preserving the intrinsic relational (geometric) structures across these views. The proposed methodology leverages the Gromov-Wasserstein (GW) optimal transport (OT) formulation, which allows comparison and alignment of distributions supported on distinct metric spaces—a critical property for multi-view and manifold data subject to complex, nonlinear distortions.

Gromov-Wasserstein Optimal Transport for Multi-View Embedding

The GW distance provides a formal metric to assess the discrepancy between metric measure spaces based solely on their intrinsic distance matrices. Unlike classical OT, GW-OT does not assume a shared ambient space between distributions. This independence from coordinate systems and dimensions is essential for multi-view settings where features or modalities may be fundamentally incompatible.
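
For reference, the discrete squared-loss form of the GW objective commonly used in the literature can be written as below. The symbols are assumptions introduced here and may differ from the paper's notation: $C^{X} \in \mathbb{R}^{n \times n}$ and $C^{Y} \in \mathbb{R}^{m \times m}$ are the intrinsic distance matrices of the two spaces, $p$ and $q$ are their probability weight vectors, and $T$ is a coupling between them.

$$
\mathrm{GW}(C^{X}, C^{Y}, p, q) \;=\; \min_{T \in \Pi(p, q)} \; \sum_{i,k=1}^{n} \sum_{j,l=1}^{m} \left| C^{X}_{ik} - C^{Y}_{jl} \right|^{2} T_{ij}\, T_{kl},
\qquad
\Pi(p, q) = \left\{ T \in \mathbb{R}_{\ge 0}^{n \times m} : T\mathbf{1}_m = p,\; T^{\top}\mathbf{1}_n = q \right\}.
$$

Only within-space distances $C^{X}_{ik}$ and $C^{Y}_{jl}$ enter the cost, which is why no shared ambient space is needed.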

The paper extends GW-based multidimensional scaling (GW-MDS), which optimizes over low-dimensional representations to preserve intrinsic pairwise relations as measured by the GW distance, to the multi-view scenario. Two complementary strategies are proposed:

Mean-GWMDS: Directly averages view-specific distance matrices and applies GW-MDS, yielding an efficient and robust consensus embedding.
Multi-GWMDS: Generates multiple candidate embeddings via GW alignments and selects the most representative embedding according to cross-view consistency measures, capturing geometric commonality robustly in the presence of heterogeneous or noisy views.

Methodological Framework

Mean-GWMDS provides a computationally pragmatic baseline, requiring only a single GW-MDS run on a mean distance matrix. While this method is efficient and generally stable when the geometric distortions across views are not severe, it risks amplifying biases from highly corrupted views.
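
The averaging step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: classical (Torgerson) MDS is swapped in for the GW-MDS solver, which is not reproduced here, and all function names and the toy data are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS; a stand-in here, NOT the paper's GW-MDS solver."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)          # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dim]         # indices of the largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

def mean_distance_embedding(distance_matrices, dim=2):
    """Average the view-specific distance matrices, then embed the mean once."""
    D_mean = np.mean(np.stack(distance_matrices), axis=0)
    return D_mean, classical_mds(D_mean, dim=dim)

# Toy data (assumption): two views of the same 30 points, related by a rigid motion.
rng = np.random.default_rng(0)
base = rng.normal(size=(30, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
views = [base, base @ R.T + 5.0]                  # second view: rotated and shifted
D_views = [pairwise_distances(V) for V in views]
D_mean, Y = mean_distance_embedding(D_views, dim=2)
```

Because the two toy views differ only by a rigid motion, their distance matrices coincide and the mean matrix equals either one. With a corrupted view, its distortions would enter the mean directly, which is the bias risk noted above.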

Multi-GWMDS takes a more nuanced, selection-based approach. Each view is embedded using GW-MDS, yielding a set of candidate geometric representations. Cross-view consistency is then scored—typically using distance matrix correlation—and a representative embedding is chosen to maximize ensemble agreement. This approach is naturally robust to heterogeneity, outliers, and non-uniform noise across views.
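
The selection step can be sketched as follows. The scoring rule here is an assumption: each candidate is scored by the mean Pearson correlation between its pairwise distances and those of every view, which is one plausible reading of "distance matrix correlation". The function names are hypothetical, and the candidates are the raw views themselves, standing in for per-view GW-MDS embeddings.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_correlation_score(D_a, D_b):
    """Pearson correlation between the upper-triangular entries of two distance matrices."""
    iu = np.triu_indices_from(D_a, k=1)
    return np.corrcoef(D_a[iu], D_b[iu])[0, 1]

def select_representative(candidates, view_distances):
    """Return the index of the candidate that agrees best, on average, with all views."""
    scores = []
    for Y in candidates:
        D_Y = pairwise_distances(Y)
        per_view = [distance_correlation_score(D_Y, D_v) for D_v in view_distances]
        scores.append(np.mean(per_view))
    scores = np.array(scores)
    return int(np.argmax(scores)), scores

# Toy data (assumption): two clean views and one heavily corrupted view.
rng = np.random.default_rng(1)
base = rng.normal(size=(40, 2))
theta = 1.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
views = [base, base @ R.T, base + rng.normal(scale=2.0, size=base.shape)]
view_distances = [pairwise_distances(V) for V in views]

candidates = views  # placeholder for per-view GW-MDS embeddings
best, scores = select_representative(candidates, view_distances)
```

In this toy setup the corrupted third view scores lowest, so it is never selected. No per-view weights need to be tuned for this to happen.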

Figure 1: Schematic illustration of the Multi-GWMDS selection-based strategy for multi-view embedding construction.

Empirical Evaluation: Synthetic Manifolds

A comprehensive set of experiments on classical synthetic geometric manifolds (S-curve, Swiss Roll, Möbius strip, Torus) was conducted. Data for each manifold was transformed into two distinct views via nontrivial rigid and affine transformations, introducing significant geometric disparity.
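
A two-view construction of this kind can be sketched as follows. The specific transformations used in the paper are not given here, so the rotation, the translation, and the affine matrix `A` below are illustrative assumptions, as are the function names. The Swiss roll parametrization is the commonly used one.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def make_swiss_roll(n, rng):
    """Sample a Swiss roll; returns 3-D points and their intrinsic (t, h) coordinates."""
    t = 1.5 * np.pi * (1.0 + 2.0 * rng.random(n))
    h = 21.0 * rng.random(n)
    X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
    return X, np.column_stack([t, h])

def random_rotation(dim, rng):
    """Random orthogonal matrix with determinant +1, built from a QR factorization."""
    Q, R = np.linalg.qr(rng.normal(size=(dim, dim)))
    Q = Q * np.sign(np.diag(R))       # normalize column signs
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]            # flip one axis to obtain a proper rotation
    return Q

rng = np.random.default_rng(0)
X, coords = make_swiss_roll(200, rng)

# View 1 (assumption): rigid motion, i.e. rotation plus translation.
Q = random_rotation(3, rng)
view_rigid = X @ Q.T + rng.normal(size=3)

# View 2 (assumption): affine map with anisotropic scaling and shear.
A = np.array([[1.5, 0.3, 0.0],
              [0.0, 0.7, 0.2],
              [0.1, 0.0, 1.2]])
view_affine = X @ A.T + 1.0
```

The rigid view keeps all pairwise distances intact, whereas the affine view changes them. This makes the second view the harder one to reconcile when working from distance matrices alone.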

When using Euclidean distances, Mean-GWMDS consistently produced embeddings with the highest average correlation to ground-truth manifold coordinates, while Multi-GWMDS performed competitively and excelled at capturing view-specific relational fidelity. Both GW-based methods outperformed the canonical correlation analysis (CCA), multiset CCA (MCCA), and multi-view MDS (MVMDS) baselines, especially in reconciling view heterogeneity and nonlinear structure.

Switching to geodesic distances, approximated using $k$-nearest-neighbor graphs, further amplified the advantages of the GW-based approaches. Multi-GWMDS achieved the best preservation of nonlinear manifold topology among the compared methods across all tested settings.
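
A graph-based geodesic approximation can be sketched as follows, in the style of Isomap. This is a minimal illustration, not the paper's code: the function names and the default value of `k` are assumptions, a dense Floyd-Warshall pass is used for brevity, and the data are assumed to contain no duplicate points.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def knn_geodesic_distances(X, k=8):
    """Approximate geodesic distances by shortest paths on a symmetrized k-NN graph."""
    D = pairwise_distances(X)
    n = D.shape[0]
    G = np.full((n, n), np.inf)                  # inf marks "no edge"
    np.fill_diagonal(G, 0.0)
    nn = np.argsort(D, axis=1)[:, 1:k + 1]       # skip column 0 (the point itself)
    rows = np.repeat(np.arange(n), k)
    G[rows, nn.ravel()] = D[rows, nn.ravel()]    # edge weights are Euclidean lengths
    G = np.minimum(G, G.T)                       # make the graph undirected
    for m in range(n):                           # Floyd-Warshall, one pivot per pass
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    return G

# Toy data (assumption): 60 points on three quarters of a unit circle.
angles = np.linspace(0.0, 1.5 * np.pi, 60)
arc = np.column_stack([np.cos(angles), np.sin(angles)])
D_euclid = pairwise_distances(arc)
D_geo = knn_geodesic_distances(arc, k=2)
```

On this arc the two endpoints are close in the plane but far apart along the curve, so the graph distance between them approaches the arc length rather than the chord length. Entries that remain infinite would indicate a disconnected graph, in which case `k` should be increased.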

Figure 2: Multi-GWMDS visualization on synthetic manifolds, demonstrating superior structure preservation under complex distortions.

Real-World Analysis: Electricity Load Diagrams

The methodology was validated on the Electricity Load Diagrams (ELD) dataset, representing multi-day electricity consumption across 370 customers (with each day as a separate view). Geodesic distance matrices were computed for each day's data, capturing routine and anomalous consumption patterns.

Mean-GWMDS produced embeddings reflecting aggregated, consensus customer behavior, yielding compact low-dimensional clusters. In contrast, Multi-GWMDS revealed subtler, day-specific patterns by selecting embeddings maximally consistent with view-wise structure, exposing inherent temporal heterogeneity.

Figure 3: Low-dimensional embeddings of the Electricity Load Diagrams (ELD) dataset obtained with Multi-GWMDS using geodesic distances. Each panel shows an embedding from a different view.

The reported correlation tables indicated that Mean-GWMDS performed consistently well across views, reflecting the average structure shared among them, while Multi-GWMDS was strongest at detecting meaningful, view-specific patterns and, on average, more robust to structural noise.

Figure 4: Representative embedding of the ELD dataset with Multi-GWMDS showing the geometrically consistent customer clusters.

Discussion and Theoretical Implications

A key finding is that explicit modeling of view-dependent relational alignments via GW optimal transport yields embeddings that are highly robust to both nonlinear geometric deformations and noise. This is particularly notable under geodesic dissimilarities, where Multi-GWMDS and Mean-GWMDS both outperform established multi-view manifold learning methods (e.g., Multi-Isomap [rodosthenous2024multi]).

Strong numerical results are reported on both synthetic and real data, indicating that GW-based embedding approaches can integrate heterogeneous, nonlinearly related representations and go beyond classical feature-based or correlation-based fusion strategies.

The selection-based Multi-GWMDS approach is particularly well suited to scenarios with variable view quality, as it naturally discounts outlier or low-fidelity views without requiring complex weight tuning.

Future Directions and Broader Impacts

The theoretical and practical contributions of this work open multiple avenues for further research:

Adaptive weighting: Learning the view-weight vector $\lambda_v$ to explicitly privilege views according to downstream criteria or supervision.
Mixed dissimilarity integration: Combining heterogeneous dissimilarity measures (e.g., Euclidean and geodesic) for richer geometric modeling.
Semi-relaxed barycenters and clustering: Exploiting semi-relaxed GW divergences (see [clark2024generalized]) for prototype learning and structure discovery in multi-view clustering.
Application to graphs and complex relational data: Generalizing the framework to integrate relational data structures beyond classical numerical feature vectors, e.g., graphs, hypergraphs, or multimodal biomedical data.
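
For the adaptive-weighting direction listed above, one natural reading is a weighted generalization of the averaged distance matrix. This formula is an assumption for illustration and is not taken from the paper; $V$ denotes the number of views and $D^{(v)}$ the distance matrix of view $v$.

$$
\bar{D}_{\lambda} \;=\; \sum_{v=1}^{V} \lambda_v \, D^{(v)},
\qquad \lambda_v \ge 0, \quad \sum_{v=1}^{V} \lambda_v = 1 .
$$

Under this reading, uniform weights $\lambda_v = 1/V$ recover the plain average used by Mean-GWMDS, and learning $\lambda$ would let low-fidelity views be down-weighted.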

Methodologically, the approach provides a foundation for robust DR and representation learning on the complex, multi-source datasets increasingly encountered in imaging, bioinformatics, and sensor fusion.

Conclusion

This paper introduces two principled, geometry-aware approaches to multi-view dimensionality reduction via the Gromov-Wasserstein optimal transport framework. Both approaches—Mean-GWMDS and Multi-GWMDS—demonstrate strong performance in preserving intrinsic relational structure across heterogeneous views and outperform correlation- and Euclidean-based baselines, particularly under nonlinear and noisy conditions. The work situates GW-OT as a robust and theoretically sound tool for modern multi-view and manifold learning, with broad implications for future advances in structure-preserving representation learning.
