Bispectral Optimal Transport: Theory & Applications

Updated 27 September 2025

Bispectral Optimal Transport is a framework that uses third-order bispectral invariants to align distributions while discounting nuisance rotations.
It transforms data into a group-theoretic Fourier space, enabling robust feature extraction and invariant cost computation.
Empirical evaluations report that Bispectral Optimal Transport achieves over 80% class preservation on rotated benchmark datasets.

Bispectral Optimal Transport (BOT) refers to a class of optimal transport methods that incorporate higher-order signal invariants—specifically the bispectrum—to achieve alignment and comparison of distributions or datasets in a manner robust to underlying symmetries, such as rotations. BOT leverages group-theoretic Fourier invariants to remove nuisance variations induced by group actions, preserving the intrinsic semantic structure of the data. This approach addresses limitations of conventional optimal transport, which often relies on geometric distances between raw features and can misrepresent relationships in symmetry-rich settings.

1. Mathematical Formulation and Principles

Traditional optimal transport (OT) seeks a coupling $\Gamma$ between discrete distributions $\mu = \sum_{i=1}^n p_i \delta_{x^{(i)}}$ and $\nu = \sum_{j=1}^m q_j \delta_{y^{(j)}}$ by minimizing

$OT_c(\mu, \nu) = \min_{\Gamma \geq 0} \langle \Gamma, C \rangle$

subject to $\Gamma \mathbf{1} = p$ and $\Gamma^\top \mathbf{1} = q$, where $C_{ij}$ encodes the cost of moving mass from $x^{(i)}$ to $y^{(j)}$ (e.g., the $\ell_2$ distance).
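
The linear program above can be sketched in a few lines, assuming SciPy is available. The helper name `solve_ot` and the toy inputs are illustrative and do not come from the source; for anything beyond small problems a dedicated OT library would normally be a better fit.

```python
import numpy as np
from scipy.optimize import linprog


def solve_ot(p, q, C):
    """Solve discrete OT as a linear program; returns (plan, total_cost)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    C = np.asarray(C, dtype=float)
    n, m = C.shape
    # The plan is flattened row-major, so entry (i, j) sits at index i * m + j.
    # Row-sum constraints: sum_j Gamma[i, j] = p[i]
    A_rows = np.kron(np.eye(n), np.ones((1, m)))
    # Column-sum constraints: sum_i Gamma[i, j] = q[j]
    A_cols = np.kron(np.ones((1, n)), np.eye(m))
    res = linprog(
        C.ravel(),
        A_eq=np.vstack([A_rows, A_cols]),
        b_eq=np.concatenate([p, q]),
        bounds=(0, None),  # Gamma >= 0
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x.reshape(n, m), res.fun


# Toy example: two points on each side, zero cost on the diagonal.
p = np.array([0.5, 0.5])
q = np.array([0.5, 0.5])
C = np.array([[0.0, 1.0], [1.0, 0.0]])
plan, cost = solve_ot(p, q, C)
```

In this toy case all mass stays on the diagonal, since moving it off-diagonal costs 1 per unit.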

Bispectral Optimal Transport modifies this by computing costs in a feature space derived from the bispectrum, a quantity that is invariant under a specified group action. The bispectrum, originating from group Fourier analysis, is a third-order statistic that, for a signal $f$, takes the form

$B_{i,j} = \hat{f}_i \cdot \hat{f}_j \cdot (\hat{f}_{i+j})^*$

where $\hat{f}_i$ denotes the $i$-th Fourier coefficient and $(\cdot)^*$ denotes complex conjugation. For a discrete periodic signal, the index $i+j$ is taken modulo the signal length.
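
A small numerical check of this formula, assuming a real-valued periodic signal sampled at $N$ points. The function name `bispectrum_1d` and the random test signal are illustrative, not from the source.

```python
import numpy as np


def bispectrum_1d(f):
    """Return B[i, j] = F[i] * F[j] * conj(F[(i + j) mod N]) for a 1D signal."""
    F = np.fft.fft(np.asarray(f, dtype=float))
    idx = np.arange(F.size)
    sum_idx = (idx[:, None] + idx[None, :]) % F.size
    return F[:, None] * F[None, :] * np.conj(F[sum_idx])


rng = np.random.default_rng(0)
f = rng.normal(size=16)
shifted = np.roll(f, 5)  # a cyclic shift plays the role of a discrete rotation

# The bispectrum is unchanged by the shift ...
bispectrum_invariant = np.allclose(bispectrum_1d(f), bispectrum_1d(shifted))
# ... whereas the raw Fourier coefficients pick up shift-dependent phases.
spectrum_invariant = np.allclose(np.fft.fft(f), np.fft.fft(shifted))
```

The full $N \times N$ array is redundant for real signals; a practical pipeline would likely keep only a subset of index pairs.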

Generalizing to group-based data (e.g., under rotations), the group Fourier transform is

$\hat{f}_\rho = \sum_{g \in G} f(g) \rho(g)$

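with $\rho$ running over irreducible representations of $G$. For commutative $G$, where the product $\rho_i\rho_j$ is again an irreducible representation, the bispectrum is then

$B_{\rho_i,\rho_j} = \hat{f}_{\rho_i} \cdot \hat{f}_{\rho_j} \cdot \hat{f}_{\rho_i\rho_j}^\dagger$

where $(\cdot)^\dagger$ denotes the conjugate transpose, matching the conjugate in the scalar formula above. This quantity is invariant to the group action and, for generic signals (those with nonvanishing Fourier coefficients), acts as a complete invariant.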
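
As a quick check of the invariance claim in the simplest commutative case, take the cyclic group $\mathbb{Z}_N$ with $\rho_k(g) = e^{-2\pi i k g / N}$, so that the bispectrum reduces to the scalar form $B_{i,j} = \hat{f}_i \cdot \hat{f}_j \cdot (\hat{f}_{i+j})^*$. The shift $t$ and the notation $f^t$ are introduced here only for illustration. Writing $f^t(g) = f(g - t)$ for the signal shifted by $t$,

$\hat{f^t}_k = \sum_{g} f(g - t)\, e^{-2\pi i k g / N} = e^{-2\pi i k t / N}\, \hat{f}_k$

so each Fourier coefficient only picks up a phase. In the triple product these phases cancel:

$B^t_{i,j} = e^{-2\pi i (i + j - (i + j)) t / N}\, \hat{f}_i \cdot \hat{f}_j \cdot (\hat{f}_{i+j})^* = B_{i,j}$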

In BOT, the pairwise cost matrix $C$ is computed from distances in the bispectral feature space, yielding a transport plan that is invariant to the underlying group symmetries.

2. Symmetry-Awareness via the Bispectrum

Symmetry-awareness in BOT arises because the bispectrum, unlike the power spectrum, retains all signal structure except the phase changes induced by the group action. For data subject to planar rotations, i.e., the group $SO(2)$, resampling onto a polar grid turns a rotation into a cyclic shift along the angular axis. A per-radius 1D Fourier transform converts that shift into phase factors, which the bispectrum cancels, yielding representations in which semantic structure (such as object identity in images) is preserved and rotation is factored out.
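
The contrast with the power spectrum can be illustrated on a toy pair of signals. The signals `f` and `g` below are made up for this sketch: they share the same Fourier magnitudes but differ in the relative phase of their two components, so `g` is not a cyclic shift of `f`.

```python
import numpy as np

N = 64
x = 2 * np.pi * np.arange(N) / N
f = np.cos(x) + np.cos(2 * x)
g = np.cos(x) + np.cos(2 * x + 1.0)  # same magnitudes, different relative phase

F, G = np.fft.fft(f), np.fft.fft(g)

# The power spectrum |F_k|^2 discards phase, so it cannot tell f and g apart.
same_power = np.allclose(np.abs(F) ** 2, np.abs(G) ** 2)

# One bispectrum entry, B_{1,1} = F_1 * F_1 * conj(F_2), keeps the relative phase.
b_f = F[1] * F[1] * np.conj(F[2])
b_g = G[1] * G[1] * np.conj(G[2])
same_bispectrum = bool(np.isclose(b_f, b_g))
```

Here the power spectra coincide while the bispectrum entries differ, which is the sense in which the bispectrum keeps structure that second-order statistics lose.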

BOT embeddings typically proceed by:

Transforming each data object (e.g., image) to polar coordinates.
Computing the 1D discrete Fourier transform along the angular axis at each radius.
Calculating the bispectrum across all radii, yielding a global SO(2)-invariant feature.
Defining the cost matrix $C$ via a chosen distance (e.g., $\ell_1$, $\ell_2$, cosine) on bispectral features.
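
The first three steps can be sketched as follows, assuming square grayscale images and SciPy's `map_coordinates` for the polar resampling. The function names, the grid sizes, the bilinear interpolation, and the truncation to low angular frequencies (`n_freq`) are all choices made for this illustration rather than details taken from the source.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def to_polar(img, n_radii=8, n_angles=32):
    """Resample a square image on a polar grid centred on the image centre."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radii = np.linspace(1.0, min(cy, cx), n_radii)
    angles = 2 * np.pi * np.arange(n_angles) / n_angles
    rows = cy + radii[:, None] * np.sin(angles)[None, :]
    cols = cx + radii[:, None] * np.cos(angles)[None, :]
    # order=1 is bilinear interpolation; output has shape (n_radii, n_angles)
    return map_coordinates(img.astype(float), [rows, cols], order=1, mode="nearest")


def bispectral_features(img, n_radii=8, n_angles=32, n_freq=6):
    """Per-radius angular bispectrum, concatenated into one real feature vector."""
    polar = to_polar(img, n_radii, n_angles)
    # DFT along the angular axis at each radius; keep the lowest n_freq frequencies.
    F = np.fft.fft(polar, axis=1)[:, :n_freq]
    i, j = np.meshgrid(np.arange(n_freq), np.arange(n_freq), indexing="ij")
    keep = (i + j) < n_freq  # only pairs whose sum frequency was retained
    i, j = i[keep], j[keep]
    B = F[:, i] * F[:, j] * np.conj(F[:, i + j])  # shape (n_radii, n_pairs)
    # Split complex values into real and imaginary parts for use with real norms.
    return np.concatenate([B.real.ravel(), B.imag.ravel()])


rng = np.random.default_rng(0)
img = rng.random((28, 28))
feat = bispectral_features(img)
feat_rot = bispectral_features(np.rot90(img))  # 90-degree rotation of the same image
```

With `n_angles` divisible by 4, a 90-degree rotation is an exact cyclic shift of the polar grid, so `feat` and `feat_rot` agree up to floating-point error. For arbitrary angles the invariance holds only approximately, because of interpolation on the pixel grid.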

By adopting this invariance, BOT transport plans substantially improve the meaningfulness of dataset correspondences, aligning objects by class and structure rather than by superficial geometric transformation.

3. Algorithmic Workflow and Implementation

BOT operates as follows:

Step	Description	Output
1	Polar grid discretization	Angles $\times$ radii array
2	DFT per radius (angular)	Fourier coefficients $\hat{f}_r(k)$ at angular frequency $k$
3	Bispectrum calculation	Invariant $B_{i,j}^r$ per radius
4	Concatenate per radius	SO(2)-invariant global feature vector

After both datasets are embedded into bispectral feature space, the cost matrix $C$ is computed from pairwise distances between the embedded points and a standard linear-programming OT solver is applied, producing the optimal plan $\Gamma^*$ and the induced correspondences between the two sets.

Selection of the norm for $C$ may vary with application. The $\ell_1$ distance in bispectral space has demonstrated strong empirical performance, notably in rotation-rich image matching scenarios.
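
The norms mentioned for $C$ correspond to standard metric names in SciPy's `cdist` (`"cityblock"` for $\ell_1$, `"euclidean"` for $\ell_2$, `"cosine"` for cosine distance). The wrapper name `bispectral_cost` and the toy feature vectors are illustrative only; in practice `X` and `Y` would hold one bispectral feature vector per row.

```python
import numpy as np
from scipy.spatial.distance import cdist


def bispectral_cost(X, Y, metric="cityblock"):
    """Pairwise cost matrix C[i, j] between rows of X and rows of Y."""
    return cdist(np.asarray(X, dtype=float), np.asarray(Y, dtype=float), metric=metric)


# Toy feature vectors standing in for bispectral embeddings.
X = np.array([[1.0, 0.0], [0.0, 2.0]])
Y = np.array([[1.0, 0.0], [1.0, 1.0]])

C_l1 = bispectral_cost(X, Y, "cityblock")
C_l2 = bispectral_cost(X, Y, "euclidean")
C_cos = bispectral_cost(X, Y, "cosine")
```

Cosine distance ignores the overall scale of a feature vector, which may matter because bispectral features scale cubically with signal amplitude; it is undefined for all-zero vectors.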

4. Empirical Evaluation and Performance Metrics

In evaluations conducted on benchmark datasets (MNIST, Fashion-MNIST, Kuzushiji-MNIST, EMNIST), each set was split, with one half subjected to random rotations. Conventional OT using raw pixel distances yielded transport plans strongly confounded by rotation, resulting in low class preservation accuracy (e.g., $33\%$ on MNIST). BOT, using bispectral feature distance, achieved class preservation rates exceeding $80\%$ with $\ell_1$ cost, indicating robust alignment by semantic class rather than transformation.
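
The source does not spell out how class preservation is computed. One plausible reading, assumed here, is the fraction of source samples whose heaviest match under the transport plan carries the same label. The function name and toy plans below are illustrative.

```python
import numpy as np


def class_preservation(plan, src_labels, tgt_labels):
    """Fraction of source points whose heaviest-weighted target shares their label."""
    plan = np.asarray(plan, dtype=float)
    # For each source row, pick the target column receiving the most mass.
    matched = np.asarray(tgt_labels)[plan.argmax(axis=1)]
    return float(np.mean(matched == np.asarray(src_labels)))


src_labels = [0, 1]
tgt_labels = [0, 1]

good_plan = np.array([[0.4, 0.1], [0.1, 0.4]])  # mass mostly stays within class
poor_plan = np.array([[0.1, 0.4], [0.1, 0.4]])  # both sources map to class 1

acc_good = class_preservation(good_plan, src_labels, tgt_labels)
acc_poor = class_preservation(poor_plan, src_labels, tgt_labels)
```

A mass-weighted variant (summing the plan entries that connect equal labels) would be an equally reasonable definition and can give different numbers when the plan is diffuse.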

Multidimensional scaling (MDS) visualizations indicated that bispectral representations naturally cluster by class and not by rotation, confirming the theoretical invariance properties and practical robustness.
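
For readers who want to reproduce this kind of plot, a minimal sketch of classical (Torgerson) MDS is given below. The source does not say which MDS variant was used, so this choice is an assumption, as are the function name and the synthetic test points.

```python
import numpy as np


def classical_mds(D, n_components=2):
    """Embed points in n_components dimensions from a pairwise distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    G = -0.5 * J @ (D ** 2) @ J           # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(G)        # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by round-off or non-Euclidean input.
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))


# Synthetic check: distances between planar points are reproduced by the embedding.
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
Z = classical_mds(D)
D_embedded = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
```

Applied to a matrix of pairwise bispectral distances, the two columns of the returned array give the 2D coordinates to scatter-plot and colour by class or by rotation angle.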

5. Connections to Matrix-Valued and Generalized Transport

Matrix-valued extensions of optimal transport, such as the Kantorovich–Bures distance (Brenier et al., 2018), generalize scalar mass movement to matrix-valued measures, blending spatial transport with changes in internal spectral (eigenvalue) structure. The dynamical variational formulation for two matrix measures $G_0, G_1$ is

$d_{KB}^2(G_0, G_1) = \inf_{(G_t, u_t, U_t)} \int_0^1 \left(\int dG_t\, u_t \cdot u_t + \int dG_t\, U_t : U_t \right) dt$

subject to

$\partial_t G_t = \{-\nabla(G_t u_t) + G_t U_t\}^{Sym}$

The interplay of transport and reaction terms mirrors the “bispectral” paradigm where both spatial and spectral aspects are fused. The conic geometry, explicit geodesics, and formal Otto calculus on the matrix-valued measure space furnish theoretical underpinnings applicable to bispectral OT scenarios, especially when internal structure (matrix eigenvalues, higher-order statistics) forms a critical aspect of the signal.

6. Applications and Extensions

Bispectral Optimal Transport exhibits significant utility in applications requiring robust correspondence under symmetry:

Machine Learning and Domain Adaptation: Enabling transfer learning and dataset alignment where nuisance transformations (rotations, permutations) would otherwise obscure class relationships.
Computer Vision and Graphics: Image registration and matching under affine or rotational transformations benefit from transport plans preserving semantic content.
Generative Modeling and Imitation Learning: Ensuring that distributions of learned representations correspond under the essential, non-nuisance variations, enhancing interpretability and functional reliability.
Signal Processing and Pattern Recognition: Bispectral analysis naturally captures nonlinear phase coupling missed by second-order (spectral) methods. BOT allows robust metric comparison and feature extraction based on these high-order statistics (Martín et al., 19 Jun 2024).

A plausible implication is that future extensions may incorporate more general group actions, non-commutative bispectral invariants, and matrix-valued cost functionals for further robustness and flexibility.

7. Future Directions and Open Challenges

Potential research avenues highlighted in BOT literature (Ma et al., 25 Sep 2025) include:

Scalability to Large and Complex Symmetry Groups: Extending bispectral invariance beyond rotation (SO(2)) to higher dimensions, non-Abelian groups, and non-gridded data.
Sophisticated Metrics in Bispectral Space: Development of distance measures that exploit the algebraic and geometric structure of bispectral features, moving beyond standard norms.
Matrix-Valued and Multimodal Transport: Integration with approaches such as Kantorovich–Bures, coupling both spatial transport and internal spectral transport, for tensor fields and multi-modal data.

This suggests that ongoing research will seek a synthesis of the mathematical theory of optimal transport, group invariants, and practical embedding pipelines, pushing the boundaries of dataset comparison in symmetry-rich and multi-structured domains.

References

Brenier et al. (2018). On optimal transport of matrix-valued measures.

Martín et al. (2024). Data representation with optimal transport.

Ma et al. (2025). Bispectral OT: Dataset Comparison using Symmetry-Aware Optimal Transport.
