Relation-Aware Slicing Distribution (RASD)

Updated 4 July 2026

RASD is a slicing distribution on the unit sphere that refines traditional uniform sampling for sliced Gromov–Wasserstein, enhancing cross-domain alignment.
It constructs relation-aware projecting directions from intra-relational paths and bisector-type directions, enabling efficient, optimization-free sampling.
RASD offers improved computational efficiency and statistical guarantees by tuning the concentration parameter to balance uniform and concentrated slicing.

Relation-Aware Slicing Distribution (RASD) is a slicing distribution on the unit sphere introduced for sliced Gromov–Wasserstein computations in cross-domain alignment. It is defined by sampling two pairs of random vectors from the ambient laws, constructing relation-aware projecting directions that encode pairwise associations across the two domains, and then drawing directions from a spherical location-scale law centered at those sampled directions. In this formulation, RASD replaces the uniform directional sampling used in Sliced Gromov–Wasserstein (SGW) and is designed to reduce the contribution of uninformative projections without introducing an inner optimization problem (Sarkar et al., 17 Jul 2025).

1. Cross-domain alignment setting

RASD is formulated in the setting of comparing probability measures $\mu$ and $\nu$ that may live in different metric spaces $(\mathcal X,c_X)$ and $(\mathcal Y,c_Y)$. In that setting, the relevant baseline is the Gromov–Wasserstein distance, which compares relational structure rather than absolute coordinates. With

$\Pi(\mu,\nu)=\{\pi\in\mathcal P(\mathcal X\times\mathcal Y):\pi(\cdot\times\mathcal Y)=\mu,\ \pi(\mathcal X\times\cdot)=\nu\},$

and

$c_p(x,x',y,y')=|c_X(x,x')-c_Y(y,y')|^p,$

the $p$-GW distance is

$\text{GW}_p^p(\mu,\nu)=\inf_{\pi\in\Pi(\mu,\nu)}\mathbb E_{\pi\otimes\pi}[c_p(x,x',y,y')].$

The empirical GW problem is a quadratic assignment problem and a non-convex quadratic program, so sliced relaxations are introduced to reduce the computational burden (Sarkar et al., 17 Jul 2025).
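To make the objective concrete, the following minimal sketch evaluates the GW cost of a fixed coupling between two small discrete measures. The function name `gw_objective` and the toy data are illustrative assumptions, not code from the paper, and the dense four-index tensor is only practical for very small supports.

```python
import numpy as np

def gw_objective(CX, CY, pi, p=2):
    """Evaluate E_{pi x pi} |c_X(x,x') - c_Y(y,y')|^p for discrete measures.

    CX: (n, n) intra-domain costs, CY: (m, m) intra-domain costs,
    pi: (n, m) coupling whose marginals are the two measures.
    """
    # diff[i, j, k, l] = |CX[i, k] - CY[j, l]|^p
    diff = np.abs(CX[:, None, :, None] - CY[None, :, None, :]) ** p
    return float(np.einsum("ijkl,ij,kl->", diff, pi, pi))

# Toy example: four points on a line compared with a copy of themselves.
x = np.array([0.0, 1.0, 3.0, 6.0])
CX = np.abs(x[:, None] - x[None, :])
CY = CX.copy()                          # same pairwise distances
identity = np.eye(4) / 4                # matches point i to point i
independent = np.full((4, 4), 1 / 16)   # product coupling

print(gw_objective(CX, CY, identity))
print(gw_objective(CX, CY, independent))
```

The identity coupling preserves every pairwise distance and so attains zero cost, while the product coupling does not. Minimizing over all couplings is the hard part that sliced relaxations avoid.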

In SGW, measures are projected to one dimension along directions $\theta\in\mathbb S^{d-1}$, and 1D GW values are averaged over uniformly sampled directions:

$\text{SGW}_p^p(\mu,\nu)=\mathbb E_{\theta\sim\mathcal U(\mathbb S^{d-1})}\big[\text{GW}_p^p(\theta\sharp\mu,\theta\sharp\nu)\big].$

The motivation for RASD is the observation that uniform slicing incurs unnecessary computational costs due to uninformative directions. The analysis in the cross-domain alignment paper states that, in high dimensions, random directions are nearly orthogonal to fixed displacement vectors, so many projections carry weak relational signal. The paper therefore seeks a slicing distribution that is relation-aware, optimization-free, and fast to sample (Sarkar et al., 17 Jul 2025).
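The near-orthogonality claim can be checked numerically. The sketch below is an illustration rather than an experiment from the paper: it draws uniform directions and measures their squared alignment with one fixed unit vector, which has expectation exactly $1/d$ for a uniform direction. The helper names `uniform_sphere` and `mean_sq_alignment` are assumptions made for this example.

```python
import numpy as np

def uniform_sphere(n, d, rng):
    """Draw n directions uniformly on the unit sphere in R^d."""
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def mean_sq_alignment(d, n_dirs=20000, seed=0):
    """Monte Carlo estimate of E[<theta, v>^2] for a fixed unit vector v."""
    rng = np.random.default_rng(seed)
    v = np.zeros(d)
    v[0] = 1.0
    thetas = uniform_sphere(n_dirs, d, rng)
    return float(np.mean((thetas @ v) ** 2))

for d in (3, 30, 300):
    # The estimate should track 1/d, so alignment shrinks as d grows.
    print(d, mean_sq_alignment(d), 1 / d)
```

As the dimension grows, a typical uniform slice captures a vanishing fraction of any fixed displacement, which is the inefficiency a relation-aware slicing law is meant to reduce.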

2. Construction from relation-aware projecting directions

The construction begins with intra-relational paths. For $x,x'\sim\mu$, the intra-relational path is

$Z_{x,x'}=x-x',$

with normalized version

$\bar Z_{x,x'}=\frac{x-x'}{\|x-x'\|_2}.$

An analogous definition is used for $y,y'\sim\nu$. Given two normalized intra-relational paths, the paper defines two bisector-type directions

$\theta^{\pm}=\frac{\bar Z_{x,x'}\pm\bar Z_{y,y'}}{\|\bar Z_{x,x'}\pm\bar Z_{y,y'}\|_2}.$

One of these is the bisector of the acute angle between the two normalized intra-relational paths. This construction is motivated by the projected distortion

$\big|\,|\langle\theta,x-x'\rangle|-|\langle\theta,y-y'\rangle|\,\big|^p$

and by the desideratum $|\langle\theta,\bar Z_{x,x'}\rangle|=|\langle\theta,\bar Z_{y,y'}\rangle|$, which scales relational discrepancy by a common factor rather than distorting the two domains asymmetrically (Sarkar et al., 17 Jul 2025).
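The bisector construction can be sketched in a few lines. The code assumes that both domains have already been embedded in a common $\mathbb R^d$ (for example by zero padding, as is usual for SGW) and that the two normalized paths are not parallel or anti-parallel, in which case one of the bisectors is undefined. The names `unit` and `bisectors` are illustrative.

```python
import numpy as np

def unit(v, eps=1e-12):
    """Normalize a vector, refusing degenerate zero-length input."""
    n = np.linalg.norm(v)
    if n < eps:
        raise ValueError("zero-length vector has no direction")
    return v / n

def bisectors(x, x_prime, y, y_prime):
    """Return the two bisector-type directions of the normalized paths."""
    zx = unit(x - x_prime)   # normalized intra-relational path in the first domain
    zy = unit(y - y_prime)   # normalized intra-relational path in the second domain
    return unit(zx + zy), unit(zx - zy)

rng = np.random.default_rng(0)
x, x_prime, y, y_prime = rng.standard_normal((4, 5))
plus, minus = bisectors(x, x_prime, y, y_prime)
zx, zy = unit(x - x_prime), unit(y - y_prime)
# Each bisector makes equal absolute cosines with both normalized paths.
print(abs(plus @ zx), abs(plus @ zy))
print(abs(minus @ zx), abs(minus @ zy))
```

Because the two absolute cosines agree, projecting onto either bisector multiplies both path lengths by the same factor, so the projected distortion is that factor (raised to the power $p$) times the unprojected discrepancy between the two lengths.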

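To move from deterministic bisectors to a full slicing law, the paper uses spherical location-scale distributions $\sigma_\kappa(\theta;\epsilon)$ with location $\epsilon\in\mathbb S^{d-1}$ and concentration $\kappa\ge 0$, such as von Mises–Fisher and Power Spherical:

$\text{vMF}(\theta;\epsilon,\kappa)\propto\exp(\kappa\,\epsilon^\top\theta),\qquad \text{PS}(\theta;\epsilon,\kappa)\propto(1+\epsilon^\top\theta)^\kappa.$

The conditional Relation-Aware Projecting Direction (RAPD) law $\sigma_{\text{RAPD}}(\theta\mid x,x',y,y';\kappa)$ is a mixture of two such laws, one centered at each of the two bisector directions built from the quartet $(x,x',y,y')$, and the Relation-Aware Slicing Distribution is obtained by marginalizing over the sampled quartets:

$\sigma_{\text{RA}}(\theta;\mu,\nu,\kappa)=\mathbb E_{x,x'\sim\mu,\ y,y'\sim\nu}\big[\sigma_{\text{RAPD}}(\theta\mid x,x',y,y';\kappa)\big].$

Sampling from RASD is therefore explicit: sample $x,x'\sim\mu$, $y,y'\sim\nu$, build the two bisectors, choose the mixture component, and sample from the corresponding vMF or PS law (Sarkar et al., 17 Jul 2025).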
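The last step of that recipe, drawing from a spherical location-scale law around a chosen center, can be sketched for the Power Spherical case with NumPy alone. The sampler follows the standard Beta-plus-Householder construction for a density proportional to $(1+\langle\epsilon,\theta\rangle)^\kappa$ and requires $d\ge 2$. The equal-probability choice between the two bisectors in `sample_rapd_direction` is an assumption of this sketch, as are all function names.

```python
import numpy as np

def sample_power_spherical(center, kappa, size, rng):
    """Draw `size` unit vectors with density proportional to (1 + <center, theta>)^kappa."""
    center = np.asarray(center, dtype=float)
    d = center.shape[0]
    # Coordinate along the first axis: t = 2z - 1 with z Beta-distributed.
    z = rng.beta((d - 1) / 2 + kappa, (d - 1) / 2, size=size)
    t = 2 * z - 1
    # Uniform direction in the remaining d - 1 coordinates.
    v = rng.standard_normal((size, d - 1))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    tangent = np.sqrt(np.clip(1 - t**2, 0, None))[:, None] * v
    y = np.concatenate([t[:, None], tangent], axis=1)
    # Householder reflection that maps the first basis vector to `center`.
    e1 = np.zeros(d)
    e1[0] = 1.0
    u = e1 - center
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:          # center already equals e1
        return y
    u /= norm_u
    return y - 2 * (y @ u)[:, None] * u[None, :]

def sample_rapd_direction(x, x_prime, y, y_prime, kappa, rng):
    """One draw around a bisector of the two normalized paths."""
    zx = (x - x_prime) / np.linalg.norm(x - x_prime)
    zy = (y - y_prime) / np.linalg.norm(y - y_prime)
    sign = rng.choice([-1.0, 1.0])   # assumption: equal-weight mixture
    center = zx + sign * zy
    center /= np.linalg.norm(center)
    return sample_power_spherical(center, kappa, 1, rng)[0]
```

Because only Beta and Gaussian draws are needed, sampling involves no rejection step and no inner optimization.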

A useful antecedent is the random-path slicing distribution introduced for sliced Wasserstein distance, where directions are centered around normalized differences $\frac{x-y}{\|x-y\|_2}$ for $x\sim\mu$, $y\sim\nu$. That earlier construction is measure-pair dependent but uses inter-measure random paths under the independent coupling; RASD instead uses two intra-relational paths and their bisectors, reflecting the fact that GW compares pairwise structure across domains rather than pointwise discrepancies in a common space (Nguyen et al., 2024).

3. Distances induced by RASD

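RASD enters the sliced GW pipeline by replacing the uniform distribution in SGW. Writing $\sigma_{\text{RA}}(\theta;\mu,\nu,\kappa)$ for the RASD with concentration $\kappa$, the resulting Relation-Aware Sliced Gromov–Wasserstein distance is

$\text{RASGW}_p^p(\mu,\nu;\kappa)=\mathbb E_{\theta\sim\sigma_{\text{RA}}(\theta;\mu,\nu,\kappa)}\big[\text{GW}_p^p(\theta\sharp\mu,\theta\sharp\nu)\big].$

Equivalently, with $\sigma_{\text{RAPD}}(\theta\mid x,x',y,y';\kappa)$ denoting the conditional law of a direction given a sampled quartet,

$\text{RASGW}_p^p(\mu,\nu;\kappa)=\mathbb E_{x,x'\sim\mu,\ y,y'\sim\nu}\,\mathbb E_{\theta\sim\sigma_{\text{RAPD}}(\theta\mid x,x',y,y';\kappa)}\big[\text{GW}_p^p(\theta\sharp\mu,\theta\sharp\nu)\big].$

The Monte Carlo estimator with $L$ sampled directions is

$\widehat{\text{RASGW}}_p^p(\mu,\nu;\kappa,L)=\frac{1}{L}\sum_{l=1}^{L}\text{GW}_p^p(\theta_l\sharp\mu,\theta_l\sharp\nu),\qquad \theta_1,\dots,\theta_L\sim\sigma_{\text{RA}}(\theta;\mu,\nu,\kappa).$

These definitions preserve the SGW outer structure while changing only the slicing distribution (Sarkar et al., 17 Jul 2025).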

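The importance-weighted variant IWRASGW introduces an energy function $f:[0,\infty)\to(0,\infty)$ and reweights a finite set of sampled slices drawn from the RASD:

$\text{IWRASGW}_p^p(\mu,\nu;\kappa,L,f)=\mathbb E_{\theta_1,\dots,\theta_L}\Big[\sum_{l=1}^{L}\text{GW}_p^p(\theta_l\sharp\mu,\theta_l\sharp\nu)\,\frac{f\big(\text{GW}_p^p(\theta_l\sharp\mu,\theta_l\sharp\nu)\big)}{\sum_{j=1}^{L}f\big(\text{GW}_p^p(\theta_j\sharp\mu,\theta_j\sharp\nu)\big)}\Big].$

The corresponding Monte Carlo approximation averages this weighted quantity over $T$ independent groups, with $T=1$ often used in practice (Sarkar et al., 17 Jul 2025).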
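The reweighting step itself is small once the per-slice 1D GW values are available. The sketch below assumes those values are already computed and uses the exponential as the energy function, which is one common choice rather than something fixed by the source. The name `importance_weighted` is illustrative.

```python
import numpy as np

def importance_weighted(values, f=np.exp):
    """Combine per-slice values with weights proportional to f(value)."""
    values = np.asarray(values, dtype=float)
    weights = f(values)
    weights = weights / weights.sum()   # normalize within the group of slices
    return float(np.sum(weights * values))

slice_values = np.array([0.1, 0.5, 2.0])        # hypothetical per-slice GW values
print(slice_values.mean())                      # plain Monte Carlo average
print(importance_weighted(slice_values))        # slices with larger values count more
```

With an increasing energy function the weighted value is never below the plain average and never above the largest slice value, which is the intuition behind ordering the plain, importance-weighted, and max-sliced variants. For large inputs an exponential energy can overflow, so a practical implementation would subtract the maximum before exponentiating.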

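The paper establishes a hierarchy

$\text{RASGW}_p(\mu,\nu;\kappa)\le\text{IWRASGW}_p(\mu,\nu;\kappa)\le\text{Max-SGW}_p(\mu,\nu),$

states that the uniform baseline is recovered as the concentration vanishes,

$\lim_{\kappa\to 0}\text{RASGW}_p(\mu,\nu;\kappa)=\text{SGW}_p(\mu,\nu),$

and that, as $\kappa\to\infty$, the slicing law concentrates on the sampled relation-aware projecting directions themselves.

Accordingly, RASD interpolates between uniform slicing and more concentrated relation-aware sampling through the concentration parameter $\kappa$ (Sarkar et al., 17 Jul 2025).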
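As a worked illustration of this interpolation (a derivation for the Power Spherical case, not a statement quoted from the paper), write the concentration parameter as $\kappa$ and let $t=\langle\epsilon,\theta\rangle$ be the alignment between a draw $\theta$ and its center $\epsilon\in\mathbb S^{d-1}$. Under the density proportional to $(1+t)^\kappa$ on the sphere, the rescaled alignment follows a Beta law, which gives the expected alignment in closed form:

$\frac{1+t}{2}\sim\text{Beta}\Big(\kappa+\frac{d-1}{2},\ \frac{d-1}{2}\Big),\qquad \mathbb E[\langle\epsilon,\theta\rangle]=2\cdot\frac{\kappa+\frac{d-1}{2}}{\kappa+d-1}-1=\frac{\kappa}{\kappa+d-1}.$

The expected alignment is $0$ at $\kappa=0$, where the law is uniform, and tends to $1$ as $\kappa\to\infty$, where each draw collapses onto its center. For example, $d=3$ and $\kappa=2$ give an expected alignment of $2/4=0.5$, while $d=300$ needs $\kappa=299$ to reach the same value, so a useful concentration scales with the dimension.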

4. Metric, statistical, and computational properties

The cross-domain alignment analysis proves that $\text{RASGW}_p$ and $\text{IWRASGW}_p$ are semi-metrics on probability measures over a Polish space: they satisfy non-negativity, symmetry, and identity on isometric isomorphism classes. Symmetry is nontrivial because the slicing distribution $\sigma_{\text{RA}}(\theta;\mu,\nu,\kappa)$ is not symmetric in $(\mu,\nu)$, but 1D GW invariance under $\theta\mapsto-\theta$ and the symmetry of the vMF/PS construction restore symmetry at the level of the induced distance (Sarkar et al., 17 Jul 2025).

The same work gives a quasi-triangle inequality for RASGW, where the right-hand side uses the slicing distribution built from the original pair $(\mu,\nu)$. It does not claim a full triangle inequality. This places RASGW in the same general category as other pair-dependent sliced distances whose directional law depends on the input pair itself. A plausible implication is that the improved informativeness of the slicing distribution is obtained at the cost of a more delicate global geometry.

On the statistical side, the paper proves a one-sided sample complexity bound for empirical measures $\mu_n,\nu_n$ of compactly supported samples. For Monte Carlo approximation with $L$ slices, the error is governed by the variance of the projected GW value and decays at the $L^{-1/2}$ rate:

$\mathbb E\big|\widehat{\text{RASGW}}_p^p(\mu,\nu;\kappa,L)-\text{RASGW}_p^p(\mu,\nu;\kappa)\big|\le\frac{1}{\sqrt L}\,\text{Var}_{\theta}\big[\text{GW}_p^p(\theta\sharp\mu,\theta\sharp\nu)\big]^{1/2},$

where $\theta$ is drawn from the RASD.
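The standard argument behind Monte Carlo bounds of this kind can be sketched in one line. Let $g(\theta)=\text{GW}_p^p(\theta\sharp\mu,\theta\sharp\nu)$ and $\hat g_L=\frac1L\sum_{l=1}^{L}g(\theta_l)$ for i.i.d. directions $\theta_1,\dots,\theta_L$ drawn from the slicing distribution; the symbols $g$ and $\hat g_L$ are introduced here only for illustration. Then

$\mathbb E\big|\hat g_L-\mathbb E[g(\theta)]\big|\le\sqrt{\mathbb E\big(\hat g_L-\mathbb E[g(\theta)]\big)^2}=\sqrt{\text{Var}(\hat g_L)}=\frac{\sqrt{\text{Var}[g(\theta)]}}{\sqrt L}.$

The first step is Jensen's inequality, the second uses the unbiasedness of $\hat g_L$, and the third uses independence of the draws. Halving the error therefore takes four times as many slices, and a slicing law with lower variance of $g(\theta)$ reduces the constant.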

Computationally, for discrete measures with $n$ points and $L$ slices, the stated time and memory complexity match those of SGW (Sarkar et al., 17 Jul 2025).

5. Algorithms and empirical behavior

Algorithmically, RASD is explicitly optimization-free. To compute RASGW, one repeatedly samples quartets $(x,x',y,y')$, constructs RAPD centers, draws $\theta$ from the associated vMF or PS mixture, projects both measures, and computes the resulting 1D GW value. IWRASGW adds slice weights after those projected GW values are available. This differs from Max-SGW and DSGW, which optimize directions or direction distributions, and from energy-based approaches that require more intricate sampling procedures (Sarkar et al., 17 Jul 2025).
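The steps above can be sketched end to end for empirical measures. This is a minimal illustration under several assumptions that are not taken from the paper: both point clouds have the same number of points with uniform weights, the lower-dimensional cloud is zero-padded, the mixture over the two bisectors uses equal weights, the Power Spherical law is used, the 1D ground cost is the absolute difference, and the 1D GW value is approximated by the better of the sorted and anti-sorted matchings (a heuristic that gives an upper bound, not an exact solver). All function names are hypothetical.

```python
import numpy as np

def unit_rows(A):
    return A / np.linalg.norm(A, axis=1, keepdims=True)

def random_pairs(n, size, rng):
    """Index pairs (a, b) with a != b."""
    a = rng.integers(n, size=size)
    b = (a + rng.integers(1, n, size=size)) % n
    return a, b

def power_spherical_rows(centers, kappa, rng):
    """One Power Spherical draw per row of `centers` (unit vectors, d >= 2)."""
    L, d = centers.shape
    t = 2 * rng.beta((d - 1) / 2 + kappa, (d - 1) / 2, size=L) - 1
    v = unit_rows(rng.standard_normal((L, d - 1)))
    y = np.concatenate(
        [t[:, None], np.sqrt(np.clip(1 - t**2, 0, None))[:, None] * v], axis=1)
    # Row-wise Householder reflection mapping e_1 to each center.
    u = -centers
    u[:, 0] += 1.0
    n = np.linalg.norm(u, axis=1, keepdims=True)
    u = np.divide(u, n, out=np.zeros_like(u), where=n > 1e-12)
    return y - 2 * np.sum(y * u, axis=1, keepdims=True) * u

def sample_rasd(X, Y, n_slices, kappa, rng):
    """Quartets -> bisector centers -> one direction per quartet."""
    a, b = random_pairs(len(X), n_slices, rng)
    c, e = random_pairs(len(Y), n_slices, rng)
    zx = unit_rows(X[a] - X[b])
    zy = unit_rows(Y[c] - Y[e])
    sign = rng.choice([-1.0, 1.0], size=(n_slices, 1))  # assumed equal weights
    centers = zx + sign * zy
    bad = np.linalg.norm(centers, axis=1) < 1e-9        # parallel paths
    centers[bad] = (zx - sign * zy)[bad]                # fall back to other bisector
    return power_spherical_rows(unit_rows(centers), kappa, rng)

def gw_1d(u, v, p=2):
    """Heuristic 1D GW value: best of sorted and anti-sorted matchings."""
    us, vs = np.sort(u), np.sort(v)
    du = np.abs(us[:, None] - us[None, :])
    def cost(w):
        return np.mean(np.abs(du - np.abs(w[:, None] - w[None, :])) ** p)
    return min(cost(vs), cost(vs[::-1]))

def rasgw(X, Y, n_slices=64, kappa=10.0, p=2, seed=0):
    """Monte Carlo RASGW estimate and the per-slice values."""
    rng = np.random.default_rng(seed)
    d = max(X.shape[1], Y.shape[1])
    Xp = np.pad(X, ((0, 0), (0, d - X.shape[1])))
    Yp = np.pad(Y, ((0, 0), (0, d - Y.shape[1])))
    thetas = sample_rasd(Xp, Yp, n_slices, kappa, rng)
    vals = np.array([gw_1d(Xp @ th, Yp @ th, p) for th in thetas])
    return float(vals.mean()), vals
```

The per-slice array returned by `rasgw` is what an importance-weighted variant would reweight. This sketch builds full pairwise-difference matrices in `gw_1d` for readability, so it is quadratic in the number of points per slice and should not be read as matching the complexity stated for the method.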

The experiments reported in cross-domain alignment include GW-GAN and Gromov–Wasserstein autoencoding. In 3D-to-2D Gaussian-mixture alignment, GW-2 values at step 10000 were reported for SGW, DSGW, EBSGW, RPSGW, IWRPSGW, RASGW, and IWRASGW, with runtimes for RASGW and IWRASGW close to those of SGW and far below DSGW. In CIFAR-10 GWAE experiments, FID values were reported for SGW, DSGW, EBSGW, RPSGW, RASGW, and IWRASGW; IWRASGW also reached the earliest convergence in epochs among the competitive variants. On Omniglot, IWRASGW achieved the best reported FID, ahead of RASGW and SGW (Sarkar et al., 17 Jul 2025).

The concentration parameter $\kappa$ functions as a directional selectivity control. In the ablation reported for the 3D-to-2D 4-point problem, GW-2 decreased as $\kappa$ increased over the lower part of the tested range, with limited further improvement at larger values. The same ablation showed the expected improvement with more slices: GW-2 decreased as the number of slices $L$ increased across the four tested settings (Sarkar et al., 17 Jul 2025).

6. Terminological boundaries and neighboring uses of “slicing”

The term “slicing” is highly overloaded, and RASD belongs to a specific optimal-transport lineage. It is not the “distributed slicing” of peer-to-peer systems, where slicing denotes automatic partitioning of P2P networks into groups that represent a controllable amount of some resource and are maintained through gossip-based models [0612035]. It is not the “probabilistic slicing” of program analysis, where slices are computed on probabilistic control-flow graphs using data dependence, postdominators, and probabilistic independence (Amtoft et al., 2017).

It is also distinct from wireless network slicing. A comprehensive O-RAN survey explicitly states that the term “Relation-Aware Slicing Distribution (RASD) does not appear” in that network-slicing literature, whose concerns are RAN/TN/CN slice subnets, SMO/RIC orchestration, and slice lifecycle management rather than sliced GW geometry (Alam et al., 2024). Likewise, “Slice Agent” in shared Open-RU design refers to identifying and isolating uplink traffic into slice-specific eCPRI packets, which is a transport-and-hardware problem rather than a slicing-distribution problem in optimal transport (Arnholda et al., 28 Apr 2026).

Within the broader sliced-optimal-transport literature, the closest conceptual precursor is the random-path slicing distribution, which samples directions from a location-scale law centered at normalized differences $\frac{x-y}{\|x-y\|_2}$ and yields Random-Path Projection Sliced Wasserstein variants. RASD inherits the optimization-free, fast-sampling design principle but specializes it to GW by basing directions on relations between two intra-domain displacement pairs rather than on direct inter-measure differences (Nguyen et al., 2024). In that restricted sense, RASD is best understood as a relation-aware directional law for sliced Gromov–Wasserstein alignment, rather than as a generic label for all slice-aware methods.