
Partial Soft-Matching Distance in Optimal Transport

Updated 4 July 2026
  • Partial soft-matching distance is a method that relaxes strict one-to-one matching constraints by using soft correspondences and allowing some elements to remain unmatched.
  • It is applied across neural representation comparison, image–text retrieval, subgraph matching, and shape matching to handle noise and partial overlaps.
  • The approach bridges continuous optimal transport techniques with discrete matching models, providing insights into algorithm design and robustness against outliers.

Partial soft-matching distance denotes a family of relaxed correspondence objectives in which matching is soft because correspondences are represented by nonnegative couplings, soft correspondence matrices, or soft masks, and partial because some mass, fragments, nodes, units, or pairwise constraints may remain unmatched. The phrase is used explicitly for neural representational comparison in a partial optimal transport formulation (Kapoor et al., 22 Feb 2026), while image–text retrieval, subgraph matching, and partial shape matching use closely related constructions under different names, including entropic balanced OT over an augmented space with dustbins, partial fused Gromov–Wasserstein, and soft-masked geodesic consistency losses (Pan et al., 15 Mar 2026, Pan et al., 2024, Bracha et al., 2024). A complementary baseline is the hard injective partial-matching RMS model under translation, which fixes the opposite extreme of discrete, injective, unweighted correspondences and thereby clarifies what is changed when one passes to softer or more permissive formulations (Ben-Avraham et al., 2014).

1. Conceptual scope and recurrent formulations

Across the literature, there is no single canonical object called partial soft-matching distance. Instead, the term groups several mathematically distinct mechanisms that all relax exact one-to-one or full-mass matching. In neural representational comparison, the central object is a partial optimal transport distance over tuning curves. In image–text retrieval, the practical alignment object is a soft partial transport plan, but the final retrieval quantity is a similarity score induced by that plan, not a raw distance. In subgraph matching, the objective is a partial fused Gromov–Wasserstein discrepancy. In partial shape matching, the optimized object is often a soft correspondence matrix or a soft mask inside a loss rather than a metric in the strict sense (Khosla et al., 2023).

Setting | Soft component | Partial component
Neural representations | transportation polytope $\mathcal{T}(N_x,N_y)$ or partial coupling $T$ | only total mass $s$ is transported
Image–text retrieval | entropic OT / Sinkhorn coupling $\Gamma$ | global “dustbin” nodes absorb irrelevant fragments
Subgraph matching | relaxed transport plan $\boldsymbol T$ | $\mathcal{T}_s(\boldsymbol p,\boldsymbol q)$, dummy-node slack
Partial shape matching | soft correspondence matrix $\boldsymbol P$, soft mask $\boldsymbol M^s$ | only guaranteed or softly weighted pairs contribute

A useful terminological boundary is that “soft” does not always mean entropic OT. In some papers, softness is the use of a transport coupling spread across many pairs; in others, it is a soft correspondence matrix, a soft mask, or a class-probability-based adjustment of a transport cost matrix. Likewise, “partial” may refer to fixed transported mass, relaxed marginal constraints, augmented dustbins, class reweighting, or selective pair inclusion. This suggests that the phrase is best treated as a comparative umbrella rather than a single standardized metric.

An older and distinct line of work in soft set theory critiques matrix-based matching-function similarity and replaces it with set-operations-based distances that explicitly compare parameter-set overlap $A\cap B$, symmetric difference $A\,\Delta\,B$, and value-set mismatch. That literature is not OT-based, but it is relevant because it also treats partial overlap as intrinsic to the distance definition rather than as missing data to be totalized (Kharal, 2010).

2. Hard injective partial matching as the opposite extreme

The clearest hard baseline is the partial-matching RMS model under translation. Let

$$B = \{b_1,\dots,b_m\}, \qquad A = \{a_1,\dots,a_n\} \subset \mathbb{R}^2$$

with $m \le n$. A partial matching is a maximum-cardinality matching of $B$ into $A$, that is, an injective assignment $\pi : B \to A$. For fixed point locations, the minimum partial-matching RMS objective is

$$\min_{\pi} \sum_{i=1}^{m} \lVert b_i - \pi(b_i) \rVert^2,$$

and under translation

$$\min_{t \in \mathbb{R}^2} F(t), \qquad F(t) = \min_{\pi} f(\pi,t), \qquad f(\pi,t) = \sum_{i=1}^{m} \lVert b_i + t - \pi(b_i) \rVert^2.$$

The model is explicitly a strict partial matching, not a soft one: every point of $B$ must be assigned to exactly one distinct point of $A$, each point of $A$ can be used at most once, and there are no fractional correspondences, no probabilistic weights, no entropy or regularization terms, and no many-to-one assignments (Ben-Avraham et al., 2014).

For fixed $\pi$, the translation dependence is

$$f(\pi,t) = c_\pi + \langle t, d_\pi \rangle + m\lVert t \rVert^2,$$

with

$$c_\pi = \sum_{i=1}^{m} \lVert b_i - \pi(b_i) \rVert^2, \qquad d_\pi = 2\sum_{i=1}^{m} \bigl(b_i - \pi(b_i)\bigr).$$

Hence the full objective $F(t)$ is the lower envelope of finitely many quadratics, equivalently of affine functions after subtracting the common term $m\lVert t \rVert^2$. The induced subdivision of translation space, $\mathcal{D}_{B,A}$, is convex; its faces are convex polygons; and multiple distinct matchings can be simultaneously optimal on an open set only if they match the same subset of $A$. This geometric structure is a direct consequence of hard injectivity.
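The hard model can be made concrete on tiny inputs. The following minimal Python sketch enumerates every injection of the smaller set into the larger one and, for each, uses the closed-form translation that minimizes the sum of squared distances (the mean of the matched displacement vectors). It is a brute-force toy with exponential cost, not the algorithm of the cited paper. The function names, the convention that the smaller set `B` is translated onto the larger set `A`, and the example coordinates are assumptions made for illustration.

```python
from itertools import permutations

def best_translation(B, A, assignment):
    """Closed-form minimizer over t of sum_i ||b_i + t - a_assignment[i]||^2."""
    m = len(B)
    tx = sum(A[j][0] - b[0] for b, j in zip(B, assignment)) / m
    ty = sum(A[j][1] - b[1] for b, j in zip(B, assignment)) / m
    return (tx, ty)

def cost(B, A, assignment, t):
    """Sum of squared distances after translating every point of B by t."""
    return sum((b[0] + t[0] - A[j][0]) ** 2 + (b[1] + t[1] - A[j][1]) ** 2
               for b, j in zip(B, assignment))

def min_partial_matching_rms(B, A):
    """Brute force over all injections of B into A (tiny inputs only)."""
    best = None
    for assignment in permutations(range(len(A)), len(B)):
        t = best_translation(B, A, assignment)
        c = cost(B, A, assignment, t)
        if best is None or c < best[0]:
            best = (c, assignment, t)
    return best

# B is a copy of A[0], A[2], A[3] shifted by (-5, -1); A[1] has no partner.
A = [(0.0, 0.0), (10.0, 3.0), (2.0, 0.0), (0.0, 1.0)]
B = [(-5.0, -1.0), (-3.0, -1.0), (-5.0, 0.0)]
value, assignment, t = min_partial_matching_rms(B, A)
print(value, assignment, t)
```

Every point of `B` receives exactly one partner and the surplus point of `A` is simply left unused, which is the discrete, unweighted behavior that the softer formulations relax.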

Several structural bounds make this baseline important. The paper bounds the number of regions of the subdivision whose interior a single line can intersect, characterizes the normal vectors that the edges of the subdivision can have, and bounds the number of unbounded regions, the number of edges of every region, the degree of every vertex, and the number of regions that any convex path can intersect. The global combinatorial complexity remains open, but the paper proves both an upper bound and a lower bound on the number of regions.

Algorithmically, the same paper gives a polynomial-time algorithm for a local minimum and an exact global algorithm obtained by traversing the full subdivision, with the same order of cost again for computing the global minimum once the subdivision is constructed. This establishes the hard injective model as a rigorous comparison point: a soft model may reduce to it in a zero-temperature or hard-assignment limit, but it does not inherit its exact convex-subdivision structure.

3. From balanced soft matching to partial optimal transport

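The modern OT-based notion of soft matching begins with the soft matching distance between neural representations. For activation matrices

$$X \in \mathbb{R}^{M \times N_x}, \qquad Y \in \mathbb{R}^{M \times N_y}$$

with columns $x_i, y_j \in \mathbb{R}^{M}$, the transportation polytope is

$$\mathcal{T}(N_x,N_y) = \Bigl\{ P \in \mathbb{R}_{\ge 0}^{N_x \times N_y} : P\mathbf{1} = \tfrac{1}{N_x}\mathbf{1},\ P^\top\mathbf{1} = \tfrac{1}{N_y}\mathbf{1} \Bigr\}$$

and the soft matching distance is

$$d_{\mathcal{T}}(X,Y) = \sqrt{\min_{P \in \mathcal{T}(N_x,N_y)} \sum_{i,j} P_{ij}\, \lVert x_i - y_j \rVert^2}.$$

This is balanced OT between the empirical measures

$$\mu_X = \frac{1}{N_x}\sum_{i=1}^{N_x} \delta_{x_i}, \qquad \mu_Y = \frac{1}{N_y}\sum_{j=1}^{N_y} \delta_{y_j},$$

so it is a full soft matching: all neuronal mass is matched. The same work identifies it with a 2-Wasserstein distance between empirical distributions and states that it is symmetric and satisfies the triangle inequality (Khosla et al., 2023).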
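A special case shows how the balanced formulation behaves. When both representations have the same number of units, the uniform-marginal polytope is a scaled copy of the set of doubly stochastic matrices, whose vertices are permutation matrices, so the linear objective attains its minimum at a permutation. The sketch below relies on that fact and brute-forces permutations for tiny inputs. The function names and the toy tuning curves are assumptions for illustration, and unequal population sizes would need a linear-programming solver instead.

```python
from itertools import permutations
from math import sqrt

def sq_dist(u, v):
    """Squared Euclidean distance between two tuning curves."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def soft_matching_distance_equal(X_units, Y_units):
    """Soft matching distance for two populations with the same number of units.

    X_units, Y_units: lists of N tuning curves, each a list of M responses.
    With marginals 1/N on both sides, an optimal coupling can be taken to be
    1/N times a permutation matrix, so we search over permutations directly.
    """
    n = len(X_units)
    assert n == len(Y_units), "this sketch only covers N_x == N_y"
    best = min(
        sum(sq_dist(X_units[i], Y_units[sigma[i]]) for i in range(n))
        for sigma in permutations(range(n))
    )
    return sqrt(best / n)

X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Y = [X[2], X[0], X[1]]                                   # same units, relabeled
Z = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]  # perturbed tuning
print(soft_matching_distance_equal(X, Y))
print(soft_matching_distance_equal(X, Z), soft_matching_distance_equal(Z, X))
```

Relabeling units leaves the distance at zero, which is the permutation invariance that distinguishes this comparison from rotation-invariant ones.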

Partial soft-matching distance in the explicit sense extends this balanced formulation by allowing only a prescribed fraction $s$ of total mass to be transported. The admissible partial couplings are

$$\mathcal{T}_s(N_x,N_y) = \Bigl\{ T \in \mathbb{R}_{\ge 0}^{N_x \times N_y} : T\mathbf{1} \le \tfrac{1}{N_x}\mathbf{1},\ T^\top\mathbf{1} \le \tfrac{1}{N_y}\mathbf{1},\ \mathbf{1}^\top T\,\mathbf{1} = s \Bigr\}$$

and the partial soft-matching distance is

$$d_s(X,Y) = \min_{T \in \mathcal{T}_s(N_x,N_y)} \sum_{i,j} T_{ij}\, C_{ij},$$

with transport cost $C_{ij} = c(x_i, y_j)$. In practice, the paper states, “In our formulation, we use pairwise cosine distance as the cost function.” After mean-centering and unit-normalizing tuning curves, $C_{ij} = 1 - x_i^\top y_j$ and the transported mass is fixed at $s$, so the same optimization can be written as maximizing matched correlation,

$$\max_{T \in \mathcal{T}_s(N_x,N_y)} \sum_{i,j} T_{ij}\, x_i^\top y_j.$$

The row sums and column sums of the optimal plan quantify participation in the match, so near-zero sums identify effectively unmatched units (Kapoor et al., 22 Feb 2026).
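The role of row and column masses can be seen in a toy case. The sketch below assumes both populations have $N$ units with mass $1/N$ each and that the transported mass is $s = k/N$ for an integer $k$. Under those assumptions the feasible set is a network-flow polytope with integral capacities after scaling by $N$, so an optimal plan can be chosen that places mass $1/N$ on $k$ disjoint pairs, and a brute-force search over such pairings suffices. The function names, the restriction to equal sizes, and the example vectors are illustrative assumptions, not the cited paper's implementation.

```python
from itertools import combinations, permutations
from math import sqrt

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def partial_soft_matching(X_units, Y_units, k):
    """Partial OT with uniform masses 1/N and transported mass s = k/N.

    Returns (cost, plan, row_mass, col_mass). Brute force; tiny inputs only.
    """
    n = len(X_units)
    assert n == len(Y_units) and 0 < k <= n
    C = [[cosine_distance(x, y) for y in Y_units] for x in X_units]
    best_cost, best_pairs = None, None
    for rows in combinations(range(n), k):        # which X units take part
        for cols in permutations(range(n), k):    # their distinct Y partners
            c = sum(C[i][j] for i, j in zip(rows, cols))
            if best_cost is None or c < best_cost:
                best_cost, best_pairs = c, list(zip(rows, cols))
    plan = [[0.0] * n for _ in range(n)]
    for i, j in best_pairs:
        plan[i][j] = 1.0 / n
    row_mass = [sum(row) for row in plan]
    col_mass = [sum(plan[i][j] for i in range(n)) for j in range(n)]
    return best_cost / n, plan, row_mass, col_mass

X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Y = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [-1.0, -1.0, -1.0]]  # last unit is an outlier
value, plan, row_mass, col_mass = partial_soft_matching(X, Y, k=2)
print(value, row_mass, col_mass)
```

With two of three units transported, the third unit of each population ends up with zero row or column mass, which is how a single solve flags unmatched units.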

A structurally analogous construction appears in subgraph matching via partial fused Gromov–Wasserstein. With masses $\boldsymbol p, \boldsymbol q$, cost matrix $\boldsymbol M$, structure matrices $\boldsymbol C_1, \boldsymbol C_2$, and structural loss tensor

$$L_{ijkl} = \bigl| (\boldsymbol C_1)_{ik} - (\boldsymbol C_2)_{jl} \bigr|^2,$$

the partial transport set is

$$\mathcal{T}_s(\boldsymbol p,\boldsymbol q) = \bigl\{ \boldsymbol T \ge 0 : \boldsymbol T\mathbf{1} \le \boldsymbol p,\ \boldsymbol T^\top\mathbf{1} \le \boldsymbol q,\ \mathbf{1}^\top \boldsymbol T\,\mathbf{1} = s \bigr\}$$

and the partial FGW objective for subgraph matching is

$$\min_{\boldsymbol T \in \mathcal{T}_s(\boldsymbol p,\boldsymbol q)} \ (1-\alpha) \sum_{i,j} M_{ij}\, T_{ij} + \alpha \sum_{i,j,k,l} L_{ijkl}\, T_{ij}\, T_{kl},$$

where $\alpha \in [0,1]$ balances the feature and structure terms. Here the matching is soft because $\boldsymbol T$ is a fractional coupling rather than a discrete one-to-one assignment, and partial because only total mass $s$ is transported (Pan et al., 2024).
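To see what the structural term contributes, the sketch below evaluates a fused Gromov–Wasserstein style objective for two candidate couplings of a 2-node query path into a 3-node target path. It only evaluates the objective and checks feasibility; it does not optimize. The trade-off weight `alpha`, the squared structural loss, the node masses of 1/3, and all function names are assumptions for illustration rather than details taken from the cited paper.

```python
def pfgw_objective(T, M, C1, C2, alpha):
    """(1 - alpha) * feature term + alpha * quadratic structure term."""
    n, m = len(C1), len(C2)
    linear = sum(M[i][j] * T[i][j] for i in range(n) for j in range(m))
    quadratic = sum(
        (C1[i][k] - C2[j][l]) ** 2 * T[i][j] * T[k][l]
        for i in range(n) for j in range(m)
        for k in range(n) for l in range(m)
    )
    return (1.0 - alpha) * linear + alpha * quadratic

def in_partial_set(T, p, q, s, tol=1e-9):
    """Check T >= 0, row sums <= p, column sums <= q, total mass == s."""
    rows = [sum(r) for r in T]
    cols = [sum(T[i][j] for i in range(len(T))) for j in range(len(T[0]))]
    return (all(t >= -tol for r in T for t in r)
            and all(r <= pi + tol for r, pi in zip(rows, p))
            and all(c <= qj + tol for c, qj in zip(cols, q))
            and abs(sum(rows) - s) <= tol)

C1 = [[0, 1], [1, 0]]                      # query: path on 2 nodes
C2 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]     # target: path on 3 nodes
M = [[0, 1, 1], [1, 0, 0]]                 # feature costs: target nodes 1 and 2 look alike
p, q, s = [1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3], 2 / 3

good = [[1 / 3, 0, 0], [0, 1 / 3, 0]]      # query edge lands on a target edge
bad = [[1 / 3, 0, 0], [0, 0, 1 / 3]]       # query edge lands on a non-edge
obj_good = pfgw_objective(good, M, C1, C2, alpha=0.5)
obj_bad = pfgw_objective(bad, M, C1, C2, alpha=0.5)
print(obj_good, obj_bad)
```

Both couplings have zero feature cost, so only the quadratic term separates them; the third target node stays unmatched in the better coupling because only mass `s` is moved.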

4. Dustbins, masks, and other realizations of partiality

A second major family realizes partiality not through explicit inequality marginals but through augmented state spaces, masks, or confidence mechanisms. In image–text retrieval, cross-modal matching is formulated as OT between a set of local visual embeddings and a set of local textual embeddings, with default uniform marginals, a pairwise cost matrix, and an entropic Sinkhorn objective whose solution is the coupling $\Gamma$. The partial mechanism is to add one global visual embedding and one global textual embedding as auxiliary “dustbins,” extend the transport problem to the augmented sets, and compute the final score only from the local-local block of $\Gamma$, discarding dustbin assignments. The paper is explicit that the result is a soft partial transport plan, whereas the final retrieval quantity is a similarity score induced by that plan, not a raw metric distance (Pan et al., 15 Mar 2026).
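The dustbin mechanism can be sketched in a few lines of plain Python. The version below appends one global embedding to each side, runs Sinkhorn iterations on the augmented cost matrix, and averages cosine similarity over the local-local block only. Several choices are assumptions made purely for illustration and are not confirmed by the cited paper: the cost is taken as one minus cosine similarity, the marginals are uniform over the augmented sets, the score is normalized by the local-local mass, and the names `sinkhorn` and `dustbin_score`, the value of `eps`, and the iteration count are hypothetical.

```python
from math import exp, sqrt

def cos_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def sinkhorn(C, mu, nu, eps, n_iter):
    """Entropic OT by alternating row and column scaling of K = exp(-C / eps)."""
    K = [[exp(-c / eps) for c in row] for row in C]
    u, v = [1.0] * len(mu), [1.0] * len(nu)
    for _ in range(n_iter):
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(len(nu)))
             for i in range(len(mu))]
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(len(mu)))
             for j in range(len(nu))]
    return [[u[i] * K[i][j] * v[j] for j in range(len(nu))]
            for i in range(len(mu))]

def dustbin_score(V_local, T_local, v_global, t_global, eps=0.1, n_iter=50):
    n, m = len(V_local), len(T_local)
    rows = V_local + [v_global]   # extra row: global visual embedding as dustbin
    cols = T_local + [t_global]   # extra column: global textual embedding as dustbin
    S = [[cos_sim(r, c) for c in cols] for r in rows]
    C = [[1.0 - s for s in row] for row in S]
    mu = [1.0 / (n + 1)] * (n + 1)
    nu = [1.0 / (m + 1)] * (m + 1)
    G = sinkhorn(C, mu, nu, eps, n_iter)
    # Score uses only the local-local block; dustbin assignments are discarded.
    mass = sum(G[i][j] for i in range(n) for j in range(m))
    score = sum(G[i][j] * S[i][j] for i in range(n) for j in range(m)) / mass
    return score, G

V_local = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.2]]   # third fragment matches no token
T_local = [[1.0, 0.1], [0.1, 1.0]]
score, G = dustbin_score(V_local, T_local, v_global=[0.3, 0.6], t_global=[0.5, 0.5])
print(round(score, 3))
```

Because the returned quantity is a plan-weighted similarity rather than a transport cost, larger values mean better alignment, which is why it behaves as a score and not as a distance.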

In partial domain adaptation, the central object is a soft-masked semi-dual OT distance between a target domain and a reweighted source domain. The source class mixture is reweighted by class-prior ratios so that source-only classes are downweighted or ideally removed, and the pairwise transport cost is adjusted by a soft mask matrix. The matching is partial because transport is focused on the shared-label part of the source distribution, and soft because class compatibility is encoded through probability vectors rather than hard class assignments (Zhai et al., 3 May 2025).

In partial-to-partial 3D shape registration, the proposed Confidence Guided Distance combines a soft feature similarity matrix, nearest-neighbor Euclidean distances after a candidate rigid transform, and a confidence mechanism derived from the column-normalized similarity matrix. This is not OT; rather, it is a feature-weighted Chamfer-style consensus score plus confidence-guided overlap sampling. Partiality is handled implicitly because non-overlapping points tend to have poor feature similarity and because transform hypotheses are generated from high-confidence points likely to lie in overlap (Ginzburg et al., 2022).

In partial shape matching on manifolds, the optimized object is again not a metric but a masked geodesic distance-preservation loss. The wormhole criterion defines a threshold matrix, a binary mask, and a soft relaxation $\boldsymbol M^s$ of that mask. The resulting Wormhole Loss uses a stochastic soft correspondence matrix $\boldsymbol P$ and weights only guaranteed-consistent or softly weighted pairs. Here partiality is pair selection, and softness comes both from $\boldsymbol P$ and from $\boldsymbol M^s$ (Bracha et al., 2024).

5. Metric status, optimization, and algorithmic structure

A central distinction is that “distance” is used nonuniformly across the literature. Balanced soft matching is a Wasserstein distance between empirical distributions and is stated to be symmetric and to satisfy the triangle inequality. Partial soft-matching distance in the partial OT sense is symmetric but, as the paper explicitly notes, partial OT distances do not satisfy the triangle inequality and therefore are not proper metrics. In subgraph matching, the partial FGW objective is safest to interpret as a partial graph discrepancy or relaxed matching objective rather than as a true metric, because the paper does not prove metric axioms for the partial subgraph version (Khosla et al., 2023, Kapoor et al., 22 Feb 2026).

Other constructions are even further from strict metric status. The cross-modal transport model in image–text retrieval originates from a transport cost, but the operational quantity for ranking and training is the transport-plan-weighted average cosine similarity. Confidence Guided Distance is a consensus score for rigid-registration hypotheses. Wormhole Loss is a masked geodesic-preservation energy. Soft-masked semi-dual OT in partial domain adaptation is a reweighted distance embedded inside an alternating end-to-end objective. A recurrent misconception is therefore to treat all partial soft-matching constructions as metrics; the literature supports a broader taxonomy that includes distances, discrepancies, similarities, and losses.

Optimization methods reflect this diversity. Hard injective partial matching under translation uses combinatorial geometry and minimum-cost injective bipartite matching. Balanced neural soft matching is a discrete OT linear program, solved in practice via the network simplex algorithm with stated complexity $O(n^3 \log n)$ when both representations have $n$ units. The explicit partial neural extension uses partial OT, implemented with dummy or virtual points assigned a large transportation cost, and extracts row and column masses from a single solution to rank units, reducing subset selection from a brute-force search over subsets of units to a single OT solve at a chosen transported mass $s$ (Kapoor et al., 22 Feb 2026).
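The dummy-point device mentioned above is usually written as a reduction to a balanced problem. The following is a sketch of that standard construction from the partial-OT literature, not necessarily the exact variant used in the cited paper; the symbols $\bar{C}$, $\bar{\boldsymbol p}$, $\bar{\boldsymbol q}$, $\xi$, and $A$ are notation assumed here. Given masses $\boldsymbol p$, $\boldsymbol q$, cost matrix $C$, and transported mass $s$, one dummy point is appended to each side:

$$\bar{\boldsymbol p} = \bigl(\boldsymbol p,\ \lVert \boldsymbol q \rVert_1 - s\bigr), \qquad \bar{\boldsymbol q} = \bigl(\boldsymbol q,\ \lVert \boldsymbol p \rVert_1 - s\bigr), \qquad \bar{C} = \begin{pmatrix} C & \xi \mathbf{1} \\ \xi \mathbf{1}^\top & 2\xi + A \end{pmatrix},$$

with $\xi \ge 0$ and $A > \max_{i,j} C_{ij}$. Both augmented marginals have total mass $\lVert \boldsymbol p \rVert_1 + \lVert \boldsymbol q \rVert_1 - s$, so the augmented problem is balanced. The large dummy-to-dummy cost keeps that entry at zero in the optimum, which forces exactly $\lVert \boldsymbol p \rVert_1 - s$ units of real source mass and $\lVert \boldsymbol q \rVert_1 - s$ units of real target mass into the dummies. Deleting the last row and column of the balanced solution therefore leaves a partial plan of total mass $s$, whose row and column sums are the participation masses used to rank units.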

Graph-based partial FGW is nonconvex because of its quadratic structural term and is optimized by Frank–Wolfe on an augmented problem with a dummy node; the expensive tensor–matrix product is reduced from its naive cost to a cheaper one in favorable cases. Image–text partial transport uses entropy-regularized OT and Sinkhorn/Bregman iterations, explicitly with a very small number of iterations in the reported implementation. Partial domain adaptation replaces full Sinkhorn with a semi-dual formulation optimized by gradient-based algorithms and a neural network approximation of the Kantorovich potential. These differences show that there is no single algorithmic signature of partial soft matching.

A further source of terminological ambiguity comes from neighboring areas. In 2-parameter persistence, the standard matching distance is

$$d_{\mathrm{match}}(M,N) = \sup_{\ell}\ w(\ell)\, d_B\bigl(\mathrm{Dgm}(M|_{\ell}),\ \mathrm{Dgm}(N|_{\ell})\bigr),$$

a supremum of weighted bottleneck distances over slices $\ell$, where $w(\ell)$ is the weight of the slice and $d_B$ is the bottleneck distance. The cited paper gives efficient approximation algorithms for the standard matching distance, but it does not define a partial soft variant; the only “partial matching” present there is the standard bottleneck partial matching between persistence diagrams (Kerber et al., 2019). This contrast is useful because it isolates a different sense of “matching distance” from the partial soft-matching literature.

6. Empirical roles and interpretive significance

The principal empirical role of partial soft matching is robust comparison under outliers, partial overlap, or heterogeneous relevance. In neural representational comparison, partial soft-matching preserves correct matches under added outliers, correctly selects the better model in noise-corrupted identification tasks, automatically excludes low-noise-ceiling voxels in fMRI, improves precision of cross-subject voxel alignment across visual areas, and yields unit rankings by alignment quality from a single transport solve. In deep networks, highly matched units exhibit similar maximally exciting images, while unmatched units show divergent patterns. Random orthogonal rotation reduces alignment even within the best-matched subpopulation, which the paper interprets as evidence for privileged axes (Kapoor et al., 22 Feb 2026).

In image–text retrieval, the practical motivation is redundant alignment: not every image fragment corresponds to some caption token, and vice versa. The reported ablations isolate the benefit of the partial mechanism itself: adding dustbins through local-to-global similarities improves over naïve OT, whereas simply appending global features does not. The paper further reports that analogous partial matching helps CAM, supporting the broader claim that redundant local alignments are a genuine issue in image–text matching (Pan et al., 15 Mar 2026).

In graph matching, partial transport is valuable because subgraph search should not force a full graph-to-graph match. The reported experiments show robustness to noisy query features and favorable query times, especially when the sliding-subgraph strategy SSOT restricts optimization to local candidate subgraphs. In 3D registration, the motivating difficulty is that some points in one cloud have no corresponding point in the other; the CGD formulation addresses this with learned feature compatibility, confidence-guided sampling, and a bidirectional nearest-neighbor score that outperforms Chamfer as a consensus metric under severe partiality, outliers, internal symmetries, and large rotations (Pan et al., 2024, Ginzburg et al., 2022).

Partial shape matching on manifolds uses the same broad idea at the level of trustworthy pairwise constraints. The wormhole criterion certifies many more consistent pairs than a prior boundary criterion, and the non-binary mask improves over binary masking. The reported effect is largest on datasets with more holes and stronger topological changes, where forcing all pairwise geodesic constraints to contribute is most harmful. In partial domain adaptation, soft-masked OT suppresses source outlier classes through class reweighting and class-probability-based masking; ablations identify the importance weights and mask mechanisms as the main contributors to reducing negative transfer (Bracha et al., 2024, Zhai et al., 3 May 2025).

Taken together, these works support a stable comparative picture. Hard injective partial matching, balanced soft matching, and partial soft matching are not interchangeable names for one method but distinct regimes of correspondence modeling. Hard injective models enforce discrete global consistency; balanced soft models allow fractional correspondences but still transport all mass; partial soft models additionally allow some mass, fragments, nodes, units, or pairwise constraints to remain unused. This suggests that the enduring significance of partial soft-matching distance lies less in any single formula than in a recurring design principle: robust comparison is often obtained by keeping correspondences soft while refusing to force universal participation.
