Barycentric Feature Distillation
- The paper introduces barycentric feature distillation, a method that transfers deep semantic features onto 3D meshes using barycenter-based interpolation for precise, real-time deformations.
- It leverages multi-view deep feature extraction and a lightweight MLP to efficiently map image-derived features to a continuous 3D field, independent of mesh topology.
- The technique also extends to dataset distillation with Wasserstein barycenters, achieving competitive accuracy while enabling extreme data compression and cross-architecture generalization.
Barycentric Feature Distillation refers to a class of techniques for summarizing or transferring information in high-dimensional feature spaces by leveraging barycentric coordinates or barycenter constructions, often in the context of deep learning for 3D shape editing and dataset distillation. Two main instantiations have driven its prominence: the barycentric feature distillation pipeline for high-resolution, semantically regularized mesh deformation (Liu et al., 18 Jan 2026), and the Wasserstein barycentric feature distillation framework for dataset distillation (Liu et al., 2023). Both employ barycentric or barycenter-based constructions to align embedded representations efficiently, enabling semantically structured editing or highly compressed dataset synthesis.
1. Deep Feature-to-Geometry Distillation
Handle-based mesh deformation, such as ARAP or biharmonic coordinates, provides efficient and precise shape manipulation but lacks semantic awareness. Barycentric feature distillation bridges the gap between such geometric frameworks and the semantic priors encoded in modern vision networks by distilling deep 2D features into a continuous 3D field over the mesh (Liu et al., 18 Jan 2026).
Given a 3D mesh and a pretrained 2D feature extractor (e.g., DINOv2), the process is as follows:
- The mesh is rendered from diverse viewpoints, producing RGB images.
- Deep features are extracted for each camera pixel.
- Using triangle rasterization, every camera pixel inside a mesh triangle is mapped onto the 3D surface point $p = \sum_{i=1}^{3} w_i v_i$, where $v_i$ are the triangle's vertices and the barycentric weights satisfy $w_i \ge 0$, $\sum_i w_i = 1$.
- A small MLP $f_\theta$ is trained so that $f_\theta(p)$ matches the normalized deep feature at that point.
- Distillation complexity depends only on image resolution, not mesh topology, allowing real-time field recovery even for meshes with up to a million faces.
This continuous feature field, $f_\theta : \mathbb{R}^3 \to \mathbb{R}^d$, enables immediate evaluation of semantic features at any mesh vertex, providing a direct link from image-based semantics to geometric manipulation.
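The per-pixel barycentric mapping above can be sketched as follows. This is a minimal illustration with toy vertex positions and weights, not the paper's data; the helper name `pixel_to_surface` is hypothetical.

```python
import numpy as np

def pixel_to_surface(tri_vertices, bary_weights):
    """Map a rasterized pixel onto the 3D surface of its covering triangle.

    tri_vertices: (3, 3) array, one row per triangle vertex.
    bary_weights: (3,) barycentric weights w_i with w_i >= 0, sum w_i = 1.
    Returns the 3D surface point p = sum_i w_i * v_i.
    """
    w = np.asarray(bary_weights, dtype=float)
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)
    return w @ np.asarray(tri_vertices, dtype=float)

# Toy triangle and a pixel landing at its centroid.
tri = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
p = pixel_to_surface(tri, [1/3, 1/3, 1/3])
# The resulting (point, feature) pair would then supervise the MLP field.
```

Each such 3D point, paired with the deep feature of its source pixel, becomes one training sample for the MLP.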
2. Mathematical Formulation and Optimization
The fitting objective for barycentric feature distillation is a per-pixel loss over rasterized points, $\mathcal{L}(\theta) = \sum_{p \in \mathcal{P}} \left\| f_\theta(x_p) - \phi_p \right\|^2$, where $\mathcal{P}$ is the set of all rendered pixels covering mesh faces, $x_p$ is the surface point hit by pixel $p$, and $\phi_p$ is its normalized deep feature. The MLP is optimized using Adam over batches of $(x_p, \phi_p)$ pairs (Liu et al., 18 Jan 2026).
To map features back to deformation weights, feature proximity is used: each vertex receives handle weights according to the similarity between its distilled feature $F_i$ and the handles' features, where the $F_i$ are per-vertex features obtained by evaluating the field after distillation. Subsequent handle-based deformations are applied with classical linear blend skinning using these semantically informed weights, giving $O(nm)$ run-time complexity for $n$ vertices and $m$ handles.
Optional geometric post-processing includes locality weighting by normalized geodesic distance and feature-anchor constraints.
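The feature-proximity weighting can be sketched as below. The exact functional form is not reproduced from the paper; this sketch assumes one common choice, a softmax over negative squared feature distances with a hypothetical temperature `tau`.

```python
import numpy as np

def proximity_weights(vertex_feats, handle_feats, tau=1.0):
    """Deformation weights from feature proximity (assumed form: a softmax
    over negative squared feature distances; the paper's exact formula is
    not reproduced here).

    vertex_feats: (n, d) distilled per-vertex features.
    handle_feats: (m, d) features at the handle locations.
    Returns (n, m) weights, each row on the probability simplex.
    """
    d2 = ((vertex_feats[:, None, :] - handle_feats[None, :, :]) ** 2).sum(-1)
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

F = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # toy vertex features
H = np.array([[0.0, 0.0], [0.0, 1.0]])               # toy handle features
W = proximity_weights(F, H)
# Rows sum to one; vertex 0 leans toward handle 0, vertex 2 toward handle 1.
```

Vertices whose distilled features resemble a handle's feature receive larger weight for that handle, which is what lets edits propagate across semantically similar parts.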
3. Pseudocode Pipeline for Barycentric Distillation
The following outlines the practical pipeline for barycentric feature distillation in mesh deformation (Liu et al., 18 Jan 2026):
- Quadric simplification of the mesh to a tractable proxy with a bounded face count.
- Generation of rasterization points and deep features via multi-view camera renders.
- Assembly of (3D point, feature) data via barycentric mapping per triangle and per-pixel.
- Training of the MLP $f_\theta$ to match features at surface points.
- Forward evaluation of $f_\theta$ on the high-resolution mesh to cache per-vertex features.
- Construction of the similarity-based weight matrix $W$, optionally sparsified.
- Real-time handle-based deformation through local linear-blend weighted sums.
Extremely high performance is achieved: distillation over roughly $100$ million pixels and feature extraction on a million-vertex mesh each complete in seconds, and individual edits run in milliseconds.
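The final pipeline step, real-time handle-based deformation via local linear-blend weighted sums, can be sketched as follows; the rotations, translations, and hard weight assignment are toy values for illustration.

```python
import numpy as np

def lbs_deform(vertices, weights, transforms):
    """Linear blend skinning: each vertex is a weighted sum of the handle
    transforms applied to it, O(nm) for n vertices and m handles.

    vertices:   (n, 3) rest positions.
    weights:    (n, m) per-vertex handle weights (rows sum to 1).
    transforms: list of m (R, t) pairs, R a (3, 3) rotation, t a (3,) translation.
    """
    out = np.zeros_like(vertices)
    for j, (R, t) in enumerate(transforms):
        out += weights[:, j:j + 1] * (vertices @ R.T + t)
    return out

verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
W = np.array([[1.0, 0.0], [0.0, 1.0]])    # hard assignment, toy case
I = np.eye(3)
moved = lbs_deform(verts, W, [(I, np.zeros(3)), (I, np.array([0.0, 0.0, 1.0]))])
# Vertex 0 stays put; vertex 1 follows handle 1's translation.
```

With the cached weight matrix, an edit only requires this blend over the handles touched, which is what makes millisecond-scale edits possible.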
4. Semantic Propagation, Symmetry, and Generalization
A core property of barycentric feature distillation is semantic co-deformation: semantically correlated mesh parts, as identified by feature similarity, naturally propagate edits. For example, moving a handle on one chair leg affects all legs similarly without explicit constraints.
Automatic semantic symmetry detection can be performed by reflecting feature fields across candidate planes and measuring cross-reflection feature alignment; when the alignment criterion is satisfied, deformations preserve the inferred symmetry through mirrored handle transforms.
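A minimal sketch of such a cross-reflection alignment score is given below, assuming a plane through the origin, nearest-neighbor matching of reflected points, and mean cosine similarity as the alignment measure (all assumptions of this sketch, not the paper's exact criterion).

```python
import numpy as np

def reflection_alignment(points, feats, normal):
    """Score a candidate symmetry plane (through the origin, unit normal)
    by reflecting each point across it and comparing the point's feature
    to that of the nearest point on the original surface.
    """
    n = np.asarray(normal, dtype=float)
    n /= np.linalg.norm(n)
    reflected = points - 2.0 * (points @ n)[:, None] * n
    # Nearest original point for each reflected point.
    d = ((reflected[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    nn = d.argmin(axis=1)
    # Mean cosine similarity between matched features.
    a, b = feats, feats[nn]
    cos = (a * b).sum(-1) / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))
    return cos.mean()

# Toy shape symmetric about the x = 0 plane, with mirrored features.
pts = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
fts = np.array([[1.0, 0.0], [1.0, 0.0]])
score = reflection_alignment(pts, fts, [1.0, 0.0, 0.0])
# A score near 1 flags the plane as a semantic symmetry candidate.
```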
5. Wasserstein Barycentric Feature Distillation for Dataset Compression
In dataset distillation, barycentric feature distillation is realized through computation of free-support Wasserstein barycenters in pretrained feature spaces (Liu et al., 2023). For a class with real feature vectors $\{z_i\}_{i=1}^{N}$, the empirical distribution is $\mu = \frac{1}{N} \sum_{i=1}^{N} \delta_{z_i}$.
A discrete distribution $\nu = \sum_{j=1}^{m} b_j \delta_{y_j}$ (with barycenter features $y_j$ and weights $b_j$) minimizes the $2$-Wasserstein distance to $\mu$. The alternating optimization proceeds as:
- Fix $\{y_j\}$ (support), update $\{b_j\}$ (weights): solve an optimal transport LP to match mass from real to barycenter features, with the weights projected onto the simplex.
- Fix $\{b_j\}$, update $\{y_j\}$: Newton-style updates place barycenter features at the mean of their assigned real features.
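The alternating scheme above can be sketched in simplified form: when the barycenter weights are left unconstrained, the transport step reduces to nearest-support assignment, and the position step places each support point at the mean of its assigned samples. This numpy sketch uses that simplification rather than the paper's LP/Newton machinery.

```python
import numpy as np

def free_support_barycenter(X, m, iters=50, seed=0):
    """Free-support W2 barycenter of a single empirical measure (simplified:
    with unconstrained barycenter weights, the transport step reduces to
    nearest-support assignment, and the position step is a cluster mean).

    X: (N, d) real feature vectors; m: number of barycenter support points.
    Returns (Y, b): support positions (m, d) and simplex weights (m,).
    """
    rng = np.random.default_rng(seed)
    Y = X[rng.choice(len(X), size=m, replace=False)].copy()
    for _ in range(iters):
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)                        # transport step
        b = np.bincount(assign, minlength=m) / len(X)     # weight update
        for j in range(m):                                # position update
            if (assign == j).any():
                Y[j] = X[assign == j].mean(axis=0)
    return Y, b

# Two well-separated toy clusters; the support points find their means.
X = np.vstack([np.zeros((5, 2)), np.ones((5, 2)) * 10.0])
Y, b = free_support_barycenter(X, m=2)
```

In this degenerate single-measure case the procedure coincides with a weighted k-means; the full method replaces the hard assignment with an optimal transport plan.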
Once barycenters are obtained, synthetic images are optimized so that their embedded features land on the corresponding barycenter features $y_j$, with an auxiliary BatchNorm-matching loss ensuring intra-class variation.
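The image-optimization step can be illustrated with a toy stand-in: here the pretrained network is replaced by a fixed linear map `A` (an assumption of this sketch), the BatchNorm-matching term is omitted, and a synthetic input is driven by gradient descent until its feature hits a target barycenter feature.

```python
import numpy as np

def fit_to_barycenter(target, extractor, d_in, steps=500, lr=0.1, seed=0):
    """Optimize a synthetic input so its embedded feature lands on a target
    barycenter feature (toy sketch: 'extractor' is a fixed linear map
    standing in for a pretrained network; no BatchNorm-matching term).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d_in)
    for _ in range(steps):
        r = extractor @ x - target          # feature residual
        x -= lr * (extractor.T @ r)         # gradient of 0.5 * ||r||^2
    return x

A = np.array([[1.0, 0.0], [0.0, 2.0]])      # toy linear "feature extractor"
y = np.array([1.0, 4.0])                    # a barycenter feature
x = fit_to_barycenter(y, A, d_in=2)
# A @ x converges onto the target barycenter feature y.
```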
6. Empirical Properties and Efficiency
Barycentric feature distillation achieves:
- Real-time evaluation and deformation for meshes with up to 1 million faces, with all steps (distillation, extraction, weight computation) completed in under one minute on commodity hardware (Liu et al., 18 Jan 2026).
- State-of-the-art accuracy in dataset distillation, with ImageNet-1K top-1 accuracy reaching 60.7% at 100 images per class, compared to a full-data accuracy of 63.1% (Liu et al., 2023).
- Cross-architecture generalization, as synthetic sets distilled for one backbone remain effective for others.
Efficiency derives from the geometric meaningfulness and low-cardinality support of barycenter summaries, as well as the decoupling of distillation from repeated network retraining.
7. Limitations and Future Prospects
Barycentric feature distillation is constrained by the necessity for pretrained deep feature extractors, which may not exist in all domains. Free-support barycenter computation, while efficient, adds extra overhead relative to simpler moment-matching approaches, although the two-step Newton/transport procedure converges in a few hundred iterations. In extreme-compression regimes (e.g., 1 image per class on ImageNet), absolute accuracy remains low, suggesting further metric generalization (e.g., sliced- or Gromov–Wasserstein) as promising future directions (Liu et al., 2023). Extension to self-supervised feature spaces and generative priors is another open topic.
Key References:
- "Deep Feature Deformation Weights" (Liu et al., 18 Jan 2026)
- "Dataset Distillation via the Wasserstein Metric" (Liu et al., 2023)