Barycentric Feature Distillation

Updated 25 January 2026
  • The paper introduces barycentric feature distillation, a method that transfers deep semantic features onto 3D meshes using barycenter-based interpolation for precise, real-time deformations.
  • It leverages multi-view deep feature extraction and a lightweight MLP to efficiently map image-derived features to a continuous 3D field, independent of mesh topology.
  • The technique also extends to dataset distillation with Wasserstein barycenters, achieving competitive accuracy while enabling extreme data compression and cross-architecture generalization.

Barycentric Feature Distillation refers to a class of techniques for summarizing or transferring information in high-dimensional feature spaces by leveraging barycentric coordinates or barycenter constructions, often in the context of deep learning for 3D shape editing and dataset distillation. Two main instantiations have driven its prominence: the barycentric feature distillation pipeline for high-resolution, semantically regularized mesh deformation (Liu et al., 18 Jan 2026), and the Wasserstein barycentric feature distillation framework for dataset distillation (Liu et al., 2023). Both employ barycentric or barycenter-based constructions to align embedded representations efficiently, enabling semantically structured editing or highly compressed dataset synthesis.

1. Deep Feature-to-Geometry Distillation

Handle-based mesh deformation, such as ARAP or biharmonic coordinates, provides efficient and precise shape manipulation but lacks semantic awareness. Barycentric feature distillation bridges the gap between such geometric frameworks and the semantic priors encoded in modern vision networks by distilling deep 2D features into a continuous 3D field over the mesh (Liu et al., 18 Jan 2026).

Given a 3D mesh $M=(V,F)$ and a pretrained 2D feature extractor $F_{\mathrm{2D}}$ (e.g., DINOv2), the process is as follows:

  • The mesh is rendered from diverse viewpoints, producing RGB images.
  • Deep features $Z_{ij} = F_{\mathrm{2D}}(I)_{ij}$ are extracted for each camera pixel.
  • Using triangle rasterization, every camera pixel inside a mesh triangle is mapped onto the 3D surface using barycentric weights $\lambda=(\lambda_a,\lambda_b,\lambda_c)$, giving $P_{ij} = \sum_t \lambda_t v_t$.
  • A small MLP $\varphi:\mathbb{R}^3\to\mathbb{R}^d$ is trained so that $\varphi(P_{ij})$ matches the normalized deep feature $Z_{ij}/\lVert Z_{ij}\rVert$ at that point.
  • Distillation complexity depends only on image resolution, not mesh topology, allowing real-time field recovery even for meshes with up to $10^6$ faces.

This continuous feature field $\varphi(x)$ enables immediate evaluation of semantic features at any mesh vertex, providing a direct link from image-based semantics to geometric manipulation.
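The barycentric mapping above can be sketched in a few lines of numpy. The triangle vertices, weights, and the stand-in function for the distilled field $\varphi$ are all illustrative assumptions, not values from the paper:

```python
import numpy as np

# Vertices v_a, v_b, v_c of one mesh triangle (illustrative values).
tri = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])

# Barycentric weights for one rasterized pixel; they sum to 1.
lam = np.array([0.2, 0.3, 0.5])
assert np.isclose(lam.sum(), 1.0)

# P_ij = sum_t lambda_t v_t : the pixel's point on the mesh surface.
P = lam @ tri

# Stand-in for the distilled field phi: any smooth map R^3 -> R^d.
def phi(x, d=4):
    return np.tanh(np.tile(x, 2)[:d] + 0.1)

feat = phi(P)   # semantic feature evaluated at the surface point
```

Because $\varphi$ is defined on all of $\mathbb{R}^3$, the same call works for any vertex of the full-resolution mesh, independent of topology.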

2. Mathematical Formulation and Optimization

The fitting objective for barycentric feature distillation is constructed as a per-pixel loss over rasterized points: $E_{\mathrm{distill}}(\varphi) = \sum_{(i,j)\in\Omega} \lVert \varphi(P_{ij}) - \hat{Z}_{ij}\rVert_2^2$, where $\Omega$ is the set of all rendered pixels covering mesh faces. The MLP $\varphi$ is optimized using Adam over batches of $(P_{ij}, \hat{Z}_{ij})$ pairs (Liu et al., 18 Jan 2026).
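A minimal numpy sketch of this objective, substituting a linear map for the MLP and plain gradient descent for Adam; the batch sizes, dimensions, and random data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy batch of (P_ij, Z_hat_ij) pairs: 3D points and unit-normalized features.
P = rng.normal(size=(256, 3))
Z = rng.normal(size=(256, 8))
Z_hat = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # Z_ij / ||Z_ij||

# Linear stand-in for the MLP phi (the paper trains a small MLP with Adam).
W = np.zeros((3, 8))

def loss(W):
    # E_distill = sum_{(i,j)} || phi(P_ij) - Z_hat_ij ||_2^2
    return np.sum((P @ W - Z_hat) ** 2)

lr = 1e-3
L0 = loss(W)
for _ in range(200):
    grad = 2 * P.T @ (P @ W - Z_hat)   # gradient of the squared-error loss
    W -= lr * grad
L1 = loss(W)
assert L1 < L0   # fit improves over the initialization
```

The cost of each step scales with the number of rendered pixels, not with mesh size, which is the source of the topology independence noted above.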

To map features back to deformation weights, feature proximity is used: $W_{ij}=\max\{F_{\mathrm{sim}}(Z_{v_i}, Z_{v_j}), 0\}$ with $F_{\mathrm{sim}}(u,v)=1-\lVert u-v\rVert_2$, where $Z_{v_i}=\varphi(v_i)$ are per-vertex features after distillation. Subsequent handle-based deformations are applied using these semantically informed weights with classical linear blend skinning, giving $O(nK)$ run-time complexity for $n$ vertices and $K$ handles.
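The similarity weights can be computed in one vectorized pass; the per-vertex feature values below are illustrative assumptions:

```python
import numpy as np

# Per-vertex features Z_{v_i} = phi(v_i) after distillation (illustrative).
Z_v = np.array([[0.90, 0.10],
                [0.88, 0.12],    # semantically close to vertex 0
                [0.00, 1.00]])   # an unrelated part

# W_ij = max( 1 - ||Z_vi - Z_vj||_2 , 0 )
diff = Z_v[:, None, :] - Z_v[None, :, :]
W = np.maximum(1.0 - np.linalg.norm(diff, axis=-1), 0.0)

assert np.allclose(np.diag(W), 1.0)   # each vertex fully similar to itself
assert W[0, 1] > W[0, 2]              # correlated vertices get larger weight
```

The clipping at zero makes $W$ sparse in practice, since dissimilar feature pairs (distance above 1) contribute nothing.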

Optional geometric post-processing includes locality weighting by normalized geodesic distance and feature-anchor constraints.

3. Pseudocode Pipeline for Barycentric Distillation

The following outlines the practical pipeline for barycentric feature distillation in mesh deformation (Liu et al., 18 Jan 2026):

  1. Quadric simplification of the mesh to a tractable proxy (e.g., $50{,}000$ faces).
  2. Generation of rasterization points and deep features via multi-view camera renders.
  3. Assembly of (3D point, feature) data via barycentric mapping per triangle and per-pixel.
  4. Training of the MLP $\varphi$ to match features at surface points.
  5. Forward evaluation of $\varphi$ on the high-resolution mesh to cache per-vertex features.
  6. Construction of the similarity-based weight matrix $W_{ij}$, optionally sparsified.
  7. Real-time handle-based deformation through local linear-blend weighted sums.
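Step 7 reduces to a single matrix product per edit. A translation-only sketch of the linear-blend update, with hypothetical vertices, handles, and weights:

```python
import numpy as np

# Vertices of a tiny mesh and K=2 handle translations (illustrative).
V = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [2.0, 0.0, 0.0]])
T = np.array([[0.0, 1.0, 0.0],     # handle 0 moves up by one unit
              [0.0, 0.0, 0.0]])    # handle 1 stays put

# Semantic weights W (n x K), one row per vertex, normalized to sum to 1.
W = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])

# Linear blend skinning with translation-only handles: O(nK) per edit.
V_def = V + W @ T

assert np.allclose(V_def[0], [0.0, 1.0, 0.0])   # follows handle 0
assert np.allclose(V_def[2], V[2])              # handle 1 did not move
```

The middle vertex, weighted half-and-half, moves halfway; general handle transforms would replace the translations with full affine matrices.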

Extremely high performance is achieved: distillation takes $\sim 30$ seconds on $100$ million pixels, feature extraction on a million-vertex mesh requires $\sim 0.5$ seconds, and individual edits can be performed in $\sim 20$ ms.

4. Semantic Propagation, Symmetry, and Generalization

A core property of barycentric feature distillation is semantic co-deformation: semantically correlated mesh parts, as identified by feature similarity, naturally propagate edits. For example, moving a handle on one chair leg affects all legs similarly without explicit constraints.

Automatic semantic symmetry detection can be performed by reflecting feature fields across candidate planes and measuring cross-reflection feature alignment: $\frac{1}{|V|}\sum_{v\in V^+}\lVert \varphi(v) - \varphi(R_P(v))\rVert_2 + \cdots < \varepsilon$. If this criterion is satisfied, deformations preserve the inferred symmetry through mirrored handle transforms.
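The core of this check is easy to sketch. The reflection plane, vertex sample, and feature field below are illustrative assumptions chosen so the field is exactly mirror-symmetric:

```python
import numpy as np

# Candidate reflection across the plane x = 0: R_P negates the x coordinate.
def R_P(v):
    return v * np.array([-1.0, 1.0, 1.0])

# Vertices on the positive side V^+ of the candidate plane (illustrative).
V_plus = np.array([[0.5, 0.2, 0.0],
                   [1.0, 0.7, 0.3]])

# Stand-in feature field that depends only on |x|, hence mirror-symmetric.
def phi(v):
    return np.array([abs(v[0]), v[1] + v[2]])

# Mean cross-reflection feature misalignment over V^+.
err = np.mean([np.linalg.norm(phi(v) - phi(R_P(v))) for v in V_plus])

eps = 1e-6
assert err < eps   # symmetry detected -> mirrored handle edits are enabled
```

For an asymmetric field (e.g., one that depends on the signed $x$), `err` would exceed the threshold and the candidate plane would be rejected.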

5. Wasserstein Barycentric Feature Distillation for Dataset Compression

In dataset distillation, barycentric feature distillation is realized through computation of free-support Wasserstein barycenters in pretrained feature spaces (Liu et al., 2023). For a class $c$ with $n$ real feature vectors $Z_c=\{z_{c,i}\}$, the empirical distribution is $\nu_c = \frac{1}{n}\sum_{i=1}^n \delta_{z_{c,i}}$.

A discrete distribution $\mu_c = \sum_{j=1}^m w_{c,j}\delta_{b_{c,j}}$ (with barycenter features $b_{c,j}$ and weights $w_{c,j}$) is sought that minimizes the $2$-Wasserstein distance to $\nu_c$. The alternating optimization proceeds as:

  • Fix $B$ (support), update $w$ (weights): solve an optimal transport LP matching mass from real to barycenter features, with the resulting weights projected onto the simplex.
  • Fix $w$, update $B$: Newton-style gradient updates move each barycenter feature to the (weighted) mean of its assigned real features.
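The alternation above can be sketched in a deliberately simplified form: uniform fixed weights (so the simplex-projection step is skipped), a brute-force matching in place of the OT linear program, and a plain mean in place of the Newton update. All data are illustrative:

```python
import itertools
import numpy as np

# n=4 real feature vectors in two clusters (illustrative); m=2 barycenters.
Z = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
B = np.array([[1.0, 1.0], [4.0, 4.0]])      # initial free support
n, m = len(Z), len(B)
# Uniform weights 1/m: replicate each barycenter into n/m transport "slots".
slots = [j for j in range(m) for _ in range(n // m)]

for _ in range(5):
    # Transport step: optimal matching of reals to barycenter slots
    # (brute force over permutations here; the paper solves an OT LP).
    best = min(itertools.permutations(range(n)),
               key=lambda p: sum(np.sum((Z[i] - B[slots[p[i]]]) ** 2)
                                 for i in range(n)))
    assign = [slots[best[i]] for i in range(n)]
    # Support step: each barycenter moves to the mean of its assigned reals.
    for j in range(m):
        B[j] = Z[[i for i in range(n) if assign[i] == j]].mean(axis=0)

# Each barycenter lands at the centroid of one cluster.
assert np.allclose(B, [[0.1, 0.0], [5.1, 5.0]])
```

The brute-force matching is exponential and only viable for toy sizes; it stands in for the LP purely to make the two alternating steps concrete.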

Once barycenters are obtained, synthetic images $x_{c,j}$ are optimized so that the embedded features $f_e(x_{c,j})$ land on the corresponding barycenter features $b_{c,j}$, with an auxiliary BatchNorm-matching loss ensuring intra-class variation.
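The feature-matching step amounts to gradient descent on the input. A toy sketch with a linear map standing in for the pretrained embedding $f_e$ (the real method backpropagates through a network, and the BatchNorm loss is omitted here); all values are illustrative:

```python
import numpy as np

# Toy linear embedding f_e(x) = A x, standing in for the pretrained network.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0]])
b = np.array([1.0, 2.0])        # target barycenter feature b_{c,j}

x = np.zeros(3)                 # "synthetic image" pixels being optimized
lr = 0.1
for _ in range(300):
    grad = 2.0 * A.T @ (A @ x - b)   # gradient of ||f_e(x) - b||_2^2
    x -= lr * grad

assert np.linalg.norm(A @ x - b) < 1e-8   # embedded features reach b
assert np.allclose(x, [0.5, 1.0, 0.5])    # descent finds the min-norm fit
```

Starting from zero, gradient descent stays in the row space of `A`, so it converges to the minimum-norm solution; with a nonlinear network the same loop applies but without that closed-form characterization.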

6. Empirical Properties and Efficiency

Barycentric feature distillation achieves:

  • Real-time evaluation and deformation for meshes with up to 1 million faces, with all steps (distillation, extraction, weight computation) completed in under one minute on commodity hardware (Liu et al., 18 Jan 2026).
  • State-of-the-art accuracy in dataset distillation, with ImageNet-1K top-1 accuracy reaching 60.7% at 100 images per class, compared to a full-data accuracy of 63.1% (Liu et al., 2023).
  • Cross-architecture generalization, as synthetic sets distilled for one backbone remain effective for others.

Efficiency derives from the geometric meaningfulness and low-cardinality support of barycenter summaries, as well as the decoupling of distillation from repeated network retraining.

7. Limitations and Future Prospects

Barycentric feature distillation is constrained by the necessity for pretrained deep feature extractors, which may not exist in all domains. Free-support barycenter computation, while efficient, adds extra overhead relative to simpler moment-matching approaches, although the two-step Newton/transport procedure converges in a few hundred iterations. In extreme-compression regimes (e.g., 1 image per class on ImageNet), absolute accuracy remains low, suggesting further metric generalization (e.g., sliced- or Gromov–Wasserstein) as promising future directions (Liu et al., 2023). Extension to self-supervised feature spaces and generative priors is another open topic.
