3D Gaussian Forward Skinning

Updated 21 March 2026

3D Gaussian Forward Skinning is a volumetric deformation paradigm that uses canonical 3D Gaussian primitives and analytic kinematics (like LBS) to animate detailed avatars.
It maps Gaussian attributes from rest-pose to dynamic configurations via skinning weights and refined MLP modules, ensuring accurate covariance and orientation transfer.
The framework underpins real-time rendering, relightable avatars, and articulated object modeling, delivering measurable performance gains and improved image quality.

3D Gaussian Forward Skinning is a volumetric deformation paradigm underpinning modern articulated, animatable, and relightable representations for humans and articulated objects. The framework connects canonical-space 3D Gaussian primitives to dynamic, observation-space configurations via analytic kinematic models such as linear blend skinning (LBS) or related mechanisms. By enabling efficient, differentiable rendering, high-fidelity animation, and real-time performance, 3D Gaussian Forward Skinning has become foundational in contemporary avatar construction, virtual reality, and articulated object modeling (Zhan et al., 2024, Hu et al., 2023, Li et al., 25 Jun 2025, Zioulis et al., 14 Sep 2025, Li et al., 2024, Zielonka et al., 2023, Tian et al., 29 Apr 2025, Liu et al., 26 Feb 2025, Yao et al., 21 Mar 2025, Wu et al., 4 Feb 2026).

1. Canonical 3D Gaussian Representation

3D Gaussian Forward Skinning begins with a discrete set of anisotropic Gaussian primitives parameterized in a canonical (typically rest-pose) frame. Each Gaussian is specified by:

Center $\mu \in \mathbb{R}^3$,
Anisotropic 3D covariance $\Sigma \in \mathbb{R}^{3\times3}$ (typically $\Sigma = R\,\mathrm{diag}(\sigma^2)\,R^\top$ for scale/rotation),
Unit quaternion or $3 \times 3$ orthogonal matrix for orientation,
Per-axis scale $\sigma \in \mathbb{R}_+^3$,
Volumetric opacity $\alpha \in (0,1)$,
Color (RGB or spherical-harmonic expansion) $c$,
Other optional physical/appearance/BRDF parameters.

The canonical space arrangement is often seeded from SMPL/SMAL mesh vertices, mesh-volume splats, or custom tessellation, and attributes can be initialized by barycentric or inverse-distance interpolation (Zhan et al., 2024, Hu et al., 2023, Li et al., 25 Jun 2025).

The 3D Gaussian’s spatial density is

$G(x) = \alpha \exp\left(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\right)$

for each primitive. Covariance is maintained positive-definite by parameterization via eigen-decomposition or explicit rotation-scale factors.
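The parameterization above can be sketched in a few lines of NumPy. This is a minimal illustration, not any cited paper's implementation; the helper names (`quat_to_rotmat`, `covariance`, `density`) and the `(w, x, y, z)` quaternion ordering are assumptions chosen for the example.

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (normalizes q first)."""
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def covariance(q, sigma):
    """Sigma = R diag(sigma^2) R^T, positive-definite whenever every scale is > 0."""
    R = quat_to_rotmat(q)
    return R @ np.diag(np.asarray(sigma, dtype=float) ** 2) @ R.T

def density(x, mu, Sigma, alpha):
    """G(x) = alpha * exp(-0.5 (x - mu)^T Sigma^{-1} (x - mu))."""
    d = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return alpha * np.exp(-0.5 * d @ np.linalg.solve(Sigma, d))

# One Gaussian rotated 45 degrees about z, with illustrative per-axis scales.
mu = np.array([0.0, 0.0, 0.0])
Sigma = covariance(q=[0.9238795, 0.0, 0.0, 0.3826834], sigma=[0.3, 0.1, 0.05])
peak = density(mu, mu, Sigma, alpha=0.8)  # density at the center equals alpha
```

Because the covariance is built from a rotation and strictly positive scales, its eigenvalues are the squared scales, so positive-definiteness holds by construction rather than by projection.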

2. Forward Skinning: Kinematic Deformation to Posed Space

To animate Gaussian clouds, means and covariances are mapped to pose space via kinematic transformations defined by underlying skeletal models. The prevalent mechanism is Linear Blend Skinning (LBS), with extensions for non-linear, part-based, or cage-based deformations.

For a skeleton with $B$ joints and forward kinematic transforms $T_b \in SE(3)$:

Each Gaussian is assigned a set of skinning weights $w_{i,b}$ (directly, or via interpolation from mesh/texture).
The posed mean is

$\bar{\mu}_i = \sum_{b=1}^{B} w_{i,b}\left(R_b\,\mu_i + t_b\right)$

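The posed covariance (for rigid LBS) is obtained by pushing the canonical covariance through the blended linear map $\bar{R}_i = \sum_{b=1}^{B} w_{i,b} R_b$:

$\bar{\Sigma}_i = \bar{R}_i\,\Sigma_i\,\bar{R}_i^\top$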

where $R_b$ and $t_b$ denote the rotation and translation components of $T_b$ (Hu et al., 2023, Li et al., 25 Jun 2025).
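As a rough sketch of the forward-skinning step, the following NumPy function blends per-bone transforms and applies them to a batch of Gaussians. The function name `lbs_forward` and the array layout are assumptions made for illustration. The covariance is pushed through the same blended linear map as the mean, which is one common convention; when a Gaussian is bound to a single bone this reduces to $R_b \Sigma_i R_b^\top$.

```python
import numpy as np

def lbs_forward(mu, Sigma, weights, R, t):
    """
    Linear blend skinning of Gaussian means and covariances.
    mu: (N, 3), Sigma: (N, 3, 3), weights: (N, B), R: (B, 3, 3), t: (B, 3).
    """
    A = np.einsum("nb,bij->nij", weights, R)        # blended linear part per Gaussian
    b = weights @ t                                 # blended translation per Gaussian
    mu_posed = np.einsum("nij,nj->ni", A, mu) + b   # sum_b w_b (R_b mu + t_b)
    Sigma_posed = A @ Sigma @ np.swapaxes(A, 1, 2)  # A Sigma A^T
    return mu_posed, Sigma_posed

# Two bones: identity, and a 90-degree rotation about z plus a shift along x.
Rz90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
R = np.stack([np.eye(3), Rz90])
t = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])

mu = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
Sigma = np.tile(np.diag([0.04, 0.01, 0.01]), (2, 1, 1))
weights = np.array([[1.0, 0.0], [0.0, 1.0]])  # each Gaussian bound to one bone

mu_posed, Sigma_posed = lbs_forward(mu, Sigma, weights, R, t)
```

In this toy setup the second Gaussian's long axis moves from x to y, as expected for a 90-degree rotation about z.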

Cage-based and part-based variants apply affine maps derived from local deformation gradients (e.g., tetrahedral cages in (Zielonka et al., 2023), part dynamics in (Liu et al., 26 Feb 2025)), enabling explicit modeling of stretching, shearing, and cloth sliding absent from pure LBS.

Covariance and orientation transfer require special attention: naïve linear blending of rotations is typically invalid for $SO(3)$, because a weighted sum of rotation matrices is in general not itself a rotation. Weighted quaternion averaging enforces proper rigid-body rotation transfer for Gaussian orientation (Zioulis et al., 14 Sep 2025, Li et al., 2024, Zhan et al., 2024). The resulting rotation-applied covariance is

$\bar{\Sigma}_i = R(q_i) \mathrm{diag}(\sigma_i^2) R(q_i)^\top$

where $q_i$ is the blended quaternion.
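The difference between blending rotation matrices and averaging quaternions can be checked numerically. The sketch below uses the eigenvector form of weighted quaternion averaging (the dominant eigenvector of $\sum_b w_b\, q_b q_b^\top$). This is one standard way to average rotations and is assumed here for illustration; the cited works may use a different blending rule. Function names are hypothetical.

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (normalizes q first)."""
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def average_quaternions(quats, weights):
    """Weighted quaternion average: dominant eigenvector of sum_b w_b q_b q_b^T."""
    quats = np.asarray(quats, dtype=float)
    weights = np.asarray(weights, dtype=float)
    M = np.einsum("b,bi,bj->ij", weights, quats, quats)
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues are returned in ascending order
    return eigvecs[:, -1]           # sign is arbitrary; q and -q give the same rotation

# Two bone rotations: identity and 90 degrees about z, equally weighted.
half = np.deg2rad(90.0) / 2.0
q_bones = np.array([[1.0, 0.0, 0.0, 0.0], [np.cos(half), 0.0, 0.0, np.sin(half)]])
w = np.array([0.5, 0.5])

R_linear = sum(wi * quat_to_rotmat(qi) for wi, qi in zip(w, q_bones))
R_quat = quat_to_rotmat(average_quaternions(q_bones, w))

det_linear = np.linalg.det(R_linear)  # 0.5: the blended matrix shrinks volume
det_quat = np.linalg.det(R_quat)      # 1.0: a proper rotation (45 degrees about z)
```

The linearly blended matrix has determinant 0.5, so applying it to a covariance would spuriously shrink the Gaussian, whereas the quaternion average stays in $SO(3)$.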

For accurate deformation and relighting, per-Gaussian attributes are interpolated from skeletal or mesh sources, often using $k$-nearest neighbors with inverse-distance weights:

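$a_g = \frac{ \sum_{j \in S_g} \frac{1}{\|x_g - x^j\|} a^j }{ \sum_{j \in S_g} \frac{1}{\|x_g - x^j\|} }$

where $a$ stands for any interpolated attribute, $x_g$ is the Gaussian center, and $S_g$ indexes its nearest source points $x^j$. The rule is applied to orientation, scales, normals, visibility, and other attributes (Zhan et al., 2024).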
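A brute-force version of this inverse-distance interpolation might look like the following. It is only a sketch: the function name `knn_idw`, the small `eps` guard against zero distances, and the dense distance matrix are assumptions for readability, and a real pipeline would more likely use a spatial index.

```python
import numpy as np

def knn_idw(query, sources, attrs, k=3, eps=1e-8):
    """
    Interpolate attributes at query points from the k nearest source points.
    query: (G, 3), sources: (V, 3), attrs: (V, C) -> returns (G, C).
    """
    d = np.linalg.norm(query[:, None, :] - sources[None, :, :], axis=-1)  # (G, V)
    idx = np.argsort(d, axis=1)[:, :k]          # indices of the k nearest sources
    dk = np.take_along_axis(d, idx, axis=1)     # their distances, shape (G, k)
    w = 1.0 / (dk + eps)                        # inverse-distance weights
    w /= w.sum(axis=1, keepdims=True)           # normalize per query point
    return np.einsum("gk,gkc->gc", w, attrs[idx])

sources = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
attrs = np.array([[0.0], [1.0], [2.0]])
query = np.array([[0.5, 0.0, 0.0],      # midway between sources 0 and 1
                  [0.0, 1.0, 0.0]])     # coincides with source 2

values = knn_idw(query, sources, attrs, k=2)
```

A query midway between two sources receives their mean, and a query that coincides with a source essentially inherits that source's value.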

Refinement modules, typically MLPs, correct for imprecise mesh-driven weights or bone misalignments. Pose and LBS-weight refinement networks are trained end-to-end under rendering losses (Hu et al., 2023, Tian et al., 29 Apr 2025). Pose optimization via backpropagation improves geometric consistency in ambiguous or low-visibility regions (Li et al., 2024, Hu et al., 2023).

Weight initialization strategies range from inheriting SMPL mesh-vertex weights, through learning weights from part-based distances (as in the coarse-to-fine clustering of (Liu et al., 26 Feb 2025)), to explicit optimization against task losses.
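One simple way to combine inherited weights with a learned correction is to add an offset in log space and renormalize with a softmax. The sketch below illustrates that idea only. The function `refine_weights` and the hand-written `delta` array (standing in for the output of a refinement MLP) are hypothetical and do not reproduce any specific cited method.

```python
import numpy as np

def refine_weights(w_init, delta, eps=1e-8):
    """Softmax over bones of log(w_init) + delta; rows stay non-negative and sum to 1."""
    logits = np.log(w_init + eps) + delta
    logits -= logits.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

# Inherited weights for two Gaussians over three bones.
w_init = np.array([[0.7, 0.3, 0.0], [0.1, 0.1, 0.8]])
# Stand-in for a learned per-Gaussian offset (hypothetical values).
delta = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])

w_ref = refine_weights(w_init, delta)
```

With a zero offset the inherited weights are recovered (up to `eps`), so the refinement starts from the mesh-driven initialization and only departs from it where the offset is non-zero.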

4. Rendering, Relighting, and Real-Time Performance

The posed 3D Gaussians are projected to camera space using analytic Jacobians, yielding 2D ellipses splatted with an efficient, tile-based GPU rasterizer (Zhan et al., 2024, Hu et al., 2023, Li et al., 25 Jun 2025). Rendering incorporates radiometric attributes:

Environment lightmaps or area-light probes integrated with learned per-Gaussian BRDF parameters enable physically-based, relightable image synthesis (Zhan et al., 2024).
Shadowing is efficiently handled via mesh/vertex-based rasterization techniques, with visibility transferred to Gaussians through the same interpolation as for other attributes (Zhan et al., 2024).
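The projection step mentioned above can be sketched with the usual first-order (Jacobian) approximation of a pinhole camera: the 3D covariance is rotated into camera space and multiplied on both sides by the Jacobian of the perspective projection evaluated at the Gaussian center. The function name, argument layout, and pinhole model without principal-point offset are assumptions for illustration.

```python
import numpy as np

def project_gaussian(mu_world, Sigma_world, R_cam, t_cam, fx, fy):
    """Project a 3D Gaussian to a 2D image-plane mean and 2x2 covariance."""
    x, y, z = R_cam @ mu_world + t_cam            # center in camera coordinates
    # Jacobian of (fx * x / z, fy * y / z) with respect to (x, y, z).
    J = np.array([
        [fx / z, 0.0, -fx * x / z**2],
        [0.0, fy / z, -fy * y / z**2],
    ])
    Sigma_cam = R_cam @ Sigma_world @ R_cam.T     # rotate covariance into camera frame
    mean_2d = np.array([fx * x / z, fy * y / z])
    return mean_2d, J @ Sigma_cam @ J.T

# Isotropic Gaussian on the optical axis at depth 2, identity camera pose.
mean_2d, Sigma_2d = project_gaussian(
    mu_world=np.array([0.0, 0.0, 2.0]),
    Sigma_world=0.01 * np.eye(3),
    R_cam=np.eye(3),
    t_cam=np.zeros(3),
    fx=100.0,
    fy=100.0,
)
```

For this on-axis case the Jacobian is a pure scaling by `fx / z = 50`, so a 3D standard deviation of 0.1 becomes a 5-pixel standard deviation in the image.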

Losses imposed during optimization include photometric, perceptual, material smoothness, geometric regularization (preventing over-smoothing or collapse), and scale constraints.

Real-time performance is achieved by exploiting splat-based rendering (as opposed to ray-based NeRF approaches), efficient KNN updates, pruning, and hierarchical splitting/merging based on gradients and KL divergence (Hu et al., 2023, Li et al., 25 Jun 2025). Reported rendering speeds are in the 100–200 FPS range for dynamic avatars (Li et al., 25 Jun 2025, Hu et al., 2023), compared with 6.9 FPS when full shadow computation is included (Zhan et al., 2024) and more than 47 FPS when shadow computation is omitted.
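The KL divergence between two Gaussians has a closed form, which is what makes it cheap enough to use as a similarity test during densification. The sketch below evaluates that closed form. How the value is thresholded or combined with gradient statistics is method-specific and not shown, and the function name is an assumption.

```python
import numpy as np

def kl_gaussian(mu0, S0, mu1, S1):
    """KL( N(mu0, S0) || N(mu1, S1) ) for k-dimensional Gaussians."""
    k = mu0.shape[0]
    S1_inv = np.linalg.inv(S1)
    d = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return 0.5 * (np.trace(S1_inv @ S0) + d @ S1_inv @ d - k + logdet1 - logdet0)

I3 = np.eye(3)
kl_same = kl_gaussian(np.zeros(3), I3, np.zeros(3), I3)                   # identical Gaussians
kl_shift = kl_gaussian(np.zeros(3), I3, np.array([1.0, 0.0, 0.0]), I3)    # unit shift in mean
kl_scale = kl_gaussian(np.zeros(3), I3, np.zeros(3), 4.0 * I3)            # wider second Gaussian
```

Identical Gaussians give zero divergence, while a unit shift of the mean under unit covariance gives 0.5, so a small value indicates two primitives that are nearly redundant.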

5. Extensions to Part-Based, Non-Rigid, and Cage Models

While standard LBS handles rigid-body articulation, many avatars display complex non-rigid deformations. Cage-based models (e.g., tetrahedral cages in (Zielonka et al., 2023)) directly warp both the mean and the full covariance by local affine transformations $J_i$:

$\Sigma' = J_i \Sigma J_i^\top$

This enables direct modeling of stretching/shear in garments, facial features, or animal appendages.
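For a tetrahedral cage, a local affine map can be taken as the deformation gradient of the enclosing tetrahedron, computed from its rest and posed edge vectors. The following sketch assumes that construction and uses hypothetical function names. It illustrates the covariance rule $\Sigma' = J \Sigma J^\top$ and is not the exact pipeline of any cited work.

```python
import numpy as np

def tet_deformation_gradient(rest, posed):
    """Deformation gradient J of a tetrahedron; rest and posed are (4, 3) vertex arrays."""
    D_rest = (rest[1:] - rest[0]).T     # columns are rest-pose edge vectors
    D_posed = (posed[1:] - posed[0]).T  # columns are posed edge vectors
    return D_posed @ np.linalg.inv(D_rest)

def warp_gaussian(mu, Sigma, rest, posed):
    """Apply the tetrahedron's affine map to a Gaussian mean and covariance."""
    J = tet_deformation_gradient(rest, posed)
    mu_new = posed[0] + J @ (mu - rest[0])
    return mu_new, J @ Sigma @ J.T

# Unit tetrahedron stretched by a factor of 2 along x.
rest = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
posed = rest * np.array([2.0, 1.0, 1.0])

mu_new, Sigma_new = warp_gaussian(np.array([0.25, 0.25, 0.25]), 0.01 * np.eye(3), rest, posed)
```

Unlike a rotation, this map changes the covariance eigenvalues: the stretch doubles the standard deviation along x, which is the stretching behavior pure LBS cannot express.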

Part-based approaches (as in (Liu et al., 26 Feb 2025)) allocate distinct Gaussian sets to independently articulated parts, with skinning weights determined via soft Mahalanobis-based assignment or Gumbel-Softmax. Forward skinning in such modules generalizes LBS, blending per-part rigid transformations and handling both first and second-order moments of the Gaussians. Multilayered compositional pipelines further increase model expressiveness and editing capabilities (Zielonka et al., 2023, Liu et al., 26 Feb 2025).
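A soft, Mahalanobis-based part assignment can be sketched as a softmax over negative squared Mahalanobis distances to per-part centers. The part centers, part covariances, temperature `tau`, and function name below are illustrative assumptions; the stochastic Gumbel-Softmax variant is omitted to keep the example deterministic.

```python
import numpy as np

def soft_part_weights(mu, centers, covs, tau=1.0):
    """
    Soft assignment of Gaussians to parts.
    mu: (N, 3) Gaussian centers, centers: (P, 3), covs: (P, 3, 3) -> weights (N, P).
    """
    diff = mu[:, None, :] - centers[None, :, :]                # (N, P, 3)
    inv = np.linalg.inv(covs)                                  # (P, 3, 3)
    d2 = np.einsum("npi,pij,npj->np", diff, inv, diff)         # squared Mahalanobis distance
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)                # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

centers = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
covs = np.stack([np.eye(3), np.eye(3)])
mu = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [2.5, 0.0, 0.0]])

part_w = soft_part_weights(mu, centers, covs)
```

Gaussians at a part center are assigned almost entirely to that part, while one halfway between two parts splits its weight evenly. Lowering `tau` sharpens the assignment toward a hard partition.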

Hybrid models combine rigid LBS with non-rigid residual deformation fields (e.g., hexplane-MLP corrections in (Wu et al., 4 Feb 2026)), achieving a trade-off between interpretability, editability, and reconstruction fidelity.

6. Applications, Limitations, and Performance Analysis

3D Gaussian forward skinning is now a foundational methodology for relightable avatars, clothed human reconstruction, articulated animal modeling, dynamic background-foreground separation, and articulated part reconstruction. Notable reported results include:

Quantitative improvements over deformation-mesh-based NeRFs, with a PSNR increase of 0.2–0.4 dB or more, a 0.5–1% SSIM gain, and a 5–10% LPIPS reduction via proper quaternion-averaged rotation (Zioulis et al., 14 Sep 2025).
Rendering rates as high as 189 FPS for avatars using ∼13k articulated Gaussians (Hu et al., 2023), with performance scaling linearly with Gaussian count and shadowing requirements (Zhan et al., 2024).
Improved geometric fidelity and reduced memory via 2D surfel-aligned forward skinning for the human body (Tian et al., 29 Apr 2025).
A single pipeline that adapts to animal scene reconstruction using accurately posed SMAL models (Li et al., 25 Jun 2025).

Identified limitations of standard LBS include invalid rotation transfer for anisotropic covariances (necessitating weighted quaternion blending), inability to express highly non-rigid local effects (addressed by cage, MLP, or residual refinement), and challenges in balancing geometric density versus computational resources. A plausible implication is that future systems will further hybridize analytic kinematic models with learned local deformation fields for optimal performance and editability.

7. Summary Table: Variants and Core Equations

Model/Framework	Forward-Skinning Equation(s)	Notable Extensions/Features
(Zhan et al., 2024)	LBS of position and covariance, interp. all attributes	Volumetric mesh seeding, explicit relighting and fast shadowing
(Hu et al., 2023)	$\bar{\mu}_i = \sum_b w_{i,b}T_b(\mu_i)$	MLP-based pose/LBS refinement, KL-split/merge, tile splatting
(Li et al., 25 Jun 2025)	LBS applied to mesh+offset rep., CNN for UV attributes	PoP predictor CNN, automatic point density via texture
(Zioulis et al., 14 Sep 2025)	Quaternion-averaged rotation for SO(3)-covariance	Algebraic correction to LBS, 0.2–0.4 dB PSNR increase
(Li et al., 2024)	LBS with local rigidity/isometry priors	Joint optimization of pose/Gaussian params, split-with-scale
(Zielonka et al., 2023)	Cage affine J: $\Sigma' = J\Sigma J^\top$	Per-part tetrahedral cages, MLP Gaussian corrections
(Tian et al., 29 Apr 2025)	LBS of 2D Gaussians (surfels)	Pose calibration/weight correction MLPs, 17% fewer Gaussians
(Liu et al., 26 Feb 2025)	$\mu'_j=\sum_i w_{ij}(R_i\mu_j + t_i)$	Unsupervised part binding, Gumbel-Softmax, dynamic part clusters
(Wu et al., 4 Feb 2026)	LBS then nonrigid hexplane-MLP correction	Editable 4D avatars, skeleton editing in real time
(Yao et al., 21 Mar 2025)	Node-based learned skinning, residual pose MLP	Skeleton extraction, ARAP, pose-aware detail MLP

References

Interactive Rendering of Relightable and Animatable Gaussian Avatars (Zhan et al., 2024)
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos (Hu et al., 2023)
SkinningGS: Editable Dynamic Human Scene Reconstruction Using Gaussian Splatting Based on a Skinning Model (Li et al., 25 Jun 2025)
On the Skinning of Gaussian Avatars (Zioulis et al., 14 Sep 2025)
GaussianBody: Clothed Human Reconstruction via 3d Gaussian Splatting (Li et al., 2024)
Drivable 3D Gaussian Avatars (Zielonka et al., 2023)
EfficientHuman: Efficient Training and Reconstruction of Moving Human using Articulated 2D Gaussian (Tian et al., 29 Apr 2025)
ArtGS: Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting (Liu et al., 26 Feb 2025)
RigGS: Rigging of 3D Gaussians for Modeling Articulated Objects in Videos (Yao et al., 21 Mar 2025)
SkeletonGaussian: Editable 4D Generation through Gaussian Skeletonization (Wu et al., 4 Feb 2026)