Deformable Gaussian Splatting
- Deformable Gaussian Splatting is a dynamic scene representation technique that employs time-dependent Gaussian primitives to model articulated and non-rigid motions with high fidelity.
- It parameterizes each Gaussian’s position, scale, orientation, and opacity using learnable deformation fields to enable real-time reconstruction and rendering.
- The approach is applied in surgical reconstruction, animatable avatars, and wireless radiance fields, demonstrating versatility in handling complex dynamic scenes.
Deformable Gaussian Splatting is a class of explicit dynamic scene representations in which 3D (or 2D) Gaussian primitives are spatially and temporally modulated to model non-rigid or articulated motion for high-fidelity, real-time reconstruction and rendering. This paradigm generalizes static 3D Gaussian Splatting by extending each primitive's parameters—position, scale, orientation, opacity, and appearance coefficients—via learnable, often highly parameterized deformation fields. Deformable Gaussian Splatting frameworks achieve state-of-the-art quality in tasks such as surgical scene reconstruction, dynamic avatar creation, in-the-wild radiance field synthesis, and even wireless radiance field modeling, offering real-time performance and modeling flexibility unattainable with implicit volumetric fields.
1. Fundamentals of Deformable Gaussian Splatting
A static 3D Gaussian splatting system represents a scene as a collection of anisotropic Gaussians, each defined by

$$G_i(\mathbf{x}) = \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{\top}\Sigma_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)\right),$$

where $\boldsymbol{\mu}_i$ is the mean (center), and $\Sigma_i = R_i S_i S_i^{\top} R_i^{\top}$ is the covariance (factorized into scale $S_i$ and rotation $R_i$). Each Gaussian also carries an opacity $\alpha_i$ and view-dependent color coefficients $c_i$ (frequently using spherical harmonics).
Rendering involves projecting each Gaussian to screen space according to the camera/world transformation, calculating 2D projected covariances, modulating opacities for each pixel, and accumulating color and depth via front-to-back alpha blending:

$$C(p) = \sum_{i=1}^{N} c_i\,\tilde{\alpha}_i(p)\prod_{j=1}^{i-1}\bigl(1-\tilde{\alpha}_j(p)\bigr),$$

where $\tilde{\alpha}_i(p)$ is the projected and attenuated opacity of Gaussian $i$ at pixel $p$.
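The front-to-back compositing at a single pixel can be sketched in NumPy. This is a simplification of the full pipeline: the Gaussians are assumed already depth-sorted, and the opacities stand for the already-projected, pixel-attenuated values.

```python
import numpy as np

def composite_front_to_back(colors, alphas):
    """Front-to-back alpha blending of depth-sorted Gaussians at one pixel.

    colors: (N, 3) per-Gaussian RGB contributions at this pixel.
    alphas: (N,)  projected, attenuated opacities in [0, 1].
    Returns the accumulated RGB color.
    """
    transmittance = 1.0          # fraction of light not yet absorbed
    out = np.zeros(3)
    for c, a in zip(colors, alphas):
        out += transmittance * a * c   # weight by remaining transmittance
        transmittance *= (1.0 - a)     # attenuate for Gaussians behind
        if transmittance < 1e-4:       # early termination, as in 3DGS
            break
    return out
```

In practice this loop runs inside a tile-based CUDA rasterizer over all covered pixels; the early-termination check is what makes front-to-back ordering efficient.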
Deformation is introduced by expanding the parameter space of each Gaussian with time-dependent transformations. Canonical parameter sets become functions of time or motion descriptors, resulting in a temporally coherent, spatially explicit primitive-based field.
2. Deformation Modeling Approaches
2.1 Per-Gaussian Life Cycles and Gaussian Basis Expansions
In high-speed, topology-changing domains such as surgical scenes, explicit "life cycles" are assigned to each Gaussian. Opacity and deformation are represented as learned basis expansions over time, e.g., for opacity

$$\alpha_i(t) = \sum_{k=1}^{K} w_{i,k}\,\phi_k(t),$$

with per-Gaussian weights $w_{i,k}$ over temporal basis functions $\phi_k$, allowing Gaussians to disappear post-shearing (irreversible deformations) (Shan et al., 2 Jan 2025). All deformation parameters—position, rotation (in quaternion or continuous 6D representations), and scale—are similarly parameterized with individualized temporal basis functions. No auxiliary regularizers beyond reconstruction and, where relevant, ranking losses are imposed.
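One plausible instantiation of such a temporal basis expansion (the exact basis functions in the cited work may differ) uses Gaussian bumps in time, so a primitive whose support ends before a shearing event contributes nothing afterwards:

```python
import numpy as np

def opacity_at(t, weights, centers, widths):
    """Per-Gaussian opacity as a learned temporal basis expansion:

        alpha_i(t) = sum_k w_k * exp(-(t - mu_k)^2 / (2 * s_k^2))

    weights, centers, widths: (K,) learned per-Gaussian parameters.
    A primitive whose bases all lie before a shearing event simply
    yields near-zero opacity afterwards -- it "dies" irreversibly.
    """
    basis = np.exp(-0.5 * ((t - centers) / widths) ** 2)
    return float(np.clip(np.dot(weights, basis), 0.0, 1.0))
```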
2.2 Embedding-Based Deformations and Coarse-to-Fine Temporal Decoding
Per-Gaussian embedding methods introduce a dedicated learned feature $f_i$ for each primitive, with global or multi-resolution temporal embeddings $z_c(t)$ and $z_f(t)$ providing coarse/fine temporal context. Deformation decoders model both slow and rapid motions:

$$\Delta\theta_i(t) = \mathcal{D}\bigl(f_i,\, z_c(t),\, z_f(t)\bigr)$$

(Bae et al., 2024, Jiao et al., 21 Mar 2026). This decouples local motion (per-Gaussian) from global temporal dynamics, enabling robust handling of spatially distinct dynamic regions.
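A minimal sketch of such a decoder follows; the dimensions and the simple concatenate-then-MLP scheme are illustrative, not the cited architectures. A shared network maps a per-Gaussian embedding plus coarse and fine temporal embeddings to a position offset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not those of the cited methods).
D_GAUSS, D_COARSE, D_FINE, HIDDEN = 8, 4, 4, 32

# Shared decoder weights (learned jointly with the embeddings in practice).
W1 = rng.standard_normal((D_GAUSS + D_COARSE + D_FINE, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, 3)) * 0.1   # -> (dx, dy, dz)
b2 = np.zeros(3)

def decode_offset(gauss_emb, z_coarse, z_fine):
    """Map per-Gaussian embedding + coarse/fine temporal context to a
    position offset: fast motion is carried by z_fine, slow drift by
    z_coarse, while gauss_emb localizes the motion to this primitive."""
    h = np.concatenate([gauss_emb, z_coarse, z_fine])
    h = np.maximum(W1.T @ h + b1, 0.0)        # ReLU hidden layer
    return W2.T @ h + b2
```

The same decoder would typically emit rotation and scale deltas as well; only the position head is shown.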
2.3 Physics-Informed and Geometry-Aware Deformations
In scenarios requiring physically plausible motion, e.g., modeling tissue or material response, physics-informed losses such as the Cauchy momentum residual are enforced:

$$\mathcal{L}_{\text{phys}} = \bigl\|\rho\,\dot{\mathbf{v}} - \nabla\cdot\boldsymbol{\sigma} - \rho\,\mathbf{b}\bigr\|^2,$$

and each Gaussian is treated as a Lagrangian material point with constitutive properties predicted by a material field MLP (Hong et al., 9 Nov 2025). In other works, geometry-aware constraints are introduced by augmenting deformation networks with features from 3D convolutional UNets operating on the canonical point cloud, ensuring local structure-aware motion and 3D consistency (Lu et al., 2024).
2.4 Hybrid and Anchor-Based Models
To avoid redundancy in per-Gaussian deformation, anchor-based approaches group Gaussians and apply shared, small deformation MLPs to each anchor, propagating coarse deformations to primitives and applying fine per-Gaussian corrections as residuals. This yields significant acceleration and compression while maintaining fidelity (Huang et al., 13 May 2025, Tu et al., 9 Jun 2025). GroupFlow (Tu et al., 9 Jun 2025) further accelerates by clustering Gaussians with similar motion trajectories and applying shared rigid transformations.
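A toy sketch of the anchor-based propagation step (function and variable names are hypothetical): a coarse offset predicted once per anchor, e.g. by a small shared MLP, is broadcast to its member Gaussians and refined by per-Gaussian residuals.

```python
import numpy as np

def deform_with_anchors(positions, anchor_ids, anchor_offsets, residuals):
    """Coarse-to-fine anchor deformation.

    positions:      (N, 3) canonical Gaussian centers.
    anchor_ids:     (N,)   index of the anchor each Gaussian belongs to.
    anchor_offsets: (A, 3) coarse offsets, evaluated once per anchor
                    rather than once per Gaussian (the source of speedup).
    residuals:      (N, 3) fine per-Gaussian corrections.
    """
    return positions + anchor_offsets[anchor_ids] + residuals
```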
3. Acceleration Strategies and Hierarchical Motion
Real-time performance is addressed via several algorithmic innovations:
- Adaptive Motion Hierarchies (AMHS): The scene is partitioned into spatial blocks, each tracked as static or dynamic via deformation magnitude and rendering loss criteria. Only Gaussians in actually dynamic blocks are warped, reducing per-frame computation and neural inference (Shan et al., 2 Jan 2025).
- Temporal Sensitivity Pruning: Hessian-based sensitivity scores track reconstruction impact over all frames; low-contribution Gaussians are pruned, keeping only primitives essential to time-varying content (Tu et al., 9 Jun 2025).
- Group-Based Transformations: Rather than per-Gaussian neural calls, rigid or affine transformations are fitted per group (via e.g., Umeyama alignment), shared among many Gaussians, resulting in speed and memory savings (Tu et al., 9 Jun 2025, Huang et al., 13 May 2025).
- Error-Guided Densification: Anchor injection based on error maps and robust depth allows rapid filling of poorly modeled regions in the canonical field, minimizing the deformation network’s burden (Jiao et al., 21 Mar 2026).
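The per-group rigid fit mentioned above can be sketched with the closed-form Kabsch/Umeyama alignment: given the canonical and deformed centers of one motion group, recover a shared rotation and translation in one SVD.

```python
import numpy as np

def fit_rigid(src, dst):
    """Least-squares rigid transform (R, t) with dst ~= src @ R.T + t,
    via the Kabsch/Umeyama SVD solution. One closed-form fit per motion
    group replaces thousands of per-Gaussian network evaluations."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)         # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```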
4. Optimization Objectives and Loss Functions
Photometric reconstruction loss forms the core supervision, often augmented as

$$\mathcal{L}_{\text{rgb}} = \bigl\| M \odot (\hat{I} - I) \bigr\|_1,$$

with $M$ as a mask (e.g., tool, occlusion). Depth supervision via ground-truth or stereo is used where available. Auxiliary ranking or regularization losses (e.g., for correct depth order or opacity constraints) are incorporated as additional weighted terms. Physics-informed and geometry-aware methods add terms such as $\mathcal{L}_{\text{phys}}$ or geometry-structure penalties. Local smoothness or Laplacian regularization on embeddings may be introduced to prevent spatial artifacts (Bae et al., 2024, Jiao et al., 21 Mar 2026).
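A hedged sketch of the composed objective (the weights, mask semantics, and signature are illustrative, not any one cited method's): masked L1 photometric loss, optional masked depth supervision, and a generic auxiliary regularizer.

```python
import numpy as np

def total_loss(pred, target, mask, pred_depth=None, gt_depth=None,
               lambda_depth=0.1, lambda_reg=0.01, reg_term=0.0):
    """Masked photometric L1 loss with optional depth supervision and an
    auxiliary regularizer (e.g. an opacity or ranking penalty).

    pred, target: (H, W, 3) rendered and ground-truth images.
    mask:         (H, W) with 0 on pixels covered by tools/occluders.
    """
    n = max(mask.sum(), 1.0)                  # normalize by valid pixels
    loss = np.abs(mask[..., None] * (pred - target)).sum() / n
    if pred_depth is not None and gt_depth is not None:
        loss += lambda_depth * np.abs(mask * (pred_depth - gt_depth)).sum() / n
    return loss + lambda_reg * reg_term
```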
Training proceeds with standard optimizers (Adam), learning-rate schedules, and staged or hierarchical optimization (first training the static canonical field, then enabling the deformation modules after a burn-in period).
5. Quantitative Results and Practical Performance
State-of-the-art deformable Gaussian Splatting methods such as EH-SurGS report
- PSNR in the 39–42 dB range, SSIM around 0.97, LPIPS around 0.04,
- rendering speeds around 350 FPS (surgical scenes),
- training times on the order of 2 minutes per scene (Shan et al., 2 Jan 2025, Yang et al., 2024).
Ablation studies underline the necessity of per-Gaussian life cycles for handling irreversible dynamics (e.g., post-shearing), adaptive hierarchies for speed-up with no loss of accuracy, and error-based anchor injection for robust reconstruction in sparse/occluded regions (Shan et al., 2 Jan 2025, Jiao et al., 21 Mar 2026).
6. Applications and Extensions Across Modalities
Deformable Gaussian Splatting is applied to a diverse set of domains:
- Surgical and Endoscopic Scene Reconstruction: Robust, high-fidelity, and real-time volumetric reconstructions overcoming tissue deformation, tool occlusion, and irreversible topology changes (Shan et al., 2 Jan 2025, Chen et al., 2024, Zhu et al., 2024, Yang et al., 2024).
- Animatable Human Avatars: Embedding explicit Gaussian deformation in kinematic (LBS or SMPL-driven) templates for high-speed animatable avatars, avoiding intensive MLP radiance field queries and enabling clothed and pose-aware modeling (Qian et al., 2023, Jung et al., 2023).
- General Dynamic Scene Synthesis: Monocular and multi-view capture, with dynamic neural surfaces, anchor-driven motion, and error-guided densification for broad dynamic content (Li et al., 2024, Huang et al., 13 May 2025, Bae et al., 2024, Jiao et al., 21 Mar 2026).
- Physics-Driven Material Modeling: Unified scene and dynamics reconstruction constrained by physical laws (momentum equations) and optical flow, for physically meaningful non-rigid motion (Hong et al., 9 Nov 2025).
- Wireless Field Modeling, Deformable Object Tracking: Extension to 2D Gaussian splatting for wireless radiance fields and physics-informed tracking of deformable linear objects (e.g., ropes) in robotics (Liu et al., 15 Jun 2025, Dinkel et al., 13 May 2025).
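For the avatar setting above, the kinematic (LBS) driving can be sketched as linear blend skinning applied to each Gaussian center; the bone transforms and skinning weights are illustrative inputs, and in the cited systems the covariances are rotated alongside the centers.

```python
import numpy as np

def lbs_deform(centers, skin_weights, bone_rots, bone_trans):
    """Linear blend skinning of Gaussian centers.

    centers:      (N, 3) canonical positions.
    skin_weights: (N, B) per-Gaussian bone weights, rows summing to 1.
    bone_rots:    (B, 3, 3) per-bone rotations.
    bone_trans:   (B, 3)    per-bone translations.
    Each center moves under the weighted blend of its bones' rigid maps.
    """
    # (N, B, 3): each center transformed by every bone
    per_bone = np.einsum('bij,nj->nbi', bone_rots, centers) + bone_trans
    # blend by skinning weights -> (N, 3)
    return np.einsum('nb,nbi->ni', skin_weights, per_bone)
```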
7. Limitations and Future Research Directions
While deformable Gaussian Splatting has led to step changes in reconstruction fidelity, rendering speed, and modeling flexibility, several challenges remain:
- Spatial Smoothness: Purely per-Gaussian or group-based deformations can lead to discontinuities between adjacent Gaussians; explicit spatial regularization is an area for advancement (Jiao et al., 21 Mar 2026, Bae et al., 2024).
- Topological Change: Existing frameworks are limited by continuity within a fixed set of Gaussians; real-time detection and (re-)initialization on topological events (e.g., cutting/shearing) are open research directions (Shan et al., 2 Jan 2025).
- Physics Model Expressiveness: Physics-informed approaches are limited to continuum mechanics, lacking higher-order or plasticity modeling which may be required for complex materials (Hong et al., 9 Nov 2025).
- Sparse and Ambiguous Data Handling: Scene initialization from sparse point clouds or ambiguous monocular data remains sensitive; robust error-guided densification and hybrid depth/model priors are under exploration (Jiao et al., 21 Mar 2026, Tu et al., 9 Jun 2025).
- Model Compression and Rate–Distortion: New codecs and optimization frameworks provide 30–200× storage reduction with negligible quality loss, but further reduction and on-device real-time inference remain targets (Huang et al., 13 May 2025).
Deformable Gaussian Splatting continues to unify explicit, interpretable, and physically-plausible dynamic scene modeling with the computational advantages of real-time rasterization, expanding its reach from vision to physics-based simulation and communications (Shan et al., 2 Jan 2025, Jiao et al., 21 Mar 2026, Hong et al., 9 Nov 2025, Tu et al., 9 Jun 2025).