
Dynamic Gaussians Mesh Framework

Updated 11 November 2025
  • Dynamic Gaussians Mesh (DG-Mesh) is a framework that uses explicit Gaussian primitives to reconstruct high-fidelity, watertight 3D surface meshes with temporal consistency.
  • It leverages methods such as 3D Gaussian splatting, generalized exponential functions, and cycle-consistent deformation to ensure precise geometry and smooth transitions across frames.
  • DG-Mesh supports applications like texture editing, animation, and relightable avatars by enabling efficient rendering and robust mesh extraction.

Dynamic Gaussians Mesh (DG-Mesh) is a framework for reconstructing temporally consistent, watertight 3D surface meshes from unstructured dynamic observations by leveraging explicit Gaussian or generalized exponential primitives and advanced deformation, surface alignment, and anchoring methodologies. DG-Mesh representations enable high-fidelity geometry recovery from moving scenes, scalable and efficient rendering, and facilitate downstream applications such as texture editing, animation, and relightable avatar construction. Multiple mathematical and algorithmic variants have been introduced, including 3D Gaussian Splatting-based DG-Mesh (Liu et al., 18 Apr 2024), Dynamic 2D Gaussians (Zhang et al., 21 Sep 2024), hybrid explicit representations for avatars (Cai et al., 18 Mar 2024), and accelerated generalized exponential splatting for mesh extraction (Zhao et al., 14 Nov 2024). Below, the structural principles, methodologies, and research directions in DG-Mesh are outlined.

1. Primitive-Based Mesh Scene Representations

DG-Mesh representation frameworks use collections of explicit geometric primitives—typically 3D Gaussian ellipsoids or their generalizations—to represent geometry and radiance fields for dynamic scenes.

  • 3D Gaussian Splatting (3DGS): Each primitive is parameterized by a center $\mu \in \mathbb{R}^3$, a covariance matrix $\Sigma \in \mathbb{R}^{3 \times 3}$, a radiometric color $c \in \mathbb{R}^3$ (or low-order spherical harmonics), and an opacity $\alpha \in [0,1]$. The spatial density is

$G_i(x; \mu_i, \Sigma_i) = \exp\left(-\frac{1}{2}(x-\mu_i)^\top \Sigma_i^{-1}(x-\mu_i)\right),$

with ellipsoids rendered as 2D elliptical splats after projection to the image plane.

  • Generalized Exponential Splatting (GES): DyGASR adopts a generalized exponential function (GEF) kernel that provides an additional "shape" parameter $\epsilon$:

$L_i(x) = \exp\left(-\frac{1}{2}\left[(x-x_i)^\top \Sigma_i^{-1}(x-x_i)\right]^{\epsilon/2}\right),$

reducing the number of required primitives by increasing localization around sharp features (Zhao et al., 14 Nov 2024).

  • Dynamic 2D Gaussians: In D-2DGS, each primitive is defined in the local tangent plane of a 3D point, with two orthonormal tangent vectors and two in-plane scales, supporting surface sparsification and enforcement of local geometric consistency (Zhang et al., 21 Sep 2024).

The choice and parameterization of primitives affect how readily sharp details, fine structures, and motion can be represented, as well as rendering efficiency and downstream mesh extraction; a minimal kernel-evaluation sketch follows.
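For concreteness, the NumPy snippet below evaluates the density formulas above at a single point. It is a minimal sketch, not code from any of the cited implementations; the covariance and sample point are arbitrary toy values, and with $\epsilon = 2$ the GEF kernel reduces to the standard 3DGS Gaussian.

import numpy as np

def ges_density(x, mu, Sigma, eps=2.0):
    """Evaluate the generalized exponential kernel L_i(x) at point x.

    With eps == 2 this is the standard 3D Gaussian density G_i(x) used in 3DGS;
    larger eps sharpens the falloff around the primitive center.
    """
    d = x - mu                                   # offset from the primitive center
    m = d @ np.linalg.solve(Sigma, d)            # Mahalanobis term (x-mu)^T Sigma^{-1} (x-mu)
    return np.exp(-0.5 * m ** (eps / 2.0))

# Toy usage: an anisotropic primitive elongated along the x-axis.
Sigma = np.diag([0.04, 0.01, 0.01])
mu = np.zeros(3)
x = np.array([0.1, 0.0, 0.0])
print(ges_density(x, mu, Sigma))             # Gaussian kernel (eps = 2)
print(ges_density(x, mu, Sigma, eps=4.0))    # sharper GES kernel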

2. Deformation Models and Temporal Consistency

Capturing dynamic scenes requires modeling temporal deformation of primitives in a way that ensures mesh correspondence consistency across frames.

  • Sparse Control Points and Skinning: D-2DGS and other methods deploy sparse control points $\{p_i\}$. Time-dependent rigid updates $(R_i^t, T_i^t)$ are predicted for each control point by a lightweight MLP $\Phi(p_i, t)$. Per-primitive transformations combine these with distance-based blending to yield canonical-to-deformed mappings for positions and orientations; Linear Blend Skinning ensures smooth, plausible geometry evolution over time (Zhang et al., 21 Sep 2024).
  • Cycle-Consistent Deformation: DG-Mesh frameworks (Liu et al., 18 Apr 2024) perform both forward and backward deformations between canonical and observed frames: a deformation field $\mathcal{F}_f$ warps from canonical to observed, and an inverse field $\mathcal{F}_b$ aims to undo this operation. A cycle-consistency loss,

$\mathcal{L}_{\text{cycle}} = \sum_{G \in \mathcal{G}_c} \left\|\mathcal{F}_b\big(\text{Anchor}(\mathcal{F}_f(G, t)), t\big) - G\right\|_1,$

encourages time-consistent anchoring and mesh recovery.

  • Mesh-Guided Anchoring: After deformation, Gaussian centers are re-anchored to mesh face centroids to regularize distribution, allowing direct one-to-one correspondence between mesh vertices and primitives across frames.

This suite of interlocking deformation, anchoring, and cycle-consistency mechanisms keeps the recovered mesh sequences temporally coherent and physically plausible; a minimal sketch of the deform-anchor-cycle step appears below.
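The NumPy sketch below illustrates how these pieces could fit together for one time step: a distance-blended rigid deformation driven by sparse control points, anchoring of the deformed centers to mesh-face centroids, and the L1 cycle penalty. The deformation fields, the Gaussian distance weighting, and all function names are illustrative stand-ins rather than the cited papers' actual implementations.

import numpy as np

def blend_control_points(x, control_pts, R, T, sigma=0.1):
    """Distance-weighted blend of per-control-point rigid transforms (LBS-style).

    x:           (3,) canonical position of one primitive
    control_pts: (K, 3) sparse control points p_i
    R, T:        (K, 3, 3) rotations and (K, 3) translations predicted for time t
    The fixed Gaussian weighting (sigma) is an assumption; D-2DGS blends over
    nearby control points with learned/adaptive weights.
    """
    w = np.exp(-((control_pts - x) ** 2).sum(-1) / (2 * sigma ** 2))  # distance weights
    w = w / w.sum()
    # Weighted sum of rigidly transformed copies of x.
    return (w[:, None] * (np.einsum('kij,j->ki', R, x) + T)).sum(axis=0)

def cycle_consistency_loss(centers_canonical, t, deform_fwd, deform_bwd, face_centroids):
    """L_cycle = sum_G || F_b(Anchor(F_f(G, t)), t) - G ||_1.

    deform_fwd / deform_bwd are placeholder callables (points, t) -> points
    standing in for the learned fields F_f and F_b.
    """
    deformed = deform_fwd(centers_canonical, t)               # canonical -> observed frame

    # Anchor(): snap each deformed center to its nearest mesh-face centroid,
    # regularizing the Gaussian distribution over the surface.
    d2 = ((deformed[:, None, :] - face_centroids[None, :, :]) ** 2).sum(-1)
    anchored = face_centroids[d2.argmin(axis=1)]

    recovered = deform_bwd(anchored, t)                       # observed -> canonical frame
    return np.abs(recovered - centers_canonical).sum()        # L1 cycle penalty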

3. Surface Alignment, Regularization, and Densification

High-quality mesh extraction from splatted Gaussians or GEFs relies on enforcing spatial and normal alignment, ensuring coverage, and managing sample density.

  • Generalized Surface Regularization (GSR): In DyGASR (Zhao et al., 14 Nov 2024), a combination of two loss terms is used:
    • SDF Matching Loss,

    $\mathcal{L}_{\text{sdf}} = \frac{1}{|X|} \sum_{x \in X} \left|\bar{f}(x) - f(x)\right|,$

    aligning the SDF surrogate derived from the blended GES density with an ideal surface.
    • Normal-Alignment Loss,

    $\mathcal{L}_{\text{nor}} = \frac{1}{|X|} \sum_{x \in X} \left\| \frac{\nabla f(x)}{\|\nabla f(x)\|_2} - n_{g^*} \right\|_2^2,$

    aligning the kernel principal axes perpendicular to the surface.

  • Mesh-Guided Densification/Pruning: For each mesh face, Gaussians are either merged (if a face has multiple neighboring Gaussians) or spawned (if it has none), with parameters averaged or set by default. An anchoring loss,

$\mathcal{L}_{\text{anchor}} = \frac{1}{N'} \sum_{i=1}^{N'} \|x_i - n_{x_i}\|^2,$

is added to encourage an even distribution over the surface (Liu et al., 18 Apr 2024).

  • Sparsity and Regularization: Depth-distortion and normal-consistency losses further discourage floaters and redundant splats (Zhang et al., 21 Sep 2024).

These procedures enable robust mesh formation, allow use of Poisson and marching-cubes for extraction, and minimize floating artifacts.
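A minimal NumPy sketch of these regularizers is given below. The surrogate SDF values, their gradients, and the target normals are assumed to be precomputed inputs, so the functions only reproduce the loss formulas themselves; the names are illustrative.

import numpy as np

def gsr_losses(f_pred, f_ideal, grad_f, normals_gt):
    """Sketch of the two GSR terms over a batch of sample points X.

    f_pred:     (|X|,)   surrogate SDF values from the blended GES density
    f_ideal:    (|X|,)   target SDF values \bar{f}(x)
    grad_f:     (|X|, 3) gradients of the surrogate SDF at the samples
    normals_gt: (|X|, 3) unit target normals n_{g*}
    """
    l_sdf = np.abs(f_ideal - f_pred).mean()                        # SDF matching loss
    n_pred = grad_f / np.linalg.norm(grad_f, axis=1, keepdims=True)
    l_nor = ((n_pred - normals_gt) ** 2).sum(axis=1).mean()        # normal-alignment loss
    return l_sdf, l_nor

def anchoring_loss(centers, nearest_face_centroids):
    """L_anchor: mean squared distance from each Gaussian center x_i
    to its assigned mesh-face centroid n_{x_i}."""
    return ((centers - nearest_face_centroids) ** 2).sum(axis=1).mean()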

4. Optimization, Training Curricula, and Pseudocode Workflows

DG-Mesh pipelines incorporate carefully designed optimization strategies and data-processing curricula.

  • Photometric and Geometry Regularization: Composite loss terms typically include data fidelity (e.g., $L_1$ + SSIM), SDF or mask matching, Laplacian mesh smoothness, anchoring, and cycle penalties, with empirically assigned weights.

  • Dynamic Resolution Scheduling: DyGASR incrementally increases rendering resolution using a cosine schedule:

$\mathrm{value} = \mathrm{end} + \tfrac{1}{2}(\mathrm{start} - \mathrm{end})\left[1 + \cos\!\left(\frac{\mathrm{iter}}{\mathrm{total}}\,\pi\right)\right].$

This coarse-to-fine regime smooths optimization and accelerates convergence (Zhao et al., 14 Nov 2024); a small helper implementing the schedule appears after the workflow listing below.

  • Training Workflow Example (DyGASR):

# Initialization: seed GES kernels from the sparse SfM point cloud
initialize GES kernels from sparse SfM

for it in range(1, max_iters + 1):
    # Dynamic resolution: coarse-to-fine cosine schedule (formula above)
    scale = ...
    set rendering resolution to scale

    # Rasterize the kernels and compute the photometric loss
    C_pred = splat(kernels, camera)
    L_rgb = l1_loss(C_pred, C_gt) + ssim_loss(C_pred, C_gt)

    # GSR and surface alignment every K1 iterations after warm-up
    if it > warmup and it % K1 == 0:
        add L_sdf and L_nor surface losses to the total loss

    # Backpropagate and step
    loss.backward()
    optimizer.step()

    # Pruning every K2 iterations
    if it % K2 == 0:
        remove low-alpha kernels and recompute neighbors

# Mesh extraction via Poisson surface reconstruction

Training times for dynamic mesh recovery typically range from 1–2 hours on a single high-end GPU, with resource usage and scaling improving in recent work (Zhao et al., 14 Nov 2024).
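The cosine schedule referenced in the workflow above maps directly to a few lines of code. The helper below is a hedged sketch: the function name and the start/end resolution factors are illustrative defaults, not values taken from DyGASR.

import math

def cosine_resolution_scale(it, total_iters, start=0.25, end=1.0):
    """Coarse-to-fine resolution factor following the cosine schedule above.

    `start` and `end` are assumed example values (fractions of full resolution);
    the schedule moves from `start` at iteration 0 to `end` at `total_iters`.
    """
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * it / total_iters))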

5. Mesh Extraction Pipelines and Vertex Tracking

Mesh extraction is performed after convergence of the splatted primitive field, with downstream mesh tracking made efficient by the consistent anchoring mechanisms.

  • Surface Extraction Procedures:

    • Use Differentiable Poisson Surface Reconstruction (DPSR) or volumetric TSDF fusion (e.g., with Open3D), applied to surface samples obtained from camera-ray projections or from rendered mask/depth maps.
    • Apply standard marching-cubes or differentiable variants to extract watertight surface meshes.
  • Vertex Tracking: Consistent one-to-one mapping from mesh faces to canonical Gaussian centers is achieved through:
    • Mesh-guided anchoring at every frame;
    • Use of cycle-consistent deformation to map deformed Gaussians back to their canonical origin.

This tracking allows for temporally stable vertex indices, supporting applications such as coherent texture editing and motion analysis on dynamic surfaces (Liu et al., 18 Apr 2024).
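As a concrete illustration of the TSDF-fusion route, the sketch below integrates rendered RGB-D views of one time step into an Open3D TSDF volume and extracts a triangle mesh. The camera parameters and the structure of `views` are placeholders for whatever the splatting renderer produces; this is a generic Open3D recipe, not DG-Mesh's own extraction code.

import open3d as o3d

def fuse_rendered_views(views, width, height, fx, fy, cx, cy, voxel=0.01):
    """Fuse rendered RGB-D views of one frame into a watertight triangle mesh.

    views: iterable of (color_uint8 HxWx3, depth_float32 HxW in meters,
                        extrinsic 4x4 world-to-camera) tuples.
    """
    intrinsic = o3d.camera.PinholeCameraIntrinsic(width, height, fx, fy, cx, cy)
    volume = o3d.pipelines.integration.ScalableTSDFVolume(
        voxel_length=voxel,
        sdf_trunc=4 * voxel,
        color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8,
    )
    for color, depth, extrinsic in views:
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
            o3d.geometry.Image(color),
            o3d.geometry.Image(depth),
            depth_scale=1.0, depth_trunc=3.0, convert_rgb_to_intensity=False,
        )
        volume.integrate(rgbd, intrinsic, extrinsic)   # accumulate the TSDF
    return volume.extract_triangle_mesh()              # marching cubes inside Open3D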

6. Quantitative Evaluation and Comparative Performance

DG-Mesh methods have been evaluated extensively on synthetic, real, and benchmark datasets, compared both with neural implicit surfaces and splat-based baselines.

| Metric / Dataset | DG-Mesh / DyGASR | Strongest Baseline |
|---|---|---|
| PSNR (Mip-NeRF360 / DeepBlending, avg) | 27.57 dB (Zhao et al., 14 Nov 2024); 29.1 dB (Zhang et al., 21 Sep 2024)* | 27.28 dB (SuGaR); 27.1 dB (SCGS) |
| SSIM | 0.831 (Zhao et al., 14 Nov 2024); 2nd best ≈0.89 (avatars) | 0.813 (SuGaR); 0.879 (4DGaussians) |
| Chamfer Distance (lower is better, DG-Mesh dataset) | ≈0.85 (Zhang et al., 21 Sep 2024); 0.790 (Liu et al., 18 Apr 2024) | 0.934 (D-NeRF); 1.085 (K-Plane) |
| Training time / GPU memory | 1.25 h / 14 GB (Zhao et al., 14 Nov 2024) | 1.61 h / 21 GB (SuGaR) |

Qualitatively, DG-Mesh recovers fine geometric details, thin structures, and smooth surfaces with minimal floating artifacts or mesh noise, and maintains time-consistent meshes compatible with real-time rendering pipelines (Liu et al., 18 Apr 2024, Cai et al., 18 Mar 2024).

7. Hybrid Explicit Representations and Extensions

Recent advances incorporate hybrid explicit schemes, especially for real-time head avatars under complex deformation.

  • Hybrid Mesh + Gaussian: “GauMesh” combines a UV-mapped 3D mesh with 3DGS, employing synchronized rendering and $\alpha$-blending for photorealistic texturing of surfaces and explicit modeling of complex geometry (hair, wrinkles) (Cai et al., 18 Mar 2024). Per-pixel depth sorting merges mesh fragments and splats for accurate compositing,

$C(u) = T' \alpha' c' + \sum_{k=1}^{K} \mathbbm{1}(d', d_k, \alpha')\, T_k \alpha_k c_k,$

and two-pass GPU algorithms ensure real-time throughput at high resolutions; a generic compositing sketch appears at the end of this section.

  • Role in Avatars and Relighting: Such hybrid models facilitate the generation of digital avatars with temporally consistent mesh sequences, ultra-sharp textures for skin, and complex geometric detail, streamlining videogame, VR, and cinematic applications.

A plausible implication is that further generalizations of explicit splatting representations, combined with advanced deformation and hybridization, will continue to improve dynamic scene modeling and downstream mesh-based manipulations.
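For intuition, the merged compositing equation above can be approximated by a standard front-to-back alpha-compositing loop over depth-sorted fragments. The sketch below treats the mesh fragment as one more fragment in the sorted list, so it only approximates the indicator-based operator in the formula and is not GauMesh's two-pass GPU algorithm.

import numpy as np

def composite_pixel(mesh_frag, splats):
    """Front-to-back alpha compositing of one mesh fragment and K splats at one pixel.

    mesh_frag: (depth, color (3,), alpha) of the rasterized mesh at this pixel
    splats:    list of (depth, color (3,), alpha) for the overlapping splats
    """
    fragments = sorted(splats + [mesh_frag], key=lambda f: f[0])  # near to far
    C = np.zeros(3)
    T = 1.0                                  # accumulated transmittance
    for _, c, a in fragments:
        C += T * a * np.asarray(c, dtype=float)
        T *= (1.0 - a)
    return C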


Dynamic Gaussians Mesh representations formalize a rigorous, explicit, and temporally coherent approach to mesh sequence extraction from dynamic scenes, enabling high-fidelity, memory-efficient surface representations that bridge classical graphics pipelines with modern differentiable rendering and radiance field methodologies. Applications span real-time avatars, virtual object tracking, and physically based animation, with current research focused on increased fidelity, efficiency, and robustness (Zhao et al., 14 Nov 2024, Liu et al., 18 Apr 2024, Zhang et al., 21 Sep 2024, Cai et al., 18 Mar 2024).
