Differentiable Gaussian Surfel Mapping
- The paper presents a novel differentiable mapping technique that leverages Gaussian surfels and closed-form derivatives for precise 3D scene reconstruction.
- It utilizes anisotropic surfel parameterization and differentiable splatting to optimize geometry, appearance, and sensor pose in an end-to-end trainable pipeline.
- The framework achieves scalable performance in SLAM, relighting, and semantic mapping, outperforming prior methods in both fidelity and runtime efficiency.
Differentiable Gaussian Surfel Mapping is a class of algorithms and mathematical frameworks in 3D computer vision and graphics that represent, render, and optimize scenes using collections of surface-oriented, spatially localized Gaussian primitives (“surfels”). These methods combine the geometric and computational advantages of 2D surface elements (surfels) with the analytic differentiability enabled by parametric Gaussian kernels, yielding highly accurate, end-to-end trainable pipelines for surface reconstruction, SLAM, relighting, and multimodal scene understanding. The differentiable approach enables robust gradient-based optimization of geometry, appearance, and pose via backpropagation through the rendering process.
1. Surfel Parameterization and Scene Representation
Gaussian surfels extend classical surfel representations by employing anisotropic 2D Gaussian kernels as surface elements in 3D space. Each surfel is parameterized by a 3D center $\boldsymbol{\mu} \in \mathbb{R}^3$, a covariance matrix $\Sigma$ or set of principal axes (often rank-2, aligned with the tangent plane), an opacity or weight $\alpha$, and surface appearance coefficients (typically RGB color or spherical harmonics). Some frameworks further enrich attributes with latent appearance codes for non-Lambertian materials, semantic logits, or learnable pruning weights (Jiang et al., 26 Nov 2024, Xie et al., 14 Oct 2025, Pan et al., 1 Dec 2025).
Surfels explicitly encode local geometry: the mean $\boldsymbol{\mu}$ lies on the underlying surface, the covariance $\Sigma$ captures local tangential uncertainty and extent, and surface normals are directly inferred from the principal axes. The formulation supports both dense and sparse surfel distributions, multi-view fusion, and integration of sensor uncertainty in mapping workflows (Pan et al., 1 Dec 2025, Fan et al., 28 Jul 2025).
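The parameterization above can be sketched as a small data structure. This is a minimal illustrative example, not any cited system's actual layout: the field names and the flat-RGB appearance are assumptions, and richer variants would carry SH coefficients or latent codes instead.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GaussianSurfel:
    center: np.ndarray     # (3,) point mu on the underlying surface
    tangent_u: np.ndarray  # (3,) first principal axis, scaled by extent
    tangent_v: np.ndarray  # (3,) second principal axis, scaled by extent
    opacity: float         # scalar weight alpha in [0, 1]
    color: np.ndarray      # (3,) RGB; SH coefficients in richer variants

    def normal(self) -> np.ndarray:
        """Surface normal inferred directly from the principal axes."""
        n = np.cross(self.tangent_u, self.tangent_v)
        return n / np.linalg.norm(n)

    def covariance(self) -> np.ndarray:
        """Rank-2 3D covariance spanned by the two tangent axes."""
        T = np.stack([self.tangent_u, self.tangent_v], axis=1)  # (3, 2)
        return T @ T.T                                          # (3, 3)
```

Because the covariance is built from two tangent axes, it is rank-2 by construction, which is what makes the kernel a surface element rather than a volumetric ellipsoid.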
2. Differentiable Rendering and Splatting Algorithms
Rendering with Gaussian surfels is fundamentally a “splatting” operation: each surfel projects to the image plane as a 2D Gaussian, and its contribution is composited into pixel values via differentiable alpha blending or transmittance accumulation. Color, depth, normal, and other modalities are generated by front-to-back compositing, typically ordered by increasing surfel depth from the camera:
- Color blending example:
  $$C(p) = \sum_{i} c_i \, \alpha_i(p) \prod_{j<i} \bigl(1 - \alpha_j(p)\bigr),$$
  with $\alpha_i(p)$ the splat opacity at pixel $p$ (Fan et al., 28 Jul 2025, Pan et al., 1 Dec 2025).
- Exact opacity and self-attenuation:
Some methods replace Taylor-approximate alpha compositing with analytic transmittance computed from cumulative sums of surfel contributions, ensuring physically correct self-occlusion and better-behaved gradients for learning (Jiang et al., 26 Nov 2024).
- Depth and normal rendering:
Depth maps can be reconstructed by weighted sums or through exact analytic ray–ellipsoid intersection, with normals calculated from local geometry or finite differences of the depth field (Xie et al., 14 Oct 2025).
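The exact-intersection idea above can be illustrated with a flat-disc simplification: intersect the viewing ray with the surfel's tangent plane to obtain an analytic per-surfel depth. This is a hedged sketch, not the ray–ellipsoid intersection of any cited method; the function name and tolerance are assumptions.

```python
import numpy as np

def ray_surfel_depth(origin, direction, center, normal, eps=1e-8):
    """Ray parameter t where the ray origin + t * direction hits the
    surfel's tangent plane through `center` with unit `normal`, or
    None if the ray is (near-)parallel to the plane or hits behind."""
    denom = direction @ normal
    if abs(denom) < eps:
        return None
    t = ((center - origin) @ normal) / denom
    return t if t > 0.0 else None
```

Because the expression for t is closed-form, its derivatives with respect to the surfel center, normal, and camera pose are also closed-form, which is what the differentiable pipelines exploit.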
All core rendering steps are constructed from explicit, closed-form or easily autodifferentiable functions, enabling stable, efficient computation of gradients with respect to surfel parameters, camera pose, and appearance. Backpropagation flows through the soft assignments produced by the Gaussian kernels, allowing seamless integration into deep learning pipelines.
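The front-to-back compositing described above can be sketched for a single pixel as follows. This is an illustrative numpy version, not any paper's CUDA kernel; the transmittance is computed exactly via a cumulative product, mirroring the analytic-transmittance formulation.

```python
import numpy as np

def composite(colors, depths, alphas):
    """Front-to-back compositing of N splats at one pixel.
    colors: (N, 3), depths: (N,), alphas: (N,) in [0, 1]."""
    order = np.argsort(depths)  # sort by increasing depth from the camera
    colors, depths, alphas = colors[order], depths[order], alphas[order]
    # Transmittance in front of each splat: T_i = prod_{j<i} (1 - alpha_j)
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = alphas * trans                 # per-splat blend weights
    pixel_color = (weights[:, None] * colors).sum(axis=0)
    w = weights.sum()
    pixel_depth = (weights * depths).sum() / max(w, 1e-8)  # weighted depth
    return pixel_color, pixel_depth
```

Every operation here (sort apart) is a smooth function of the splat parameters, so gradients with respect to colors, opacities, and depths flow through the same expressions in reverse.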
3. Geometry Optimization and Multi-View Fusion
Differentiable Gaussian surfel mapping frameworks jointly optimize surfel parameters, camera pose trajectories, and appearance by minimizing data-fidelity and geometric losses over observed image streams:
- Primary losses:
  - Photometric: $\mathcal{L}_{\mathrm{rgb}} = \lVert I - \hat{I} \rVert_1$ or SSIM-based.
  - Depth: $\mathcal{L}_{\mathrm{d}} = \lVert D - \hat{D} \rVert_1$.
  - Normal: $\mathcal{L}_{\mathrm{n}} = 1 - \mathbf{n}^{\top} \hat{\mathbf{n}}$.
  - Semantic (if available): $\mathcal{L}_{\mathrm{sem}}$ as cross-entropy.
  - Regularizers on surfel attributes or explicit geometric priors (Jiang et al., 26 Nov 2024, Pan et al., 1 Dec 2025, Xie et al., 14 Oct 2025).
- Sensor fusion and uncertainty modeling:
Probabilistic approaches maintain state vectors and associated information matrices, updated via an information filter given RGB-D observations and explicit sensor noise models. The recursive update equations integrate multi-view consistency and sensor uncertainty directly into the surfel parameter estimation (Pan et al., 1 Dec 2025).
- Surfel management:
Algorithms include surfel birth (spawning new surfels in regions of high data residual or transparency), death (removal of redundant or error-prone surfels), densification schedules, and attribute pruning based on learned utility scores (Jiang et al., 26 Nov 2024, Xie et al., 14 Oct 2025, Pan et al., 1 Dec 2025).
- Latent representations:
For improved appearance and specular reflection modeling, surfels may carry a view-conditioned latent code, with color predicted by an MLP given both light and reflection directions, optionally encoded in spherical harmonics (Jiang et al., 26 Nov 2024, Jiang et al., 23 Sep 2025).
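The combined loss over rendered and observed images, depths, and normals can be sketched as below. The L1 photometric/depth terms and cosine normal term are standard choices; the weights and exact norms vary across the cited methods, so treat this as an assumption-laden illustration.

```python
import numpy as np

def mapping_loss(I, I_hat, D, D_hat, N, N_hat, w_d=0.5, w_n=0.1):
    """I, I_hat: (H, W, 3) images; D, D_hat: (H, W) depth maps;
    N, N_hat: (H, W, 3) unit normals. Weights w_d, w_n are illustrative."""
    l_rgb = np.abs(I - I_hat).mean()                       # L1 photometric
    l_depth = np.abs(D - D_hat).mean()                     # L1 depth
    l_normal = (1.0 - (N * N_hat).sum(axis=-1)).mean()     # 1 - cos(n, n_hat)
    return l_rgb + w_d * l_depth + w_n * l_normal
```

In a real system the same scalar would be computed on rendered tensors from an autodiff framework so the gradient reaches the surfel parameters and camera pose.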
4. Applications: SLAM, 3D Reconstruction, and Multimodal Rendering
Differentiable Gaussian surfel mapping underpins a wide spectrum of geometry-centric tasks:
- SLAM and camera tracking:
Surfel-splatting SLAM systems (Fan et al., 28 Jul 2025) and EGG-Fusion (Pan et al., 1 Dec 2025) use differentiable surfel splatting for accurate geometric maps and robust camera tracking. Analytical SE(3) Jacobians, including radial "toward-the-center" terms, enable effective gradient descent for pose and map estimation.
- Surface reconstruction:
Fast, dense surface recovery with high geometric fidelity is achieved by optimizing surfel clouds to fit photometric and geometric cues, with methods reporting mean surface errors on the order of 0.6 cm on standard benchmarks and >20% relative accuracy improvements over prior Gaussian SLAM baselines (Pan et al., 1 Dec 2025).
- Monocular and implicit surface modeling:
Methods such as MonoGSDF combine explicit surfel fields with neural signed distance fields (SDF), enabling watertight surface extraction (e.g., via Marching Cubes) and improved reconstruction from monocular image streams. Differentiable links between surfel locations and SDF values ensure end-to-end trainability of hybrid explicit–implicit pipeline components (Li et al., 25 Nov 2024).
- Physically-based relighting and inverse rendering:
Frameworks incorporating radiosity-based global illumination use surfels as semi-opaque, SH-parameterized primitives. The forward light transport, including indirect lighting and non-Lambertian effects, is solved efficiently in SH coefficient space with analytic gradients for all surfel and material parameters, providing accurate geometry and relighting at interactive rates (Jiang et al., 23 Sep 2025).
- Multimodal rendering and semantic mapping:
Pipelines such as UniGS implement unified, CUDA-accelerated differentiable surfel mapping with support for simultaneous photo-realistic RGB, depth, normal, and semantic rendering, all with analytic gradients and learnable, differentiable attribute pruning for computational efficiency (Xie et al., 14 Oct 2025).
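To make the explicit–implicit coupling concrete, a toy signed-distance query against a surfel cloud can be written as a nearest-surfel point-to-plane distance. This is only an illustration of how explicit surfels can supervise an SDF; MonoGSDF's actual coupling uses a neural SDF, which this function does not model.

```python
import numpy as np

def surfel_sdf(query, centers, normals):
    """Signed distance from `query` (3,) to the tangent plane of the
    nearest surfel. centers: (N, 3); normals: (N, 3), unit length.
    Positive on the side the normal points toward."""
    i = np.argmin(np.linalg.norm(centers - query, axis=1))
    return (query - centers[i]) @ normals[i]
```

Values of such a query evaluated on a grid could then be meshed with Marching Cubes, or used as targets when distilling surfel geometry into an implicit field.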
5. Implementation Advances and Scalability
Modern differentiable Gaussian surfel mapping systems employ several computational strategies to ensure scalability and real-time performance:
- Tile-based, fully parallel GPU rasterization supports efficient forward and backward passes through billions of pixels per second, with per-splat and per-tile scheduling (Xie et al., 14 Oct 2025, Pan et al., 1 Dec 2025).
- Analytic and closed-form derivatives for all transformations—including ray–surface intersection, ellipsoid projection, and SH light transport—eliminate the need for slow or memory-intensive numerical differentiation or autograd tape (Jiang et al., 26 Nov 2024, Jiang et al., 23 Sep 2025).
- Adaptive surface rendering strategies, blending between composite and dominant-surface assignments, reduce artifacts from depth uncertainty and overlapping surfels in under-constrained settings (Fan et al., 28 Jul 2025).
- Hierarchical and learnable pruning techniques limit memory and latency penalties for redundant surfels, subject to thresholded utility metrics or explicit gradient factors (Xie et al., 14 Oct 2025).
- Information-filter state management and batch-wise optimization facilitate real-time operation at frame rates exceeding 24 FPS for full SLAM+mapping loops on contemporary GPUs (Pan et al., 1 Dec 2025).
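The information-filter state management mentioned above follows the generic recursive form $\Lambda' = \Lambda + H^{\top} R^{-1} H$, $\boldsymbol{\eta}' = \boldsymbol{\eta} + H^{\top} R^{-1} \mathbf{z}$. The sketch below shows that generic update, not the cited systems' exact state layout or noise models, which differ.

```python
import numpy as np

def info_filter_update(Lmbda, eta, H, R, z):
    """One information-filter measurement update.
    Lmbda: (d, d) information matrix; eta: (d,) information vector;
    H: (m, d) measurement Jacobian; R: (m, m) noise covariance; z: (m,).
    The state estimate is recovered as solve(Lmbda, eta)."""
    Rinv = np.linalg.inv(R)
    Lmbda_new = Lmbda + H.T @ Rinv @ H   # accumulate measurement information
    eta_new = eta + H.T @ Rinv @ z       # accumulate weighted observations
    return Lmbda_new, eta_new
```

Additive updates in information form are what make fusing many views cheap: each observation simply adds its (uncertainty-weighted) contribution to the surfel state.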
6. Comparative Performance and Quantitative Evaluation
Differentiable Gaussian surfel mapping achieves state-of-the-art or superior performance across various 3D vision benchmarks:
| Method | Surface Error (e.g., Replica/ScanNet++) | SLAM ATE RMSE | RGB Novel-View Metrics | Runtime |
|---|---|---|---|---|
| EGG-Fusion (Pan et al., 1 Dec 2025) | 0.6 cm | 0.17 cm | PSNR 25.7, SSIM 0.907 | 24 FPS (RTX 4090) |
| Surfel splatting SLAM (Fan et al., 28 Jul 2025) | — | — | — | real-time |
| Geometry Field GS (Jiang et al., 26 Nov 2024) | Chamfer 0.60 mm (DTU) | — | — | 10 min training |
| UniGS (Xie et al., 14 Oct 2025) | — | — | — | 170–200 FPS |
Significant improvements are demonstrated in geometric accuracy (e.g., >20% over previous GS-SLAM), robustness on specular and multi-view datasets, and runtime scalability. Handling of reflective/specular surfaces and dense multimodal outputs is notably advanced by latent-based and spherical harmonics-encoded parameters (Jiang et al., 26 Nov 2024, Jiang et al., 23 Sep 2025).
7. Extensions: Specular Surfaces, Global Illumination, and Hybrid Representations
Recent research directions include:
- Latent and SH-based appearance modeling for specular, non-Lambertian surfaces via MLPs conditioned on light and reflection directions (Jiang et al., 26 Nov 2024, Jiang et al., 23 Sep 2025).
- Differentiable radiosity and global illumination using surfel-based, SH-coefficient light transport—enabling fast, view-independent relighting, physically plausible indirect lighting, and high geometric fidelity (Jiang et al., 23 Sep 2025).
- Hybrid explicit–implicit pipelines, where explicit Gaussian surfels guide or inform implicit neural SDFs, offering mesh extraction and multi-resolution training for large or unbounded scenes (Li et al., 25 Nov 2024).
A plausible implication is that differentiable Gaussian surfel mapping provides a unifying abstraction bridging physically-based inverse graphics, learning-based reconstruction, and real-time mapping in robotics and computer vision.
References:
- (Jiang et al., 26 Nov 2024) "Geometry Field Splatting with Gaussian Surfels"
- (Pan et al., 1 Dec 2025) "EGG-Fusion: Efficient 3D Reconstruction with Geometry-aware Gaussian Surfel on the Fly"
- (Fan et al., 28 Jul 2025) "LAM: Surfel Splatting SLAM for Geometrically Accurate Tracking and Mapping"
- (Jiang et al., 23 Sep 2025) "Differentiable Light Transport with Gaussian Surfels via Adapted Radiosity..."
- (Xie et al., 14 Oct 2025) "UniGS: Unified Geometry-Aware Gaussian Splatting for Multimodal Rendering"
- (Li et al., 25 Nov 2024) "MonoGSDF: Exploring Monocular Geometric Cues for Gaussian Splatting-Guided Implicit Surface Reconstruction"