
Geometry-Aware Gaussian Surfel Fusion

Updated 4 December 2025
  • The paper introduces geometry-aware Gaussian surfel fusion, merging anisotropic 2D Gaussian surfels with probabilistic multi-view updates for photorealistic mapping and robust pose tracking.
  • It employs a detailed 2D surfel representation with explicit geometric regularization and uncertainty modeling, enabling adaptive rendering and precise multiview fusion.
  • The approach outperforms traditional point-based and voxel-based methods, achieving millimeter-level accuracy and high FPS performance in rigorous experimental benchmarks.

Geometry-aware Gaussian surfel fusion is a class of methodologies that combine anisotropic Gaussian primitives flattened into surfels—planar disk-like patches aligned with local surface geometry—with multi-view, probabilistic, and differentiable fusion rules. This paradigm achieves photorealistic real-time mapping, robust pose tracking, and highly precise surface reconstruction. The current state-of-the-art approaches leverage both RGB-D and LiDAR sensors and employ learnable “2D Gaussian surfels,” adaptive rendering schemes, uncertainty-aware fusion, and explicit geometric regularization to address fundamental limitations of earlier point-based, voxel-based, and 3D Gaussian splatting schemes.

1. Mathematical Representation and Surfel Parameterization

Geometry-aware Gaussian surfel fusion adopts a surfel representation that is embedded within a 2D tangent plane but retains 3D geometric and appearance information. A surfel is characterized by:

  • Center position $p_k \in \mathbb{R}^3$
  • Two principal tangent directions $t_{u_k}, t_{v_k} \in \mathbb{R}^3$
  • Scales $s_{u_k}, s_{v_k} > 0$ along these axes
  • Normal $t_{w_k} = t_{u_k} \times t_{v_k}$; the rotation matrix $R_k = [t_{u_k}, t_{v_k}, t_{w_k}]$
  • Opacity (alpha) $\alpha_k$
  • Appearance coefficients $c_k$

The spatial probability distribution is modeled by a 2D Gaussian within the surfel plane, yielding a covariance $\Sigma_k = \mathrm{diag}(s_{u_k}^2, s_{v_k}^2)$, where a point on the surfel is $P_k(u,v) = p_k + s_{u_k} t_{u_k} u + s_{v_k} t_{v_k} v$, with $(u,v) \sim \mathcal{N}(0, I_2)$. Rendering proceeds via front-to-back alpha compositing with kernel $G_k(u,v) = \exp\left[-\frac{1}{2}(u^2 + v^2)\right]$ and per-surfel weights

$$\omega_k(x) = \alpha_k G_k(u,v) \prod_{j<k}\left[1 - \alpha_j G_j(u,v)\right]$$

where the color, depth, and normal at pixel $x$ are

$$C(x) = \sum_k \omega_k(x)\, c_k, \qquad D(x) = \frac{\sum_k \omega_k(x)\, z_k}{\sum_k \omega_k(x)}, \qquad N(x) = \sum_k \omega_k(x)\, t_{w_k}$$

as compiled in S³LAM (Fan et al., 28 Jul 2025), GauS-SLAM (Su et al., 3 May 2025), and EGG-Fusion (Pan et al., 1 Dec 2025). Flattening the third axis yields a pure disk surfel (cf. Dai et al., 2024), effectively aligning the representation with the local surface.
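The compositing rule above can be made concrete for a single pixel ray. The following NumPy fragment is an illustrative sketch, assuming depth-sorted surfels with kernel values $G_k$ already evaluated at the pixel; function and argument names are placeholders, not any system's actual renderer.

```python
import numpy as np

def composite_pixel(alphas, gauss, depths, colors, normals):
    """Front-to-back alpha compositing of depth-sorted surfels at one pixel.

    alphas:  (K,)   per-surfel opacity alpha_k
    gauss:   (K,)   Gaussian kernel value G_k(u, v) at this pixel
    depths:  (K,)   per-surfel depth z_k along the ray
    colors:  (K, 3) appearance c_k
    normals: (K, 3) surfel normals t_{w_k}
    """
    a = alphas * gauss                                         # alpha_k * G_k
    trans = np.concatenate(([1.0], np.cumprod(1.0 - a)[:-1]))  # prod_{j<k}(1 - a_j)
    w = a * trans                                              # omega_k(x)
    color = (w[:, None] * colors).sum(axis=0)                  # C(x)
    depth = (w * depths).sum() / max(w.sum(), 1e-12)           # normalized D(x)
    normal = (w[:, None] * normals).sum(axis=0)                # N(x)
    return color, depth, normal, w
```

Note that depth is normalized by the accumulated weight, matching the $D(x)$ expression, while color and normal use the raw weighted sums.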

2. Surfel Fusion, Uncertainty Modeling, and Optimization

Fusion of geometry and appearance evidence from multiple views is accomplished via adaptive, probabilistic update rules operating on all surfel parameters. The core optimization objective typically aggregates photometric ($L_{\mathrm{rgb}}$), depth, and normal-consistency losses, sometimes with explicit geometric or statistical regularization:

$$L_\mathrm{map} = \|C_t - \bar{C}_t\|_1 + \gamma_D \|D_t - \bar{D}_t\|_1 + \gamma_N \sum_x \left[1 - N_t(x)\cdot\bar{N}_t(x)\right]$$

Parameters $(p_k, t_{u_k}, t_{v_k}, s_{u_k}, s_{v_k}, c_k, \alpha_k)$ are updated by gradient descent, fusing new RGB-D or LiDAR evidence. Redundant or unobserved surfels are pruned according to alpha coverage and error thresholds.
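The mapping objective can be sketched directly from its definition. The fragment below is a minimal NumPy illustration; the weights `gamma_d` and `gamma_n` are illustrative defaults, not the values used in the cited systems.

```python
import numpy as np

def mapping_loss(C, C_gt, D, D_gt, N, N_gt, gamma_d=0.5, gamma_n=0.1):
    """L_map = |C - C_gt|_1 + gamma_D |D - D_gt|_1 + gamma_N sum(1 - N . N_gt).

    C: (H, W, 3) rendered color, D: (H, W) rendered depth,
    N: (H, W, 3) rendered unit normals; *_gt are the observed targets.
    Weights gamma_d, gamma_n are illustrative, not the papers' values.
    """
    l_rgb = np.abs(C - C_gt).sum()                       # photometric L1
    l_depth = np.abs(D - D_gt).sum()                     # depth L1
    l_normal = (1.0 - (N * N_gt).sum(axis=-1)).sum()     # normal consistency
    return l_rgb + gamma_d * l_depth + gamma_n * l_normal
```

In practice this scalar would be backpropagated through the differentiable rasterizer to update all surfel parameters.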

Uncertainty is handled by per-surfel covariance tracking, e.g., using information filters (Pan et al., 1 Dec 2025) that update the mean and covariance of each surfel's state $x_i = [p_i;\, n_i]$ in information form $(\Lambda, \eta)$ via

$$\Lambda^t = \Lambda^{t-1} + H^\top \Lambda_z H, \qquad \eta^t = \eta^{t-1} + H^\top \Lambda_z \left(z^t - \bar{z}^t\right)$$

Surfel fusion also extends to pose-graph approaches, where surfel-to-surfel Mahalanobis constraints align patches across keyframes, driving global consistency to sub-pixel levels (Park et al., 31 Jul 2025).
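In information form the fusion step is a two-line update. The sketch below is a minimal NumPy illustration assuming a 6-DoF state $[p;\, n]$ and a generic measurement Jacobian $H$; both are placeholders for exposition, not the exact measurement model of any cited system.

```python
import numpy as np

def info_filter_update(Lam, eta, H, Lam_z, residual):
    """One information-form fusion step for a surfel state x = [p; n].

    Lam:      (6, 6) information matrix (inverse covariance)
    eta:      (6,)   information vector
    H:        (m, 6) measurement Jacobian (placeholder model)
    Lam_z:    (m, m) measurement information
    residual: (m,)   innovation z^t - z_bar^t
    """
    Lam_new = Lam + H.T @ Lam_z @ H          # information matrix update
    eta_new = eta + H.T @ Lam_z @ residual   # information vector update
    return Lam_new, eta_new
```

The fused mean is recovered on demand as the solution of $\Lambda x = \eta$, which is why the information form makes repeated multi-view fusion cheap: each new observation is a rank-limited additive update.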

3. Adaptive Surface Rendering and Multi-View Consistency

Adaptive rendering strategies address ambiguous or noisy regions, sharpening edges and increasing multi-view consistency. For example, S³LAM computes a depth-distortion measure

$$\mathcal{D}_d(x) = \sum_{i,j} \omega_i(x)\, \omega_j(x)\, |z_i - z_j|$$

Exceeding a threshold triggers selection of a dominant surfel $k^* = \arg\max_k \omega_k(x)$ for color and geometry (Fan et al., 28 Jul 2025, Su et al., 3 May 2025). Edge-aware depth blending, such as surface-aware depth adjustment in GauS-SLAM,

$$d_i' = \beta_i d_i + (1-\beta_i) d_m, \qquad \beta_i = \exp\left[-\frac{(d_i - d_m)^2}{B\, \sigma_i^2}\right]$$

suppresses occluded-surfel bias, significantly improving geometry quality under novel viewpoints.
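The distortion test and dominant-surfel fallback can be sketched per pixel. The following NumPy fragment is illustrative only; the threshold `tau` and the fallback policy are assumptions for the example, not the exact rule in S³LAM or GauS-SLAM.

```python
import numpy as np

def depth_distortion(w, z):
    """D_d(x) = sum_{i,j} w_i w_j |z_i - z_j| over the surfels at one pixel."""
    return (w[:, None] * w[None, :] * np.abs(z[:, None] - z[None, :])).sum()

def adaptive_select(w, z, colors, tau):
    """Blend normally, but fall back to the dominant surfel k* when the
    depth distortion exceeds the threshold tau (placeholder policy)."""
    if depth_distortion(w, z) > tau:
        k = int(np.argmax(w))                       # k* = argmax_k omega_k(x)
        return z[k], colors[k]
    zs = (w * z).sum() / max(w.sum(), 1e-12)        # normalized blended depth
    cs = (w[:, None] * colors).sum(axis=0)          # blended color
    return zs, cs
```

Intuitively, a large $\mathcal{D}_d(x)$ means the pixel mixes surfels at very different depths (an edge or occlusion boundary), where blending would smear geometry.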

Multi-view fusion is reinforced by geometric regularization, monocular normal priors (from foundation models), and normal-depth consistency losses. Incorporation of strong monocular normal priors corrects ambiguous regions and stabilizes surfel alignment (Dai et al., 2024, Shen et al., 2024, Yang et al., 20 Aug 2025).

4. Advanced Fusion on Lie Groups and Covariance Control

When fusing pose and orientation uncertainties, Gaussian distributions on the Lie groups SE(3) and SO(3) are mapped into a common tangent space, using parallel transport and curvature corrections for optimal covariance adjustment (Ge et al., 2024). For surfel fusion in pose-graph SLAM, covariance transfer between reference frames leverages the Jacobian of the exponential map:

$$\Sigma_2 = J(\mu_2)^{-1} J(\mu_1)\, \Sigma_1\, J(\mu_1)^\top J(\mu_2)^{-\top}$$

Efficient approximations (parallel transport, curvature corrections) realize near-optimal accuracy with low computational overhead, enabling real-time fusion of position and orientation uncertainties for large surfel sets.
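The transfer rule can be instantiated on SO(3). The sketch below uses the standard left Jacobian of the SO(3) exponential map; this choice of Jacobian convention is an assumption for illustration and may differ from the cited works.

```python
import numpy as np

def hat(w):
    """so(3) hat operator: maps a 3-vector to a skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def left_jacobian(w):
    """Left Jacobian of the SO(3) exponential map at tangent vector w."""
    th = np.linalg.norm(w)
    W = hat(w)
    if th < 1e-8:                                   # small-angle expansion
        return np.eye(3) + 0.5 * W
    return (np.eye(3)
            + (1.0 - np.cos(th)) / th**2 * W
            + (th - np.sin(th)) / th**3 * W @ W)

def transfer_covariance(Sigma1, mu1, mu2):
    """Sigma_2 = J(mu2)^{-1} J(mu1) Sigma_1 J(mu1)^T J(mu2)^{-T}."""
    J1, J2inv = left_jacobian(mu1), np.linalg.inv(left_jacobian(mu2))
    return J2inv @ J1 @ Sigma1 @ J1.T @ J2inv.T
```

As a sanity check, transferring a covariance between identical reference points ($\mu_1 = \mu_2$) leaves it unchanged.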

Covariance control is further enforced through scale bounding with a sigmoid constraint,

$$\sigma_\mathrm{bounded} = \sigma_\mathrm{min} + (\sigma_\mathrm{max} - \sigma_\mathrm{min})\, \mathrm{sigmoid}(s)$$

preventing unconstrained Gaussian growth and yielding compact, crisp representations (Park et al., 31 Jul 2025).
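The bounding itself is a one-liner. In this sketch the bounds `s_min` and `s_max` are arbitrary illustrative values, not the settings used in GSFusion.

```python
import numpy as np

def bounded_scale(s, s_min=1e-3, s_max=0.05):
    """Map an unconstrained optimizer variable s to a scale in (s_min, s_max)
    via sigma = sigma_min + (sigma_max - sigma_min) * sigmoid(s).
    Bounds here are illustrative placeholders."""
    return s_min + (s_max - s_min) / (1.0 + np.exp(-s))
```

Because the sigmoid saturates, gradient descent can never push a surfel's scale outside the interval, which is what prevents degenerate, scene-spanning Gaussians.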

5. Geometry-Aware Fusion in SLAM and Surface Reconstruction

State-of-the-art SLAM systems (S³LAM (Fan et al., 28 Jul 2025), GauS-SLAM (Su et al., 3 May 2025), EGG-Fusion (Pan et al., 1 Dec 2025), GSFusion (Park et al., 31 Jul 2025)) instantiate this fusion pipeline at scale for camera and LiDAR/IMU inputs. Incremental attachment, periodic surfel initialization, local–global map architectures, and fusion-aware bundle adjustment integrate RGB-D and LiDAR evidence into surfel maps. The surfel-centric approach supports sparse-to-dense real-time mapping (24 FPS in EGG-Fusion), robust tracking under severe occlusion, and millimeter-level geometric and pose accuracy.

Comparative results demonstrate that geometry-aware surfel fusion outperforms prior 3D Gaussian Splatting and neural volumetric schemes in surface completeness, normal alignment, tracking robustness, and memory efficiency. Quantitative benchmarks include Replica, ScanNet++, DTU, and Tanks-and-Temples with metrics such as Chamfer distance, normal consistency, PSNR, SSIM, and LPIPS.
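For reference, the Chamfer distance used in these benchmarks can be computed brute-force for small point sets. The sketch below implements one common symmetric variant (sum of mean nearest-neighbor distances in each direction); benchmark suites may use squared distances or other normalizations.

```python
import numpy as np

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance between point sets P (N, 3) and Q (M, 3):
    mean nearest-neighbor distance in each direction, brute force.
    One common variant; benchmarks may differ in normalization."""
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)  # (N, M) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

Real evaluations use KD-trees or GPU nearest-neighbor search instead of the $O(NM)$ pairwise matrix, but the metric is the same.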

6. Extensions: Radiance Field Rendering and Hybrid Architectures

Hybrid bi-scale architectures, such as Gaussian-enhanced Surfels (GES) (Ye et al., 24 Apr 2025), combine opaque 2D surfel layers for coarse geometry with sparse 3D Gaussians for high-frequency appearance. This approach enables sorting-free, ultra-fast rendering (675–1135 FPS) and modular extensions such as anti-aliasing (Mip-GES), storage compaction (Compact-GES), and improved geometry via 2D-GES. Sorting-free blending yields view-consistent images and suppresses “popping” artifacts, while surfel/Gaussian aggregation enables flexible surface smoothing.

Advanced inverse rendering methods further exploit surfel-based representations for material decomposition and photorealistic relighting, using physics-based shading (split-sum approximation), Monte Carlo sampling, and high-frequency specular compensation (Yang et al., 20 Aug 2025).

7. Practical Impact and Experimental Results

Recent systems demonstrate millimeter-level surface reconstruction accuracy, robust geometric tracking, and real-time end-to-end operation. The table below summarizes key quantitative results from Replica, ScanNet++, and DTU (Pan et al., 1 Dec 2025, Fan et al., 28 Jul 2025, Su et al., 3 May 2025):

Method     | Acc (cm, Replica) | Comp (cm, ScanNet++) | Acc (mm, DTU) | FPS  | PSNR (dB) | Storage (MB)
-----------|-------------------|----------------------|---------------|------|-----------|-------------
EGG-Fusion | 0.60              | 0.91                 |               | 24   | 25.70     |
RTG-SLAM   | 0.80              | 1.22                 |               | 15   | 24.77     |
S³LAM      | 0.47              |                      |               | 8    |           |
3DGS       |                   |                      | 1.97          | 675  | 27.38     | 734
2D-GES     |                   |                      | 0.79          |      |           |
GES        |                   |                      |               | 1135 | 27.42     | 185

Qualitative results show sharp edge recovery, minimal color/depth artifacts, smooth surface meshes, and persistent tracking under severe occlusions, with surfel-based SLAM and rendering retaining geometric fidelity and visual consistency across difficult scenarios.

