Triangle Splatting SLAM

Updated 7 June 2026

The paper introduces a dynamic 'triangle soup' representation that uses differentiable rendering for efficient and photorealistic RGB-D SLAM.
It employs restricted Delaunay triangulation to extract an editable, connected mesh on-the-fly, supporting live deformation and collision checking.
Experiments on TUM-RGBD and Replica show camera tracking on par with MonoGS-2D and lower Chamfer and depth errors than the compared baseline.

Triangle Splatting SLAM is a dense RGB-D simultaneous localization and mapping (SLAM) system that leverages differentiable triangles as explicit 3D map primitives. Its core innovation is the use of a dynamic “triangle soup” representation optimized online, providing both photorealistic rendering and explicit geometry amenable to downstream tasks such as simulation and mesh editing. Triangle Splatting SLAM employs online differentiable rendering of this triangle soup for both camera tracking and map optimization, and can extract a connected mesh on-the-fly via restricted Delaunay triangulation, supporting live mesh deformation and collision checking. Experimental results demonstrate state-of-the-art geometry accuracy and competitive camera-tracking performance on standard benchmarks (Fry et al., 29 May 2026).

1. System Pipeline and Map Representation

The system maintains a live triangle soup map $M=(V, F)$, where $V$ is the set of 3D vertices and $F$ the triangle connectivity. The pipeline operates in a single process, executing three tightly interleaved stages per RGB-D frame:

Tracking: Estimation of the 6-DOF camera pose $T_{CW}$ via minimization of a tracking energy.
Keyframing: Keyframe selection based on pose change and triangle visibility; addition of keyframes with new triangles back-projected from depth.
Mapping: Joint optimization of past keyframe poses and triangle parameters; densification and pruning of triangles; optional mesh extraction.

The pipeline is end-to-end and single-threaded: each incoming frame is tracked, tested as a keyframe candidate, and, if selected, passed to mapping before the next frame is processed.
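The per-frame control flow described above can be sketched in Python. This is a minimal illustration, not the paper's pseudocode: `TriangleSoupMap`, `run_slam`, and the four callables (`track`, `is_keyframe`, `add_keyframe`, `optimize_map`) are hypothetical names introduced here, and the actual stages are far more involved.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TriangleSoupMap:
    """Hypothetical container for the map M = (V, F) plus stored keyframes."""
    vertices: List[tuple] = field(default_factory=list)              # V
    faces: List[Tuple[int, int, int]] = field(default_factory=list)  # F
    keyframes: List[tuple] = field(default_factory=list)             # (index, pose)


def run_slam(frames, init_pose, track, is_keyframe, add_keyframe, optimize_map):
    """Process frames one at a time in a single thread.

    All four callables are placeholders (assumptions) for the real stages.
    """
    soup = TriangleSoupMap()
    pose = init_pose
    trajectory = []
    for idx, frame in enumerate(frames):
        # 1. Tracking: refine the camera pose against the current map.
        pose = track(soup, frame, pose)
        # 2. Keyframing: decide whether this frame extends the map.
        if is_keyframe(soup, frame, pose, idx):
            add_keyframe(soup, frame, pose, idx)
            # 3. Mapping: runs whenever a new keyframe has been added.
            optimize_map(soup)
        trajectory.append(pose)
    return soup, trajectory
```

Passing the stages in as callables keeps the loop itself trivial to test with stubs, which is the only purpose of this layout.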

The mapping stage periodically extracts a mesh using restricted Delaunay triangulation, converting the “soup” into a connected surface suitable for simulation or editing.

2. Differentiable Triangle Splatting

Each triangle $F_m = (i,j,k)$ indexes three world-space vertices $v_i = (x_i, y_i, z_i, c_i, o_i)$, where $(x_i, y_i, z_i)$ is the position, $c_i$ the color, and $o_i$ the opacity. Differentiable triangle splatting comprises:

Projection into image space: $v_I = \pi(T_{CW}v_W)$.
Signed-Distance Field: The image-space signed distance function $\phi(p)$ is computed relative to the projected triangle edges, with a smooth per-pixel coverage function $I(p)$ defined relative to $s$, where $s$ is the triangle incentre.
Alpha-composite Rendering: Pixel color $C(p)$ is rendered by alpha compositing over triangles,

$C(p) = \sum_{m} c_m \, \alpha_m(p) \prod_{n<m} \bigl(1 - \alpha_n(p)\bigr)$

where $c_m$ and $\alpha_m(p)$ are the color and alpha of the $m$-th triangle in compositing order.

Photometric Loss: The photometric error is accumulated over all pixels by comparing the rendered color $C(p)$ with the observed image.

Backpropagation: Gradients are computed for vertex positions and appearance, leveraging analytic derivatives (Eq. 4 and window function Eq. 3), the pinhole model, and pose Jacobians on $SE(3)$ (Eq. 7).
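The projection and coverage steps can be illustrated with a short Python sketch. The window form used here, $\max(0, \phi(p)/\phi(s))^{\sigma}$ with $\phi$ the largest signed edge distance (negative inside), is an assumption borrowed from earlier triangle-splatting formulations and is not confirmed by this summary; the function names and the sharpness parameter `sigma` are likewise hypothetical.

```python
import math


def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point (x, y, z) to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)


def phi(p, tri):
    """Largest signed distance from p to the three edge lines (negative inside)."""
    (ax, ay), (bx, by), (qx, qy) = tri
    # Orientation sign makes the result independent of vertex winding.
    orient = 1.0 if (bx - ax) * (qy - ay) - (by - ay) * (qx - ax) > 0 else -1.0
    worst = -math.inf
    for (x0, y0), (x1, y1) in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
        ex, ey = x1 - x0, y1 - y0
        cross = ex * (p[1] - y0) - ey * (p[0] - x0)
        worst = max(worst, -orient * cross / math.hypot(ex, ey))
    return worst


def incentre(tri):
    """Incentre: vertices weighted by the length of the opposite side."""
    a = math.dist(tri[1], tri[2])
    b = math.dist(tri[2], tri[0])
    c = math.dist(tri[0], tri[1])
    total = a + b + c
    return tuple((a * tri[0][k] + b * tri[1][k] + c * tri[2][k]) / total
                 for k in range(2))


def coverage(p, tri, sigma=1.0):
    """Assumed window: 1 at the incentre, falling to 0 on and outside the edges."""
    ratio = phi(p, tri) / phi(incentre(tri), tri)
    return max(0.0, ratio) ** sigma
```

For the 3-4-5 triangle with vertices (0, 0), (4, 0), (0, 3), the incentre is (1, 1) and the inradius is 1, so `coverage` is 1 at (1, 1) and 0.5 at (0.5, 0.5) when `sigma` is 1.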

This differentiable pipeline enables gradient-based optimization of geometry and color parameters directly from image and depth supervision.
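To make the rendering and loss steps concrete, the following Python sketch composites a list of fragments at one pixel front to back and evaluates a mean absolute color error. It assumes the fragments are already sorted in compositing order and uses an L1 error purely for illustration; the summary does not state which norm the paper uses, and `composite` and `l1_photometric` are hypothetical names.

```python
def composite(colors, alphas):
    """Front-to-back alpha compositing of ordered fragments at one pixel.

    Returns the accumulated RGB color and the remaining transmittance.
    """
    out = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for color, alpha in zip(colors, alphas):
        weight = alpha * transmittance          # contribution of this fragment
        out = [o + weight * c for o, c in zip(out, color)]
        transmittance *= 1.0 - alpha            # light left for fragments behind
    return out, transmittance


def l1_photometric(rendered, observed):
    """Mean absolute difference over all pixels and channels (assumed norm)."""
    total, count = 0.0, 0
    for r_pixel, o_pixel in zip(rendered, observed):
        for r, o in zip(r_pixel, o_pixel):
            total += abs(r - o)
            count += 1
    return total / count
```

Two half-opaque fragments, red in front of green, give the color (0.5, 0.25, 0) with 0.25 transmittance remaining, which matches the usual compositing sum evaluated by hand.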

3. Camera Tracking with Photometric and Depth Alignment

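Camera tracking solves for $T_{CW}$ per frame by minimizing a joint energy:

$E_{\text{track}} = \lambda_{\text{photo}} L_{\text{photo}} + \lambda_{\text{depth}} L_{\text{depth}}$

where

$L_{\text{photo}}$ (Eq. 11) is the combined photometric/structural loss,
$L_{\text{depth}}$ (Eq. 12) aligns rendered and observed depths,
$\lambda_{\text{photo}}$ and $\lambda_{\text{depth}}$ are tunable hyperparameters.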

Approximately 100 gradient-descent steps are performed per frame, using analytic pose Jacobians for efficiency.
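The optimization pattern (a fixed number of gradient steps on a weighted sum of two terms with analytic gradients) can be shown on a toy problem. In this sketch the pose is reduced to a 3-vector and both terms are simple quadratics; the real energy is evaluated through the differentiable renderer, and `track_pose`, the weights, and the step size are all illustrative assumptions.

```python
def track_pose(t0, grad_photo, grad_depth, w_photo, w_depth, lr, steps=100):
    """Plain gradient descent on w_photo * L_photo + w_depth * L_depth.

    grad_photo and grad_depth return the analytic gradient of each term
    with respect to the (toy) pose parameters.
    """
    t = list(t0)
    for _ in range(steps):
        g_photo = grad_photo(t)
        g_depth = grad_depth(t)
        t = [ti - lr * (w_photo * gp + w_depth * gd)
             for ti, gp, gd in zip(t, g_photo, g_depth)]
    return t


# Toy quadratic terms: each pulls the pose toward a different target.
target_photo = [1.0, 2.0, 3.0]
target_depth = [3.0, 2.0, 1.0]
pose = track_pose(
    t0=[0.0, 0.0, 0.0],
    grad_photo=lambda t: [2.0 * (a - b) for a, b in zip(t, target_photo)],
    grad_depth=lambda t: [2.0 * (a - b) for a, b in zip(t, target_depth)],
    w_photo=0.5, w_depth=0.5, lr=0.1,
)
```

With equal weights the minimizer is the midpoint of the two targets, so `pose` converges to approximately (2, 2, 2) within the 100 steps.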

4. Online Mapping, Densification, and Optimization

Mapping proceeds whenever a new keyframe is added. New triangles are back-projected from depth and assigned spatial support and normals via sensor data (Eq. 14). Optimization minimizes a mapping energy over vertices, colors, opacities, and past keyframe poses. Its regularization terms include:

a normal term (Eq. 15) that penalizes normal misalignments,
a shape term (Eq. 16) that encourages triangle equilateralness.
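One plausible form of the two regularizers is sketched below in Python. These are not the paper's Eq. 15 and Eq. 16, whose exact definitions are not given in this summary: the normal term here is $1 - |n \cdot n_{\text{ref}}|$, and the shape term uses the common mesh-quality ratio $4\sqrt{3}\,A / (a^2 + b^2 + c^2)$, which equals 1 only for an equilateral triangle. All function names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(sum(x * x for x in a))


def normal_loss(tri, ref_normal):
    """Assumed normal term: 0 when the face normal is parallel to ref_normal."""
    n = _cross(_sub(tri[1], tri[0]), _sub(tri[2], tri[0]))
    cos = sum(a * b for a, b in zip(n, ref_normal)) / (_norm(n) * _norm(ref_normal))
    return 1.0 - abs(cos)  # abs() ignores the winding direction


def equilateral_loss(tri):
    """Assumed shape term: 0 for an equilateral triangle, 1 for a degenerate one."""
    area = 0.5 * _norm(_cross(_sub(tri[1], tri[0]), _sub(tri[2], tri[0])))
    edges_sq = sum(_norm(_sub(tri[i], tri[(i + 1) % 3])) ** 2 for i in range(3))
    return 1.0 - 4.0 * math.sqrt(3.0) * area / edges_sq
```

Both terms are bounded in [0, 1] and scale-invariant, which makes them easy to weight against rendering losses; whether the paper's terms share these properties is not stated here.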

Optimization uses Adam with separate learning rates for positions, colors, and poses. Densification (blur-split, Loop subdivision) and pruning (opacity- and area-based) maintain map quality and efficiency.
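Keeping one Adam state per parameter group is a simple way to give positions, colors, and poses their own learning rates. The sketch below implements the standard Adam update in plain Python; the learning-rate values are placeholders chosen for illustration and are not the paper's settings.

```python
import math


class Adam:
    """Standard Adam update for a flat list of floats."""

    def __init__(self, lr, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = None  # first-moment estimate
        self.v = None  # second-moment estimate
        self.t = 0     # step counter for bias correction

    def step(self, params, grads):
        if self.m is None:
            self.m = [0.0] * len(params)
            self.v = [0.0] * len(params)
        self.t += 1
        updated = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1.0 - self.beta2) * g * g
            m_hat = self.m[i] / (1.0 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1.0 - self.beta2 ** self.t)
            updated.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return updated


# One optimizer per group; the rates below are hypothetical placeholders.
optimizers = {
    "positions": Adam(lr=1e-3),
    "colors": Adam(lr=1e-2),
    "poses": Adam(lr=1e-4),
}
```

Because of bias correction, the first Adam step moves each parameter by roughly its learning rate in the direction opposite the gradient sign, regardless of gradient magnitude, which is why per-group rates directly control how fast each kind of parameter changes.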

5. On-the-Fly Mesh Extraction with Restricted Delaunay

To convert the triangle soup into a manifold mesh, restricted Delaunay triangulation is applied:

Vertices are selected by thresholding their mean opacity.
Delaunay tetrahedralisation is constructed in 3D. Only surface faces separating inside/outside are retained.
Triangles exceeding the projected area threshold or fully occluded from all keyframes are pruned.
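The face-selection step can be illustrated independently of the triangulation itself. The Python sketch below assumes a tetrahedralization and a per-tetrahedron inside/outside label are already available (how the paper derives those labels is not covered in this summary) and keeps exactly the faces that separate an inside cell from an outside one. The function name and the convention that a missing neighbour counts as outside are assumptions.

```python
from collections import defaultdict


def surface_faces(tets, inside):
    """Return faces lying between an inside and an outside tetrahedron.

    tets   : list of 4-tuples of vertex indices
    inside : list of bools, one per tetrahedron
    Faces on the convex hull have no second tetrahedron; the missing
    neighbour is treated as outside (an assumption of this sketch).
    """
    owners = defaultdict(list)
    for t_idx, (a, b, c, d) in enumerate(tets):
        for face in ((a, b, c), (a, b, d), (a, c, d), (b, c, d)):
            owners[tuple(sorted(face))].append(t_idx)

    surface = []
    for face, tets_on_face in owners.items():
        labels = [inside[t] for t in tets_on_face]
        if len(labels) == 1:
            labels.append(False)
        if labels[0] != labels[1]:
            surface.append(face)
    return sorted(surface)
```

For two tetrahedra sharing the face (1, 2, 3), labelling only the first as inside returns its four faces, while labelling both as inside returns the six outer faces and drops the shared one. Face orientation is ignored here; a real mesh extractor would also fix a consistent winding.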

Incremental mesh updating is supported.

This allows efficient online mesh extraction and supports real-time mesh-based editing, deformation, and collision checking.

6. Implementation, Hyperparameters, and System Characteristics

The implementation utilizes a custom CUDA/C++ differentiable rasterizer for triangle splatting, with the SLAM loop managed in PyTorch. Hardware used includes an NVIDIA RTX 4090 GPU and AMD Ryzen 9 9950X CPU. Operational metrics are:

Frame time: 430–1225 ms (0.8–2.3 FPS on TUM-RGBD).
Map sizes: 24k–152k triangles; 4.4–16.4 MB checkpoint size; 0.5–1.25 GB GPU memory.
Hyperparameters (Replica benchmarks): 100 tracking iterations per frame; separate learning rates for rotation, translation, and mapping features and vertices; four loss weights; keyframing at 5-frame intervals, translation threshold 0.08 m, overlap 0.95; densification and pruning thresholds as specified in the paper.
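The keyframing values listed above can be collected into a small configuration object with a decision rule. The numbers (5-frame interval, 0.08 m, 0.95 overlap) come from the list; how the three criteria are combined is an assumption of this Python sketch, as are the names `KeyframeConfig` and `should_add_keyframe`.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyframeConfig:
    interval: int = 5                      # minimum frames between keyframes
    translation_threshold_m: float = 0.08  # camera translation since last keyframe
    overlap_threshold: float = 0.95        # visible-triangle overlap ratio


def should_add_keyframe(frame_idx, last_kf_idx, t_cur, t_last, overlap,
                        cfg=KeyframeConfig()):
    """Assumed rule: wait for the interval, then trigger on motion or low overlap."""
    if frame_idx - last_kf_idx < cfg.interval:
        return False
    moved = math.dist(t_cur, t_last) > cfg.translation_threshold_m
    low_overlap = overlap < cfg.overlap_threshold
    return moved or low_overlap
```

For example, five frames after the last keyframe, a 0.10 m translation triggers a new keyframe even at 0.99 overlap, whereas a 0.01 m translation at the same overlap does not.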

7. Evaluation and Comparative Results

Triangle Splatting SLAM reports the following results.

Camera tracking (TUM-RGBD dataset; absolute trajectory error, cm):

Method	fr1/desk	fr2/xyz	fr3/office	Avg
MonoGS-2D	1.58	1.20	1.83	1.54
Ours	1.77	1.12	1.83	1.57
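For reference, absolute trajectory error is commonly reported as the root-mean-square distance between estimated and ground-truth camera positions. The Python sketch below assumes the two trajectories are already time-associated and aligned to a common frame, a step that benchmark tools normally perform first; the function name is hypothetical, and whether the paper reports RMSE or another statistic is not stated in this summary.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of position differences, in the same units as the inputs.

    Assumes equal-length, time-associated, pre-aligned trajectories.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))
```

Positions given in metres yield an error in metres, so a value such as 0.0157 m corresponds to the 1.57 cm scale used in the table.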

3D geometry (Replica; Chamfer distance in cm, L1 depth in cm):

Method	Chamfer Avg ↓	Depth L1 Avg ↓
MonoGS-2D* + TSDF	1.36	0.74
Ours + TSDF	0.95	0.68
Ours + Delaunay (pruned)	1.14	—
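Chamfer distance compares two point sets sampled from the reconstructed and reference surfaces. The brute-force Python sketch below uses the mean of the two one-way average nearest-neighbour distances; conventions differ between papers (sum versus mean, squared versus unsquared distances), so this particular form is an assumption and may not match the evaluation protocol behind the table.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two point sets (assumed convention).

    Brute force, O(len(a) * len(b)); suitable only for small examples.
    """
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (one_way(points_a, points_b) + one_way(points_b, points_a))
```

Two single points one unit apart give a distance of 1.0, and identical sets give 0.0. Practical evaluations replace the inner loop with a spatial index such as a k-d tree.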

Mesh extraction time (Replica; seconds, avg):

Method	Time [s] Avg ↓
Ours + TSDF	33.44
Ours + Delaunay	11.18
Ours + Delaunay (pruned)	15.66

The method provides live mapping with explicit, editable mesh geometry, supporting photorealistic novel-view rendering and mesh-based downstream tasks, while achieving state-of-the-art 3D geometric accuracy and camera-tracking comparable to established SLAM systems (Fry et al., 29 May 2026).

References (1)

Fry et al. Triangle Splatting SLAM. 29 May 2026.
