
VarSplat: Uncertainty-Aware 3D Gaussian SLAM

Updated 4 July 2026
  • VarSplat is an online dense RGB-D SLAM system that augments 3D Gaussian splatting with learned per-splat appearance variance for explicit uncertainty estimation.
  • It renders a differentiable per-pixel uncertainty map via the law of total variance to improve tracking, submap registration, and loop detection in challenging regions.
  • Experimental evaluations on Replica, TUM-RGBD, ScanNet, and ScanNet++ show state-of-the-art tracking accuracy and competitive reconstruction and rendering performance.

VarSplat is an uncertainty-aware 3D Gaussian Splatting system for online dense RGB-D SLAM. It augments each Gaussian with a learned per-splat appearance variance, renders a differentiable per-pixel uncertainty map by applying the law of total variance under alpha compositing, and uses that uncertainty to guide tracking, submap registration, and loop detection. The method is designed for failure regimes in which prior 3DGS-SLAM pipelines treat measurement reliability too implicitly, including low-texture regions, transparent surfaces, and scenes with complex reflectance, where local instability can accumulate into drift, ghosting, and unstable global alignment (Tran et al., 10 Mar 2026).

1. Problem setting and system scope

VarSplat targets online dense RGB-D SLAM built on 3D Gaussian Splatting. As in recent 3DGS-SLAM systems, it maintains a map of Gaussians and estimates camera poses by differentiably rendering color and depth from that map. Its central premise is that existing 3DGS-SLAM approaches optimize photometric and geometric residuals without explicitly modeling when rendered appearance is trustworthy, even though reliability varies sharply across the scene (Tran et al., 10 Mar 2026).

The motivating failure modes are concrete. In low-texture regions, photometric residuals become uninformative or noisy. At depth discontinuities and occlusion boundaries, small pose changes alter visibility and alpha weights, which destabilizes rendered color and depth. Transparent, specular, reflective, and glossy surfaces violate simple deterministic color assumptions. These local issues propagate into tracking drift, submap registration errors, ghosting, and loop-closure instability. Relative to earlier uncertainty-aware SLAM systems, VarSplat is distinguished by treating appearance uncertainty produced directly by the 3DGS rasterizer as a first-class quantity, rather than limiting uncertainty modeling to geometric variance or relying on pretrained uncertainty predictors (Tran et al., 10 Mar 2026).

The system is submap-based. Each submap $P^s$ is a collection of Gaussians

$$P^s = \{G_i^s(\mu_i,\Sigma_i,\alpha_i,s_i,c_i,\sigma_i^2) \mid i = 1, \ldots, N^s \},$$

where $\mu_i \in \mathbb{R}^3$ is the mean position, $\Sigma_i \in \mathbb{R}^{3\times 3}$ is the covariance, $s_i \in \mathbb{R}^3$ is the scale, $\alpha_i \in \mathbb{R}$ is the opacity, $c_i \in \mathbb{R}^3$ is the color derived from spherical harmonics, and $\sigma_i^2 \in \mathbb{R}^3$ is the learned per-splat appearance variance. This variance is per-splat and per-channel, and it models uncertainty around mean color rather than spatial extent. The paper notes that the notation $\sigma^2$ is chosen to enforce positivity and follow conventional Gaussian-form uncertainty, but it does not clearly specify an explicit positivity-enforcing reparameterization or the initialization of $\sigma_i^2$ (Tran et al., 10 Mar 2026).

2. Uncertainty formulation and rendered variance

VarSplat inherits standard 3DGS alpha compositing. With front-to-back depth ordering, the transmittance and weights are

$$T_i = \prod_{j=1}^{i-1} (1 - \alpha_j), \qquad w_i = \alpha_i T_i.$$

Rendered color and depth are

$$\hat{C} = \sum_i w_i c_i, \qquad \hat{D} = \sum_i w_i d_i,$$

where $d_i$ is the camera-space depth of the projected Gaussian mean (Tran et al., 10 Mar 2026).
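As a minimal sketch of front-to-back compositing for a single pixel, the following assumes the usual 3DGS convention in which the weight of splat $i$ is its opacity times the transmittance left by the splats in front of it. The function name `composite`, the single color channel, and the toy inputs are illustrative assumptions, not the paper's rasterizer.

```python
def composite(alphas, colors, depths):
    """Front-to-back alpha compositing for one pixel.

    alphas, colors, depths: equal-length sequences sorted near-to-far,
    one entry per contributing splat (one color channel for brevity).
    """
    transmittance = 1.0  # T_1 = 1: nothing occludes the first splat
    weights = []
    color = 0.0
    depth = 0.0
    for a, c, d in zip(alphas, colors, depths):
        w = a * transmittance        # w_i = alpha_i * T_i
        weights.append(w)
        color += w * c               # accumulate rendered color
        depth += w * d               # accumulate rendered depth
        transmittance *= 1.0 - a     # T_{i+1} = T_i * (1 - alpha_i)
    return weights, color, depth


# Toy pixel with three splats; the last one is fully opaque.
weights, color, depth = composite(
    alphas=[0.5, 0.5, 1.0],
    colors=[1.0, 0.0, 0.5],
    depths=[1.0, 2.0, 4.0],
)
print(weights, color, depth)
```

Because the last splat is opaque, the weights in this toy example sum to one; in general they sum to at most one.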

Its distinctive contribution is the uncertainty map derived from the law of total variance,

$$\mathrm{Var}(X) = \mathbb{E}\big[\mathrm{Var}(X \mid Z)\big] + \mathrm{Var}\big(\mathbb{E}[X \mid Z]\big).$$

In VarSplat, $X$ is the pixel color and $Z$ indexes contributing splats. Conditioned on splat $i$,

$$\mathbb{E}[X \mid Z = i] = c_i, \qquad \mathrm{Var}(X \mid Z = i) = \sigma_i^2.$$

This yields the rendered per-pixel variance map, with $w_i$ the compositing weights and $\hat{C} = \sum_i w_i c_i$ the rendered color,

$$\hat{V} = \sum_i w_i \sigma_i^2 + \sum_i w_i c_i^2 - \hat{C}^2.$$

The decomposition has two terms. The within-component term,

$$\sum_i w_i \sigma_i^2,$$

captures learned uncertainty of individual splats. The between-component term,

$$\sum_i w_i c_i^2 - \hat{C}^2,$$

captures disagreement among overlapping splats. This makes the uncertainty map sensitive not only to intrinsically unreliable splats but also to ambiguous blending at occlusion boundaries, disocclusions, and reflective or transparent regions (Tran et al., 10 Mar 2026).

A practical property is that the variance map is rendered in the same alpha-compositing pass as color and depth. The rasterizer accumulates the required weighted sums during compositing, then forms the per-pixel variance by subtracting the squared rendered color from the accumulated second moment. This preserves single-pass rasterization efficiency and differentiability, and avoids Monte Carlo sampling or a separate uncertainty network (Tran et al., 10 Mar 2026).
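The single-pass accumulation can be sketched for one pixel and one color channel as follows. The function `render_pixel_variance` and its inputs are hypothetical. The sketch assumes the variance is formed as the weighted sum of per-splat variances plus the weighted second moment of color minus the squared rendered color, which is what the law of total variance gives when the compositing weights act as mixture weights.

```python
def render_pixel_variance(alphas, colors, variances):
    """Accumulate color and both variance terms in one compositing loop.

    Inputs are per-splat values sorted near-to-far for a single pixel.
    Returns (rendered color, within term, between term, total variance).
    """
    transmittance = 1.0
    sum_c = 0.0     # sum_i w_i * c_i
    sum_c2 = 0.0    # sum_i w_i * c_i^2  (second moment of color)
    sum_var = 0.0   # sum_i w_i * sigma_i^2
    for a, c, s2 in zip(alphas, colors, variances):
        w = a * transmittance
        sum_c += w * c
        sum_c2 += w * c * c
        sum_var += w * s2
        transmittance *= 1.0 - a
    within = sum_var                  # learned per-splat uncertainty
    between = sum_c2 - sum_c * sum_c  # disagreement among splats
    return sum_c, within, between, within + between


# Two splats that disagree on color: the between term is non-zero.
c_hat, within, between, total = render_pixel_variance(
    alphas=[0.5, 1.0], colors=[1.0, 0.0], variances=[0.25, 0.0]
)

# Two splats that agree on color: only the learned variance remains.
_, within_same, between_same, _ = render_pixel_variance(
    alphas=[0.5, 1.0], colors=[1.0, 1.0], variances=[0.25, 0.0]
)
print(total, between_same)
```

The second call illustrates why the between term responds to blending ambiguity: when overlapping splats agree on color it vanishes, regardless of their learned variances.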

3. Use of uncertainty in tracking, registration, and loop detection

VarSplat converts variance into confidence weights by median-centered log scaling. This produces a per-pixel confidence derived from the rendered variance map and a per-splat confidence derived from the learned $\sigma_i^2$. Larger-than-median variance yields smaller weight; smaller-than-median variance yields larger weight (Tran et al., 10 Mar 2026).
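The exact weighting function is not reproduced here, so the following is only one possible instance of median-centered log scaling. It is an assumption for illustration: log-variances are centered on their median and passed through a logistic function, so that the median variance maps to a weight of 0.5. The function name `confidence_from_variance` and the `eps` guard are hypothetical.

```python
import math
from statistics import median


def confidence_from_variance(variances, eps=1e-8):
    """Map variances to weights in (0, 1) that decrease with variance.

    ASSUMED form (not taken from the paper): logistic of the negative
    median-centered log-variance.
    """
    logs = [math.log(v + eps) for v in variances]
    center = median(logs)  # median-centering in log space
    return [1.0 / (1.0 + math.exp(l - center)) for l in logs]


conf = confidence_from_variance([0.01, 0.04, 0.16])
print(conf)
```

Any monotonically decreasing function of the centered log-variance would satisfy the qualitative behavior described above; the logistic is chosen here only because it keeps the weights bounded.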

In tracking, the current pose is estimated relative to the active submap using the rendered color, depth, and variance. The intended tracking objective weights photometric residuals by the per-pixel confidence while leaving depth residuals unweighted, with a scalar coefficient balancing color and depth. This design reflects the paper’s observation that RGB residuals are especially unstable under viewpoint change, low texture, and occlusion. Tracking further uses an inlier mask that removes pixels whose depth error exceeds a fixed multiple of the median depth error in the current frame and removes pixels with invalid depth. A soft alpha mask is also used. During tracking, variance is frozen and gradients are stopped through the variance-derived confidence, so pose optimization does not interfere with variance learning (Tran et al., 10 Mar 2026).
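A rough sketch of the masking and weighting logic in tracking is given below. The multiplier `k`, the balance coefficient `lam`, the use of absolute residuals, and taking the median over valid-depth pixels are all assumptions made for illustration; the soft alpha mask is omitted for brevity. Only the color term is multiplied by the confidence, mirroring the description above.

```python
from statistics import median


def tracking_loss(color_res, depth_res, conf, valid_depth, k=5.0, lam=0.5):
    """Confidence-weighted color loss plus unweighted depth loss.

    color_res, depth_res: per-pixel residuals (rendered minus observed).
    conf: per-pixel confidence, treated as a constant (no gradient).
    valid_depth: per-pixel booleans, False where the sensor depth is invalid.
    k, lam: hypothetical outlier multiplier and color/depth balance.
    """
    valid_errs = [abs(d) for d, ok in zip(depth_res, valid_depth) if ok]
    if not valid_errs:
        return 0.0
    threshold = k * median(valid_errs)  # inlier cutoff on depth error

    total, count = 0.0, 0
    for rc, rd, w, ok in zip(color_res, depth_res, conf, valid_depth):
        if not ok or abs(rd) > threshold:
            continue  # masked out: invalid depth or depth outlier
        total += lam * w * abs(rc) + (1.0 - lam) * abs(rd)
        count += 1
    return total / max(count, 1)


loss = tracking_loss(
    color_res=[0.1, 0.2, 0.9, 0.3],
    depth_res=[0.01, 0.02, 1.0, 0.02],
    conf=[1.0, 0.5, 1.0, 1.0],
    valid_depth=[True, True, True, False],
)
print(loss)
```

In this toy frame the third pixel is rejected as a depth outlier and the fourth for invalid depth, so only the first two pixels contribute.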

Registration after loop detection uses the same uncertainty principle: photometric residuals are weighted by the per-pixel confidence, depth residuals are left unweighted, and variance is fixed during registration. The paper attributes improved medium-range alignment and reduced ghosting between overlapping submaps to this weighting strategy (Tran et al., 10 Mar 2026).

Loop detection operates at submap level and uses per-splat variance rather than per-pixel variance. Following LoopSplat, keyframe descriptors provide candidate matches, and similarity is modulated by a reliability score computed from the per-splat variances of the submaps involved.

Submaps supported mainly by high-variance splats receive smaller reliability factors, which reduces their influence in loop matching. The paper states that this reduces false closures on repeated structure and improves long-range consistency. Supplementary details specify NetVLAD with VGG16-NetVLAD-Pitts30K weights from HLoc, followed by overlap-ratio filtering from front-end poses (Tran et al., 10 Mar 2026).

4. Optimization, map management, and implementation

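VarSplat jointly optimizes camera poses, Gaussian parameters, and $\sigma_i^2$ during mapping. The intended mapping objective is a weighted combination of a color term, a depth term, and a scale regularizer, with variance learned through the likelihood term described below.

The color term is the standard 3DGS combination of $L_1$ and SSIM,

$$L_{\text{color}} = (1 - \lambda_{\text{SSIM}})\,\lVert \hat{C} - C \rVert_1 + \lambda_{\text{SSIM}}\,\big(1 - \mathrm{SSIM}(\hat{C}, C)\big),$$

where $\hat{C}$ is the rendered image and $C$ the observed image. The depth term penalizes the residual between rendered and measured depth, and the regularizer constrains Gaussian scales, similar to GS-SLAM, although the precise variable definitions for that term are not fully clear from the paper text (Tran et al., 10 Mar 2026).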

Variance learning uses a Gaussian negative-log-likelihood-style term. In its standard per-pixel form, with $r$ the residual used for variance training and $\hat{V}$ the rendered variance,

$$L_{\text{var}} = \frac{1}{2}\left(\log \hat{V} + \frac{r^2}{\hat{V}}\right).$$

This is deliberately based on squared $L_2$ residuals rather than $L_1$, because the paper treats the rendered variance as a Gaussian variance. The derivative with respect to rendered variance is

$$\frac{\partial L_{\text{var}}}{\partial \hat{V}} = \frac{1}{2}\left(\frac{1}{\hat{V}} - \frac{r^2}{\hat{V}^2}\right),$$

and by chain rule, since each $\sigma_i^2$ enters $\hat{V}$ with compositing weight $w_i$,

$$\frac{\partial L_{\text{var}}}{\partial \sigma_i^2} = w_i\,\frac{\partial L_{\text{var}}}{\partial \hat{V}}.$$

Thus each splat’s variance is updated in proportion to its compositing weight. Mapping learns poses, Gaussian geometry and appearance, and $\sigma_i^2$ jointly; tracking and registration freeze variance; loop closure does not propagate gradients into variance because it occurs after submap construction (Tran et al., 10 Mar 2026).
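The gradient flow into per-splat variances can be checked numerically with a small sketch. It assumes the textbook Gaussian negative log-likelihood, half of log-variance plus squared residual over variance, and a rendered variance that is linear in each per-splat variance with the compositing weight as coefficient. The function names and toy numbers are illustrative, and the paper's exact residual definition may differ.

```python
import math


def nll(residual, variance):
    """Per-pixel Gaussian NLL, dropping the constant term."""
    return 0.5 * (math.log(variance) + residual * residual / variance)


def dnll_dvariance(residual, variance):
    """Analytic derivative of nll with respect to the rendered variance."""
    return 0.5 * (1.0 / variance - residual * residual / variance ** 2)


def per_splat_grads(residual, weights, splat_vars, between):
    """Chain rule: d nll / d sigma_i^2 = w_i * d nll / d V."""
    v = sum(w * s2 for w, s2 in zip(weights, splat_vars)) + between
    g = dnll_dvariance(residual, v)
    return v, [w * g for w in weights]


v, grads = per_splat_grads(
    residual=1.0, weights=[0.5, 0.25], splat_vars=[0.2, 0.4], between=0.05
)

# Central finite difference on the rendered variance as a sanity check.
h = 1e-6
numeric = (nll(1.0, v + h) - nll(1.0, v - h)) / (2 * h)
print(v, grads, numeric)
```

The negative gradients indicate that the variance is too small for the observed residual, so gradient descent raises it, and the first splat moves twice as fast as the second because its compositing weight is twice as large. The derivative vanishes when the rendered variance equals the squared residual.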

The submap-based pipeline initializes Gaussians by backprojecting RGB-D points from the first keyframe, adds Gaussians in unobserved regions or merges overlaps, and starts a new submap when camera motion exceeds a spatial threshold from the current submap centroid or accumulated tracking uncertainty passes a preset limit. Supplementary settings specify a distance threshold in meters and a second trigger threshold, with alternative fixed-frame heuristics for ScanNet and ScanNet++. New Gaussians are initialized with a fixed opacity and with scales taken from the nearest neighbor. Pruning thresholds differ between Replica and the other datasets. On ScanNet++, if the tracking loss exceeds a fixed multiple of the running average, the pose is reinitialized with ICP odometry (Tran et al., 10 Mar 2026).

The reported implementation uses Python 3.10, PyTorch 2.4.1, CUDA 12.6, and an NVIDIA A100 80GB GPU. The original 3DGS rasterizer and a depth-rendering extension are modified to propagate variance. The paper lists four default mapping weights and six dataset-specific tracking hyperparameters, with separate settings for Replica, TUM-RGBD, ScanNet, and ScanNet++ (Tran et al., 10 Mar 2026).

5. Experimental evaluation

VarSplat is evaluated on Replica, TUM-RGBD, ScanNet, and ScanNet++. Tracking is measured with ATE RMSE on keyframes; reconstruction with depth $L_1$ and mesh F1; rendering with PSNR, SSIM, and LPIPS; and ScanNet++ also reports novel-view synthesis PSNR. Baselines include SplaTAM, MonoGS, Gaussian-SLAM, LoopSplat, CG-SLAM, and Uni-SLAM (Tran et al., 10 Mar 2026).
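For readers unfamiliar with the tracking metric, ATE RMSE is the root-mean-square of per-keyframe translational errors between estimated and ground-truth camera positions. The sketch below assumes the two trajectories are already time-associated and rigidly aligned (evaluation tools usually perform that alignment first), and the function name `ate_rmse` is illustrative.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of translational error over matched keyframe positions.

    estimated, ground_truth: equal-length sequences of (x, y, z) positions,
    assumed to be expressed in the same, already aligned, coordinate frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equally long")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
        for p_est, p_gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Both keyframes are off by 5 cm (positions in meters).
rmse = ate_rmse(
    estimated=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    ground_truth=[(0.03, 0.04, 0.0), (1.0, 0.03, 0.04)],
)
print(rmse)
```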

The strongest tracking results appear on real-world datasets. On Replica, VarSplat reports the best average ATE RMSE (in cm) among the compared methods, ahead of LoopSplat, CG-SLAM, and Gaussian-SLAM. On ScanNet++, its average ATE RMSE is compared with those of LoopSplat and Gaussian-SLAM; the paper explicitly quantifies the improvement over the second-best method and emphasizes robustness on large-motion real-world sequences. On TUM-RGBD, the average is reported against LoopSplat, CG-SLAM, and MonoGS, and on ScanNet against Uni-SLAM, GO-SLAM, LoopSplat, and CG-SLAM (Tran et al., 10 Mar 2026).

Reconstruction and rendering remain competitive. On Replica, depth $L_1$ and mesh F1 are reported against LoopSplat. The paper uses this comparison to argue that uncertainty-aware weighting improves pose estimation without degrading mesh quality. Average input-view rendering scores (PSNR / SSIM / LPIPS) are reported on Replica, TUM-RGBD, and ScanNet. On ScanNet++ novel view synthesis, VarSplat's PSNR is compared with those of LoopSplat and Gaussian-SLAM (Tran et al., 10 Mar 2026).

Ablation results indicate that uncertainty contributes across the full SLAM stack. On ScanNet, the ablation compares ATE RMSE for five configurations: no uncertainty at all; uncertainty in tracking only; tracking plus loop detection; loop detection plus registration; and the full system using uncertainty in tracking, loop detection, and registration. A second ablation reports that the best variant freezes variance during tracking, includes the depth residual in variance training, and uses squared $L_2$ in the NLL term; removing any of these choices degrades performance.

Runtime measurements on Replica/Room0 on an A100 report per-frame and per-iteration times for mapping and tracking together with ATE, and compare them with LoopSplat's per-frame mapping and tracking times and its ATE. The paper presents the method as online rather than as strictly real-time in the conventional sense (Tran et al., 10 Mar 2026).

6. Relation to the broader splat literature and limitations

VarSplat occupies a specific niche within the Gaussian-splatting literature: uncertainty-aware online RGB-D SLAM. It is neither a physics-based appearance model nor a VR-oriented renderer, nor an interpretability framework. AstroSplat, for example, replaces the usual spherical-harmonic appearance computation with planetary reflectance models for rendering and reconstruction of small celestial bodies (Nolan et al., 12 Mar 2026). VRSplat targets virtual reality by combining Mini-Splatting, StopThePop, and Optimal Projection, together with a single-pass foveated rasterizer (Tu et al., 15 May 2025). XSPLAIN addresses ante-hoc interpretability for splat-based classification rather than SLAM, using prototype-based explanations over 3D Gaussian primitives (Galus et al., 10 Feb 2026). Splat-LOAM is LiDAR-native and geometry-first, using 2D Gaussian surface splats and spherical rasterization for LiDAR odometry and mapping (Giacomini et al., 21 Mar 2025). A plausible implication is that VarSplat should be read not as a generic reformulation of 3DGS, but as a renderer-level uncertainty extension specialized to RGB-D pose estimation and submap alignment.

The current limitations are explicit. The system still relies on depth-based Gaussian insertion, so performance is constrained when depth is sparse or missing. It models appearance uncertainty only, not a full joint appearance-and-geometry uncertainty. Learning and rendering variance adds computation and memory overhead. Experiments focus on mostly static scenes. Several implementation details remain underspecified in the paper text, notably the exact positivity parameterization for $\sigma_i^2$, the initialization of variance, and the formulation of global refinement after submap merging (Tran et al., 10 Mar 2026).

These caveats delimit the method’s scope. The principal technical novelty is the learned per-splat appearance variance together with the rendered uncertainty map, written in terms of the alpha-compositing weights $w_i$,

$$\hat{V} = \sum_i w_i \sigma_i^2 + \sum_i w_i c_i^2 - \Big(\sum_i w_i c_i\Big)^2,$$

which is then used coherently in tracking, registration, and loop detection. Within that scope, VarSplat provides a concrete formulation of how uncertainty can be made native to the 3DGS rasterizer rather than appended as an external predictor or reduced to depth variance alone (Tran et al., 10 Mar 2026).
