Floor-Aware Gaussian Splatting

Updated 29 October 2025
  • The paper demonstrates that floor-aware Gaussian splatting integrates geometric and semantic cues to improve scene consistency and rendering quality.
  • Methodologies combine global tri-plane features with localized grid contexts to accurately capture extensive floor regions and mitigate occlusion artifacts.
  • Practical enhancements include measurable gains in PSNR, SSIM, and mesh F1, ensuring coherent and detailed reconstructions in large-scale 3D scenes.

Floor-aware Gaussian splatting refers to a class of 3D scene representation and reconstruction techniques that incorporate specialized geometric, semantic, or contextual handling for floor regions or large-scale planar surfaces. These methods leverage the statistical and spatial properties of the floor and related structures to improve scene consistency, occupancy reasoning, and detail preservation. Contemporary approaches, as evidenced in SplatCo (Xiao et al., 23 May 2025), AutoOcc (Zhou et al., 7 Feb 2025), and SurfaceSplat (Gao et al., 21 Jul 2025), embed floor awareness at multiple levels—through global context modeling, semantic-guided splat placement, and geometry-constrained splatting pipelines.

1. Foundational Principles: Gaussian Splatting and Scene Layout

Gaussian splatting is a rendering paradigm in 3D scene representation that models surfaces and volumes using discrete, spatially distributed anisotropic 3D Gaussian functions. Each Gaussian encodes geometric position, covariance (shape and orientation), color, opacity, and possibly semantic or temporal information. Rendering is achieved via rasterization or accumulation of projected Gaussians, enabling high-fidelity novel view synthesis and flexible occupancy modeling.
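A minimal sketch of this primitive and the per-pixel accumulation step is given below, in Python with NumPy; the field names, the optional semantics slot, and the assumption that per-pixel Gaussian weights arrive already depth-sorted are illustrative choices, not the API of any specific 3DGS implementation.

```python
# Minimal sketch of a 3D Gaussian primitive and front-to-back alpha compositing.
# Field names and the simplified per-pixel interface are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class Gaussian3D:
    mean: np.ndarray                        # (3,) position in world space
    cov: np.ndarray                         # (3, 3) anisotropic covariance (shape/orientation)
    color: np.ndarray                       # (3,) RGB
    opacity: float                          # base opacity in [0, 1]
    semantics: Optional[np.ndarray] = None  # optional class-probability vector

def composite_pixel(gaussians, weights):
    """Accumulate one pixel's color by front-to-back alpha blending.

    `weights[i]` is the evaluated 2D footprint of projected Gaussian i at this
    pixel, with the Gaussians already sorted front to back by depth.
    """
    color = np.zeros(3)
    transmittance = 1.0
    for g, w in zip(gaussians, weights):
        alpha = min(g.opacity * w, 0.999)   # clamp alpha, as standard 3DGS rasterizers do
        color += transmittance * alpha * g.color
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:            # early termination once nearly opaque
            break
    return color
```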

In large-scale unbounded scenes, floor regions (ground planes, horizontal surfaces) present unique challenges:

  • Dominance in spatial layout, affecting global scene geometry.
  • Occlusion boundaries and visibility constraints, relevant for multi-view consistency.
  • Semantic importance for occupancy and navigation tasks.

Floor-aware extensions introduce specific mechanisms to account for these properties, yielding improvements in reconstruction fidelity and semantic reasoning.

2. Structural-Context Collaboration and Hierarchical Feature Fusion

Approaches such as SplatCo (Xiao et al., 23 May 2025) implement structure-aware splatting via collaborative global-local representations. Floor regions and large planar surfaces are captured using:

  • Global tri-plane features: Three orthogonal 2D feature planes (xy, xz, yz) encode broad spatial context, including the ground and extended floor layouts.
  • Local context grid features: Fine surface details (including floor textures) are modeled with grid-based interpolation anchored to local neighborhoods.

For a 3D point $\mathbf{p}$, the hierarchical feature is constructed as

$$f_h = \sum_{i \in L} \left[ f_t^i, f_c^i \right]$$

where $f_t^i$ arises from tri-plane (floor-informative) projections and $f_c^i$ aggregates local detail. Feature fusion occurs through a multi-level (coarse-to-fine) hierarchy, ensuring floor regions benefit both from holistic scene guidance and localized refinement. This mitigates misalignment artifacts at large planar boundaries and preserves floor-related surface detail.
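A minimal sketch of this hierarchical feature construction follows, under assumed tensor layouts (tri-plane features of shape (R, R, C), a local grid of shape (G, G, G, C)) and with nearest-neighbour lookup in place of the bilinear/trilinear interpolation a real pipeline would use.

```python
# Sketch of the hierarchical feature f_h = sum_{i in L} [f_t^i, f_c^i]:
# tri-plane (global) features concatenated with local grid features, summed
# over levels. Resolutions, channel counts, and nearest-neighbour sampling are
# illustrative assumptions.
import numpy as np

def sample_plane(plane, u, v):
    """Nearest-neighbour lookup on an (R, R, C) feature plane for normalized
    coordinates u, v in [0, 1]."""
    R = plane.shape[0]
    i = min(int(round(u * (R - 1))), R - 1)
    j = min(int(round(v * (R - 1))), R - 1)
    return plane[i, j]

def hierarchical_feature(p, levels):
    """p: (3,) point with coordinates normalized to [0, 1]^3.
    levels: list of dicts, each with tri-plane features 'xy', 'xz', 'yz'
    of shape (R, R, C) and a local grid 'grid' of shape (G, G, G, C);
    channel counts are assumed equal across levels so features can be summed."""
    p = np.asarray(p, dtype=float)
    x, y, z = p
    feats = []
    for lvl in levels:
        # Global tri-plane context: project p onto the three orthogonal planes.
        f_t = (sample_plane(lvl["xy"], x, y)
               + sample_plane(lvl["xz"], x, z)
               + sample_plane(lvl["yz"], y, z))
        # Local grid context: look up the cell containing p.
        G = lvl["grid"].shape[0]
        idx = np.minimum(np.round(p * (G - 1)).astype(int), G - 1)
        f_c = lvl["grid"][idx[0], idx[1], idx[2]]
        feats.append(np.concatenate([f_t, f_c]))  # [f_t^i, f_c^i]
    return np.sum(feats, axis=0)                  # sum over levels i in L
```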

3. Semantic and Geometric-Aware Floor Splatting

AutoOcc (OccGS) (Zhou et al., 7 Feb 2025) introduces semantic-aware Gaussians (SGAG), extending classical Gaussian splatting with scene-level semantics and dynamic object reasoning:

  • Floor-awareness is achieved via LiDAR-derived geometric cues, which often inherently define ground plane/floor surfaces. Gaussian placement and clustering adapt to floor geometry, ensuring robust foreground/background decoupling and low-cost representation of spatially extended regions.
  • Each SGAG encodes:
    • Center $o$ (frequently coinciding with floor anchors)
    • Covariance $\Sigma$ (planar orientation for floors)
    • Semantic probability vector $\gamma$ (high probability for floor/ground classes)

The cumulative Gaussian-to-voxel splatting algorithm transforms sparse SGAGs into dense occupancy maps, where floor voxels are adaptively assigned based on splat coverage and semantic cues:

$$\digamma(o) = \sum_{i=1}^{N} d_i\, G(x_i)\, \alpha_i\, \mathrm{softmax}(\gamma_i)$$

Semantic granularity for floors is controlled by adjusting Gaussian scale and density, yielding efficient and scene-inclusive reconstructions.
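The following sketch illustrates this cumulative Gaussian-to-voxel splatting; the dense per-voxel loop, the dictionary layout of each SGAG, and the role of the weight $d_i$ as a visibility/decay factor are simplifying assumptions rather than the paper's optimized implementation.

```python
# Illustrative sketch of cumulative Gaussian-to-voxel splatting: each voxel
# centre o accumulates sum_i d_i * G_i(o) * alpha_i * softmax(gamma_i) and
# takes the argmax class. Variable names and the dense double loop are
# assumptions made for clarity.
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def gaussian_density(o, center, cov_inv):
    d = o - center
    return float(np.exp(-0.5 * d @ cov_inv @ d))

def splat_to_voxels(voxel_centers, sgags, num_classes):
    """voxel_centers: (V, 3) array; sgags: list of dicts with keys
    'center' (3,), 'cov' (3, 3), 'alpha' (float), 'gamma' (num_classes,),
    'd' (float visibility/decay weight)."""
    occ = np.zeros((len(voxel_centers), num_classes))
    cov_invs = [np.linalg.inv(g["cov"]) for g in sgags]
    sems = [softmax(g["gamma"]) for g in sgags]
    for v, o in enumerate(voxel_centers):
        for g, cov_inv, sem in zip(sgags, cov_invs, sems):
            w = g["d"] * gaussian_density(o, g["center"], cov_inv) * g["alpha"]
            occ[v] += w * sem
    labels = occ.argmax(axis=1)  # e.g. floor/ground voxels where that class dominates
    return occ, labels
```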

4. Geometry-Constrained Initialization and Cleaning

SurfaceSplat (Gao et al., 21 Jul 2025) demonstrates that reliably reconstructing floor and large planar regions benefits from hybrid pipelines combining SDF-based global geometry estimation with 3DGS local detail preservation:

  • SDF-based surface meshes are extracted and cleaned (connected-component filtering), promoting consistent floor topology and suppressing fragmentation due to sparse view sampling.
  • Surface point sampling prioritizes points visible from input views, which systematically privileges the floor due to its typical omnipresence.
  • 3DGS initialization then inherits this cleaned geometry, ensuring floor splats accurately reflect scene-level layout.

A plausible implication is that initializing Gaussian splats from SDF-derived meshes produces geometry-aware and floor-coherent 3DGS distributions, reducing typical weaknesses of fragmented or discontinuous floor modeling in pure splatting approaches.
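A plausible sketch of this cleaning-and-initialization pipeline is shown below; the use of the `trimesh` library, the minimum component size, and the scale heuristic for plane-aligned seed splats are assumptions made for illustration, not the paper's exact procedure.

```python
# Plausible sketch of geometry-constrained 3DGS initialization from an
# SDF-extracted mesh: connected-component cleaning, surface sampling, and
# plane-aligned seed scales. The `trimesh` dependency, `min_faces`, and the
# scale heuristic are illustrative assumptions.
import numpy as np
import trimesh

def init_gaussians_from_sdf_mesh(vertices, faces, n_points=100_000, min_faces=500):
    mesh = trimesh.Trimesh(vertices=vertices, faces=faces, process=True)

    # Connected-component filtering: drop small floating fragments so large
    # planar structures such as the floor keep a consistent topology.
    parts = [m for m in mesh.split(only_watertight=False) if len(m.faces) >= min_faces]
    cleaned = trimesh.util.concatenate(parts) if parts else mesh

    # Surface sampling: on typical captures the floor receives many samples
    # simply because of its large visible area.
    points, face_idx = trimesh.sample.sample_surface(cleaned, n_points)
    normals = cleaned.face_normals[face_idx]

    # Seed scales: two tangential extents plus a thin extent intended for the
    # normal direction (a rotation built from `normals` would be applied when
    # constructing the actual covariances downstream).
    mean_spacing = float(np.sqrt(cleaned.area / n_points))
    scales = np.tile([mean_spacing, mean_spacing, 0.1 * mean_spacing], (n_points, 1))
    return {"means": points, "normals": normals, "scales": scales}
```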

5. Cross-View Consistency, Densification, and Pruning

Floor-aware splatting pipelines enforce multi-view consistency and optimal density around floor regions through:

  • Multi-view gradient synchronization (SplatCo): All Gaussian attributes (including those associated with floor) are jointly updated across multiple perspectives, stabilizing planar boundaries and avoiding view-dependent floor artifacts.
  • Visibility-aware densification: Camera layout and inter-view distances dictate adaptive splat density. For floor regions observed from widely separated views (low overlap), the densification threshold $\hat{\beta}$ is halved, encouraging additional splats and complete coverage (a code sketch follows at the end of this section):

$$\hat{\beta} = \frac{\beta}{2}\, H\!\left(\frac{r_i}{\tau} - 1\right) + \beta \left(1 - H\!\left(\frac{r_i}{\tau} - 1\right)\right)$$

  • Structural-consistency-guided pruning: Gaussians are pruned based on proximity to the camera, mean position, and cross-view SSIM metrics, eliminating redundant floor splats and suppressing artifacts in regions where the camera is close to the floor.

These mechanisms result in floor regions that are both well-represented and resistant to occlusion-driven or overspecialized splat redundancy.
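The visibility-aware densification rule above can be sketched as follows; interpreting $H$ as the Heaviside step function, $r_i$ as a per-region view-separation ratio with cutoff $\tau$, and the demonstration values are assumptions consistent with the formula rather than details taken from the paper.

```python
# Minimal sketch of the visibility-aware densification threshold
# beta_hat = (beta/2) * H(r/tau - 1) + beta * (1 - H(r/tau - 1)).
# H, r, tau, and the demo values are illustrative assumptions.
import numpy as np

def heaviside(x):
    return np.where(x >= 0.0, 1.0, 0.0)

def densify_threshold(r, beta, tau):
    """Regions seen from widely separated views (r > tau) get the halved
    threshold, so they densify more readily and the floor stays covered."""
    h = heaviside(r / tau - 1.0)
    return 0.5 * beta * h + beta * (1.0 - h)

def should_densify(grad_norm, r, beta=0.0002, tau=2.0):
    """Densify a Gaussian when its view-space gradient exceeds beta_hat."""
    return grad_norm > densify_threshold(r, beta, tau)

# Example: the same gradient triggers densification only in the low-overlap region.
print(should_densify(0.00015, r=3.0))  # True  (threshold halved to 1e-4)
print(should_densify(0.00015, r=1.0))  # False (threshold stays at 2e-4)
```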

6. Quantitative and Qualitative Impact

The empirical impact of floor-aware Gaussian splatting is visible across large-scale scene rendering and semantic occupancy benchmarks:

  • SplatCo (Xiao et al., 23 May 2025) achieves PSNR improvements of 1–2 dB and SSIM gains of 0.1–0.2 over previous methods, with the floor and ground structures rendered sharply and with minimal fragmentation.
  • AutoOcc (Zhou et al., 7 Feb 2025) attains mIoU of 30.26 (Occ3D-nuScenes) and sustains high performance under cross-dataset zero-shot evaluation, reliably reconstructing floor geometry and occupancy in diverse adversarial scenarios.
  • SurfaceSplat (Gao et al., 21 Jul 2025) delivers significant improvements in mesh F1 (up to 86.7), CD (down to 9.7 mm), and PSNR-F (up to 20.45), with floor regions consistently exhibiting polygonal and photometric coherence.

Qualitative reconstructions display clean floor surfaces that persist under sparse inputs and multi-view occlusion while retaining detailed texture and structural fidelity.

7. Future Implications and Modular Extensions

The modular structure of floor-aware Gaussian splatting—via feature hierarchy, semantic-geometric integration, and hybrid pipelines—suggests extensibility to broader scene-scale constraints:

  • More explicit floor segmentation or labeling as a regularization term within multi-view splat optimization.
  • Dynamic floor occupancy modeling using temporal tokens in SGAG, enabling robust navigation and obstacle reasoning, especially in autonomous systems.
  • Integration with layout consistency priors (e.g., Manhattan World assumption), refining spatial regularity and further enhancing floor coherence in dense or highly-structured environments.

These directions offer pathways for incorporating architectural, urban, or interior scene priors into future scene-level Gaussian splatting frameworks.


Floor-aware Gaussian splatting unifies geometric, semantic, and contextual reasoning to yield robust, detail-preserving, and globally coherent 3D scene reconstructions. By leveraging scene layout priors, vision-language semantic guidance, and hybrid mesh-splat initialization, these approaches systematically improve the fidelity, consistency, and efficiency of rendering and occupancy annotation, particularly in challenging multi-view, large-scale, or zero-shot deployment scenarios.
