- The paper introduces Neural Texture Splatting, integrating local RGBA texture fields with a global neural field to enhance the expressiveness of 3D Gaussian Splatting.
- It demonstrates improved view synthesis, geometry modeling, and dynamic reconstruction with state-of-the-art metrics, including up to a +3 dB PSNR gain on dynamic scenes.
- It mitigates overfitting and redundancy through a shared global neural field, using Canonical Polyadic decomposition and L1 regularization to compress local textures efficiently while preserving global consistency.
Neural Texture Splatting: Enhancing 3D Gaussian Splatting for Robust View Synthesis, Geometry, and Dynamic Reconstruction
Introduction and Motivation
3D Gaussian Splatting (3DGS) has become a dominant paradigm for explicit point-based radiance field rendering, delivering high-quality, real-time visualization and enabling extensions to surface modeling, sparse-input reconstruction, and 4D dynamics. However, the per-primitive expressiveness of vanilla 3DGS is limited: each splat is a smooth Gaussian kernel carrying only a single (view-dependent) color. Augmenting local capacity, most notably through per-splat textures, has recently improved scene fidelity for novel-view synthesis, but it introduces significant challenges: the textures are view- and time-independent, redundant across primitives, and prone to overfitting, especially in dense training regimes and at reduced primitive counts.
This work introduces Neural Texture Splatting (NTS), which unifies and substantially extends textured 3DGS methods by equipping each splat with a local RGBA texture field whose structure is regularized by a global neural field. The architecture enables compact, shared modeling of per-splat appearance and geometric variation, conditions explicitly on view direction and time, and generalizes well under both sparse and dense input conditions.
Methodology
NTS associates each 3D Gaussian primitive with a local RGBA tri-plane texture field that captures high-frequency local appearance and geometry. The rendering pipeline first computes the ray-primitive intersection (using techniques such as those in [Yu2024GOF]), then queries the corresponding local texture field to modulate the splat's color and opacity. The rendering integral is extended to include these texture contributions, improving local expressiveness.
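The following is a minimal sketch of the local texture lookup, assuming a PyTorch setting; the function name `query_local_triplane`, the tensor shapes, the normalization of intersection points to the splat's local frame, and the sum-reduction over planes are illustrative assumptions rather than the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def query_local_triplane(planes: torch.Tensor, p_local: torch.Tensor) -> torch.Tensor:
    """Query a per-splat RGBA tri-plane texture at local intersection points.

    planes:  (3, C, R, R)  -- feature planes for the XY, XZ, YZ projections
    p_local: (N, 3)        -- intersection points in the splat's local frame,
                              assumed normalized to [-1, 1]
    returns: (N, C)        -- per-point features (e.g. C=4 for RGBA)
    """
    # Project each 3D point onto the three axis-aligned planes.
    coords = torch.stack([p_local[:, [0, 1]],   # XY
                          p_local[:, [0, 2]],   # XZ
                          p_local[:, [1, 2]]])  # YZ  -> (3, N, 2)
    # grid_sample expects (B, C, H, W) input and (B, H_out, W_out, 2) grids.
    grid = coords.unsqueeze(2)                                # (3, N, 1, 2)
    feats = F.grid_sample(planes, grid, align_corners=True)   # (3, C, N, 1)
    # Sum the three plane contributions, the standard tri-plane reduction.
    return feats.squeeze(-1).sum(dim=0).transpose(0, 1)       # (N, C)
```

The returned RGBA values would then modulate the splat's base color and opacity inside the compositing integral, as in the equation below.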
Figure 1: Overview of Neural Texture Splatting, showing integration of local textured tri-planes with global neural tri-plane encoding for each splat and the resultant compositional rendering pipeline.
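Concretely, the extended rendering integral can be sketched as the standard 3DGS front-to-back blend with texture-modulated terms. The multiplicative form of the modulation is an illustrative assumption; the paper specifies only that the local textures modulate color and opacity.

```latex
% Standard 3DGS front-to-back compositing with texture-modulated terms.
% The multiplicative modulation below is an illustrative assumption.
C(\mathbf{r}) = \sum_{i} \hat{c}_i\, \hat{\alpha}_i \prod_{j<i} \left(1 - \hat{\alpha}_j\right),
\qquad
\hat{c}_i = c_i \odot f_i^{\mathrm{rgb}}(\mathbf{x}_i),
\qquad
\hat{\alpha}_i = \alpha_i \cdot f_i^{\alpha}(\mathbf{x}_i)
```

Here $\mathbf{x}_i$ is the ray-primitive intersection point and $f_i$ is the splat's local RGBA texture field.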
Critically, a global neural field mitigates the overfitting and spatial-inconsistency risks inherent to per-primitive textured splatting. The field is a hybrid of global tri-plane features and a shallow neural decoder: each splat queries it at its center position, with additional conditioning on view direction and timestep, and receives the parameters of its local RGBA tri-plane texture. For efficiency, plane decoding uses a Canonical Polyadic (CP) decomposition, and regularizers including L1 sparsity are applied; see the sketch below.
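As a hedged illustration of how a shared decoder can emit CP-factorized local planes, consider the sketch below. The class name `GlobalTextureField`, layer sizes, rank, conditioning embeddings, and factor layout are assumptions for exposition, not the paper's verbatim architecture.

```python
import torch
import torch.nn as nn

class GlobalTextureField(nn.Module):
    """Sketch: a shared global field that emits per-splat local tri-plane
    textures as rank-r CP factors. Sizes and conditioning are illustrative."""

    def __init__(self, feat_dim=32, rank=4, res=8, channels=4, t_dim=8):
        super().__init__()
        self.rank, self.res, self.channels = rank, res, channels
        # Shallow decoder: global feature + view dir + time -> CP factors.
        # Per plane, rank * (channels + res + res) numbers; three planes total.
        out_dim = 3 * rank * (channels + 2 * res)
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3 + t_dim, 64), nn.ReLU(),
            nn.Linear(64, out_dim),
        )

    def forward(self, g_feat, view_dir, t_emb):
        """g_feat: (N, feat_dim) global tri-plane features at splat centers;
        view_dir: (N, 3); t_emb: (N, t_dim). Returns planes (N, 3, C, R, R)."""
        N, (r, R, C) = g_feat.shape[0], (self.rank, self.res, self.channels)
        f = self.mlp(torch.cat([g_feat, view_dir, t_emb], dim=-1))
        u, v, w = f.split([3 * r * C, 3 * r * R, 3 * r * R], dim=-1)
        u = u.view(N, 3, r, C, 1, 1)   # channel factors
        v = v.view(N, 3, r, 1, R, 1)   # row factors
        w = w.view(N, 3, r, 1, 1, R)   # column factors
        # Rank-r CP reconstruction: sum of outer products per plane.
        return (u * v * w).sum(dim=2)  # (N, 3, C, R, R)
```

An L1 penalty on the predicted factors (e.g., `f.abs().mean()`) is one plausible placement for the sparsity regularization described above.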
Empirical Results
Sparse-View and Dynamic Reconstruction
NTS is evaluated against state-of-the-art methods on both static and dynamic sparse-view benchmarks (Blender, Owlii), outperforming the previous best (SplatFields) in PSNR, SSIM, and LPIPS. Quantitative gains of up to roughly 3 dB PSNR on dynamic 4-view Owlii scenes highlight the role of neural conditioning in capturing time-dependent appearance, which traditional per-primitive texture models cannot achieve.
Figure 2: Qualitative comparison on MipNeRF360; the method preserves high-frequency, view-dependent effects and fine structures missed by baselines.
Extensive ablations demonstrate NTS's superiority over naive textured splatting and alternative neural decoders (e.g., direct RGB or spherical harmonics prediction), corroborating the architectural choices. These gains persist across input densities and scene types.
Dense-View Synthesis and Surface Reconstruction
NTS is integrated with GOF and 3DGS-MCMC for dense-view scene synthesis, yielding measurable improvements in PSNR (up to +0.65 dB over GOF on Blender), SSIM, and mesh quality. Surface extraction on DTU achieves higher accuracy and lower Chamfer distance than the original GOF, confirming that NTS's high-frequency modeling does not compromise geometric regularity.
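For reference, the symmetric Chamfer distance between a reconstructed point set $P$ and ground truth $G$ is commonly computed as below; DTU's official protocol additionally applies distance thresholds, so this is the generic form rather than the benchmark's exact evaluation script.

```latex
\mathrm{CD}(P, G) = \frac{1}{|P|} \sum_{p \in P} \min_{g \in G} \lVert p - g \rVert_2
                  + \frac{1}{|G|} \sum_{g \in G} \min_{p \in P} \lVert g - p \rVert_2
```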
Figure 3: Improved surface reconstruction on DTU: NTS-driven models generate smoother, more complete meshes across scenes.
Computational and Storage Analysis
NTS maintains a reasonable model and storage footprint by using the global neural prior to compress local textures. Training and rendering incur additional computational cost, primarily from network evaluation and ray-Gaussian intersection tests, but this is offset by substantial gains in reconstruction quality. Compared with other textured-splatting approaches, NTS achieves higher PSNR at lower storage cost.
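A back-of-the-envelope comparison illustrates why CP factorization keeps per-splat storage small; the resolution, channel count, and rank below are illustrative assumptions, not the paper's configuration.

```python
# Per-splat storage: dense local tri-plane vs. rank-r CP factorization.
# R, C, r are illustrative assumptions, not the paper's configuration.
R, C, r = 8, 4, 4                # plane resolution, RGBA channels, CP rank

dense_params = 3 * C * R * R     # three full C x R x R feature planes
cp_params = 3 * r * (C + 2 * R)  # three sets of rank-r CP factors

print(dense_params, cp_params)   # 768 vs. 240 -> ~3.2x fewer parameters
```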
Implications and Future Directions
From a practical perspective, NTS offers a plug-and-play upgrade for existing 3DGS-based pipelines across view synthesis, geometry modeling, and dynamic reconstruction, especially where input sparsity and view/temporal coherence are critical. Its neural texture encoding enables expressive, consistent modeling of view- and time-dependent phenomena, reducing reliance on dense sampling and suppressing artifacts such as "floaters".
On the theoretical front, NTS establishes a tractable route for reconciling local expressiveness with global consistency in point-based graphical models. It suggests that shared global neural fields can regularize local appearance priors sufficiently to generalize across diverse tasks and input regimes.
Moving forward, research may focus on further optimizing computational bottlenecks via more advanced ray-Gaussian intersection algorithms, lightweight neural architectures, or hierarchical encoding schemes. Extension to large-scale outdoor scenes may necessitate augmenting or redesigning the global encoding to capture more complex, longer-range spatial regularities.
Conclusion
Neural Texture Splatting represents a significant advance in explicit radiance field modeling, addressing key limitations of prior splatting frameworks: constrained primitive-level expressiveness, overfitting, and the lack of spatiotemporal modeling. Through local RGBA tri-plane textures conditioned by a global neural field, NTS consistently achieves state-of-the-art results in sparse- and dense-view synthesis, geometry, and 4D reconstruction. Its modular, efficient design is likely to inspire further work on expressive, generalizable 3D and 4D scene representations.