Neural Gabor Splatting: High-Frequency Scene Rendering
- Neural Gabor Splatting is an explicit scene representation that augments 3D Gaussian primitives with analytic Gabor waves or lightweight neural MLPs to model high-frequency, view-dependent textures.
- It employs differentiable volume rendering and a frequency-aware optimization strategy, enabling real-time performance and a significant reduction in required primitives.
- NGS achieves superior metrics in PSNR, SSIM, and LPIPS on high-frequency benchmarks, demonstrating efficient capacity utilization for intricate surface detail reconstruction.
Neural Gabor Splatting (NGS) is an explicit neural scene representation that lets each 3D Gaussian primitive model high-frequency spatial and view-dependent texture on its own. NGS addresses a critical limitation of prior 3D Gaussian Splatting (3DGS) approaches, where representing sharp appearance transitions required excessively many primitives, leading to redundancy, elevated memory requirements, and inefficiency. By embedding either analytic Gabor wave patterns or lightweight neural functions into each Gaussian, NGS enables faithful reconstruction and real-time rendering of scenes with intricate high-frequency details using orders of magnitude fewer elements than prior methods (Watanabe et al., 15 Apr 2025, Watanabe et al., 17 Apr 2026).
1. Mathematical Formulation
Each NGS primitive, or splat, is defined by a geometric 3D Gaussian combined with a parametric function capable of producing spatial oscillations to encode high-frequency color patterns. The canonical formulation is as follows.
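- Gaussian Envelope: Each splat is centered at $\mu \in \mathbb{R}^3$ with covariance $\Sigma \in \mathbb{R}^{3 \times 3}$. The envelope function at position $x$ is $G(x) = \exp\!\left(-\tfrac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)\right)$.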
- Local Parameterization: A local 2D plane around $\mu$ is defined by orthonormal directions $t_u, t_v$ and scale factors $s_u, s_v$, with coordinates $(u, v)$ such that $x = \mu + s_u u\, t_u + s_v v\, t_v$.
- High-frequency Term: In explicit Gabor Splatting, color is modulated by a sum of cosines over multiple preset, evenly spaced orientations, evaluated in the splat's local plane coordinates.
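- Neural Gabor (MLP) Formulation: Each splat's color is output by an MLP with sinusoidal activations (SIREN), taking as input $(u, v, d)$, with $(u, v)$ the local plane coordinates and $d \in \mathbb{R}^3$ the viewing direction:
$$c(u, v, d) = W_2 \sin\!\left(W_1 [u, v, d]^\top + b_1\right) + b_2$$
With hidden size $6$, this architecture enables encoding complex patterning with only $57$ parameters per primitive: $5 \cdot 6 + 6 = 36$ in the hidden layer and $6 \cdot 3 + 3 = 21$ in the output layer (Watanabe et al., 17 Apr 2026).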
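The per-splat quantities above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names, the `omega0` frequency scale, the random weights, and the absence of an output activation are all assumptions made for the example.

```python
import numpy as np

def gaussian_envelope(x, mu, cov):
    """Unnormalized 3D Gaussian envelope exp(-0.5 (x-mu)^T cov^-1 (x-mu))."""
    d = x - mu
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)))

def local_coords(x, mu, t_u, t_v, s_u, s_v):
    """Project x onto the splat's local plane and divide by the scale factors."""
    d = x - mu
    return float(d @ t_u) / s_u, float(d @ t_v) / s_v

def siren_color(u, v, view_dir, W1, b1, W2, b2, omega0=1.0):
    """One-hidden-layer sinusoidal MLP: (u, v, d) in R^5 -> color in R^3.
    omega0 is a hypothetical frequency scale; 1.0 gives plain sin(W1 z + b1)."""
    z = np.concatenate(([u, v], view_dir))   # 5-dimensional input
    h = np.sin(omega0 * (W1 @ z + b1))       # 6 hidden units
    return W2 @ h + b2                       # raw, unclamped color

# Illustrative random weights for a 5-6-3 network.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(6, 5)), np.zeros(6)
W2, b2 = rng.normal(size=(3, 6)), np.zeros(3)
n_params = W1.size + b1.size + W2.size + b2.size  # 30 + 6 + 18 + 3
```

Counting the entries of `W1`, `b1`, `W2`, and `b2` reproduces the per-primitive parameter budget quoted for a one-hidden-layer network with five inputs and six hidden neurons.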
2. Rendering Pipeline
NGS represents the scene as a union of $N$ splats, each carrying its spatial, color, and frequency (or neural) parameters. Rendering proceeds via differentiable accumulation along camera rays:
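- Volume Rendering Equation:
$$C = \sum_{i=1}^{N} T_i\, \alpha_i\, c_i$$
with
$$T_i = \prod_{j=1}^{i-1} (1 - \alpha_j)$$
where $c_i$ and $\alpha_i$ are the color and effective opacity of the $i$-th splat along the ray, ordered front to back, and $T_i$ is the accumulated transmittance.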
- Rasterization Variant: Instead of volume integration, the standard raster-then-blend scheme projects each splat to a 2D ellipse in screen space and composes their colors with opacity-aware alpha blending in (potentially depth-sorted) order.
- Color Evaluation: For neural-Gabor splats, pixel color is a sum of each splat's (opacity-weighted) MLP output at the relevant local coordinates and viewing direction $(u, v, d)$, modulated by the splat envelope and accumulated with associated transmittance.
This formulation is fully differentiable, permitting backpropagation for optimization of both geometry and per-splat high-frequency appearance (Watanabe et al., 15 Apr 2025, Watanabe et al., 17 Apr 2026).
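The accumulation step can be illustrated with a small front-to-back compositing routine. It is a sketch under the assumption that splats are already depth-sorted and that each `alpha` is the product of a splat's opacity and its envelope value at the pixel; the function name `composite` is chosen for the example.

```python
import numpy as np

def composite(colors, alphas):
    """Front-to-back blending: C = sum_i T_i * alpha_i * c_i,
    with T_i = prod_{j<i} (1 - alpha_j). Inputs are sorted near to far."""
    colors = np.asarray(colors, dtype=float)   # shape (N, 3)
    alphas = np.asarray(alphas, dtype=float)   # shape (N,)
    # Transmittance in front of each splat: exclusive cumulative product.
    T = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = T * alphas
    return weights @ colors, weights

# Two half-transparent splats: red in front of green.
pixel, weights = composite([[1, 0, 0], [0, 1, 0]], [0.5, 0.5])
```

Because every operation is a product or a sum, gradients flow to both the colors and the opacities, which is what makes end-to-end optimization of the splat parameters possible.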
3. Optimization and Densification Strategies
NGS leverages both conventional and frequency-aware optimization and densification.
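- Losses: Canonical supervision uses a combination of photometric error (e.g., $L_1$ or $L_2$), structural similarity loss (SSIM), and, optionally, geometric regularizers:
$$\mathcal{L} = \mathcal{L}_{\text{photo}} + \lambda_{\text{SSIM}}\, \mathcal{L}_{\text{SSIM}} + \lambda_{\text{geo}}\, \mathcal{L}_{\text{geo}}$$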
- Optimization: All splat parameters (geometry, envelope, color, MLP, frequency) are optimized end-to-end with Adam over tens of thousands of steps. Backpropagation through both spatial and frequency parameters ensures that learned Gabor/MLP frequencies lock onto dominant scene texture modes (Watanabe et al., 15 Apr 2025, Watanabe et al., 17 Apr 2026).
- Frequency-Aware Densification: Instead of relying solely on spatial or photometric error, NGS measures frequency-domain reconstruction error using a band-limited FFT-based procedure, mapping errors back to contributing primitives. Primitives with excess frequency error are cloned or split, inheriting MLP weights but resetting opacities to promote efficient capacity allocation where needed (Watanabe et al., 17 Apr 2026).
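One way to picture the frequency-aware criterion is to filter the rendering residual in the Fourier domain and map the surviving error back to pixels. The sketch below is an assumption-laden illustration, not the published procedure: the radial high-pass mask, the `cutoff` value, the grayscale input, and the `splat_pixels` lookup from splat id to covered pixels are all hypothetical.

```python
import numpy as np

def high_frequency_error(rendered, target, cutoff=0.25):
    """Per-pixel error restricted to spatial frequencies at or above `cutoff`
    (cycles per pixel; Nyquist is 0.5). Inputs are 2D grayscale arrays."""
    residual = np.asarray(rendered, float) - np.asarray(target, float)
    spectrum = np.fft.fft2(residual)
    fy = np.fft.fftfreq(residual.shape[0])[:, None]
    fx = np.fft.fftfreq(residual.shape[1])[None, :]
    mask = np.sqrt(fx**2 + fy**2) >= cutoff       # keep the high band only
    return np.abs(np.fft.ifft2(spectrum * mask))  # back on the pixel grid

def select_for_densification(error_map, splat_pixels, threshold):
    """Return ids of splats whose mean high-frequency error exceeds threshold.
    `splat_pixels` maps splat id -> (rows, cols) index arrays it covers."""
    return [sid for sid, (r, c) in splat_pixels.items()
            if error_map[r, c].mean() > threshold]
```

A constant brightness offset produces no high-band error under this filter, whereas a missing one-pixel stripe pattern passes through unchanged, so only splats covering under-resolved texture would be selected for cloning or splitting.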
4. High-frequency Surface Texture Modeling
NGS is specifically designed to address the inability of standard 3DGS or related explicit point/volumetric methods to efficiently represent sharp, repeated, or oscillatory surface patterns.
- Capacity Efficiency: Adding Gabor or neural components permits each splat to encode multiple cycles of high-frequency texture, drastically reducing the required number of primitives compared to standard Gaussians. Where plain 3DGS needs a splat count that grows as the stripe width shrinks, NGS can represent such texture with very few elements (Watanabe et al., 15 Apr 2025).
- MLP vs. Analytic Gabor: Neural MLPs can represent both spatially varying and view-dependent high-frequency phenomena more flexibly than fixed Gabor wave bases, and they outperform explicit Gabor splats in PSNR on demanding high-frequency benchmarks (Watanabe et al., 17 Apr 2026).
- Empirical Results: On synthetic high-frequency datasets and benchmarks such as Mip-NeRF360, NGS yields significant improvements in PSNR, SSIM, and LPIPS at fixed memory budgets relative to 3DGS and 2DGS. At equal primitive counts, NGS is reported to score better than 3DGS on all three metrics (Watanabe et al., 17 Apr 2026).
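The capacity argument can be made concrete in one dimension. In the sketch below, the unit-variance envelope, the frequency of 2 cycles per unit length, and the helper `sign_changes` are illustrative assumptions; the point is only that a single Gaussian-windowed cosine alternates sign many times, while a plain Gaussian never does.

```python
import numpy as np

u = np.linspace(-3.0, 3.0, 6001)
gaussian = np.exp(-0.5 * u**2)                    # plain Gaussian primitive
gabor = gaussian * np.cos(2 * np.pi * 2.0 * u)    # same envelope, 2 cycles per unit

def sign_changes(y):
    """Count how often a sampled signal switches between positive and negative."""
    s = np.sign(y)
    s = s[s != 0]                                  # ignore exact zeros
    return int(np.sum(s[1:] != s[:-1]))

stripes_in_gaussian = sign_changes(gaussian)
stripes_in_gabor = sign_changes(gabor)
```

Each sign change marks a stripe boundary, so reproducing the same pattern with non-oscillating Gaussians would take on the order of one primitive per stripe.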
5. Implementation and Rendering Performance
NGS is implemented following a standard photogrammetry and splat optimization pipeline, supporting high-quality and real-time applications.
- Initialization: Structure-from-motion (SfM, e.g., COLMAP) provides 3D point clouds and camera extrinsics. Splats are seeded on points, initialized with isotropic covariances and random local Gabor or MLP parameters.
- Optimization Regimen: Per-primitive parameters are trained for tens of thousands of iterations using batches of several thousand rays, with frequency-aware densification applied at regular intervals. In NGS, each MLP is a one-hidden-layer SIREN with six neurons and input dimension five.
- Rendering Rates: GPU rasterization, with each (possibly pruned) splat projected as a 2D ellipse and a per-pixel matrix inversion, enables interactive rates (tens to hundreds of FPS). On complex models, NGS, with its heavier per-splat neural evaluation, renders at a lower frame rate than 3DGS while remaining interactive (Watanabe et al., 17 Apr 2026).
- Resource Footprint: The increased per-splat compute and memory overhead due to embedded MLPs is offset by the drastic reduction in the number of required splats for complex surfaces.
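The initialization step can be sketched as follows, assuming one splat per SfM point. The function name `init_splats`, the default `scale`, the dictionary layout, and the uniform weight ranges (borrowed from common SIREN-style initialization) are assumptions for illustration; the source only states that covariances start isotropic and that the MLP parameters start random.

```python
import numpy as np

def init_splats(points, hidden=6, in_dim=5, out_dim=3, scale=0.01, seed=0):
    """Seed one splat per SfM point with an isotropic covariance and a
    randomly initialized per-splat MLP (5 inputs, 6 hidden units, 3 outputs)."""
    points = np.asarray(points, dtype=float)            # shape (N, 3)
    n = len(points)
    rng = np.random.default_rng(seed)
    # Same isotropic covariance scale^2 * I for every splat.
    cov = np.broadcast_to(scale**2 * np.eye(3), (n, 3, 3)).copy()
    return {
        "mu": points.copy(),
        "cov": cov,
        "W1": rng.uniform(-1 / in_dim, 1 / in_dim, size=(n, hidden, in_dim)),
        "b1": np.zeros((n, hidden)),
        "W2": rng.uniform(-1, 1, size=(n, out_dim, hidden)) * np.sqrt(6 / hidden),
        "b2": np.zeros((n, out_dim)),
    }

# Two hypothetical SfM points.
splats = init_splats([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
mlp_params_per_splat = sum(splats[k][0].size for k in ("W1", "b1", "W2", "b2"))
```

Storing the MLP weights as per-splat arrays makes the memory trade-off explicit: every primitive carries its own small network, so total appearance memory scales with the splat count times the per-splat parameter count.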
6. Empirical Evaluation and Ablations
NGS has been benchmarked extensively against canonical 3DGS and hybrid approaches on synthetic and real-world datasets for both reconstruction and novel-view synthesis.
- Datasets: High-frequency synthetic (checkerboard, fur), Mip-NeRF360 (seven scenes), DTU (four scenes), and Tanks&Temples were used, with rigorous assessment in terms of PSNR, SSIM, LPIPS, memory/primitive count, and FPS (Watanabe et al., 17 Apr 2026).
- Ablation Studies:
  - Frequency-aware densification localizes capacity to high-frequency error regions, giving PSNR/SSIM/LPIPS comparable to full-image error-based splits but with more targeted allocation.
  - Removing the view-direction input or reducing the MLP width lowers PSNR, highlighting the value of per-splat view dependence and capacity.
  - Compared to explicit Gabor basis splats, neural Gabor (MLP) splats consistently perform better in both metrics and qualitative fidelity.
- Budget-Quality Tradeoff: Under extreme reductions in primitive count (1–5% of baseline), NGS degrades gracefully, maintaining structure and appearance, while 3DGS and related baselines exhibit rapid collapse in fidelity.
7. Limitations and Prospective Directions
- Training Overhead: The inclusion of lightweight MLPs (~57 parameters per splat) approximately doubles training time relative to 2DGS, though rendering remains interactive (Watanabe et al., 17 Apr 2026).
- Low-frequency Scenes: For simple, low-frequency surfaces, the extra modeling power of MLPs is underutilized; adaptive activation or switching could be considered.
- Scope of Application: The current formulation is not tailored for volumetric phenomena (e.g., participating media, translucency) or dynamic (4D) scenes.
- Future Extensions: Directions include shared MLP codebooks to further reduce per-primitive memory, extension to dynamic scenes, and integration with learned geometry or hybrid volumetric/surface representations.
In summary, Neural Gabor Splatting provides a highly efficient, real-time capable explicit scene representation optimized for high-frequency surface detail, combining geometric, analytic, and neural modeling within a single differentiable framework (Watanabe et al., 15 Apr 2025, Watanabe et al., 17 Apr 2026).