Native HDR 3D Gaussian Splatting
- The paper introduces a novel luminance–chromaticity decomposition that decouples intensity from color for stable optimization in HDR 3D scene reconstruction.
- It leverages single-exposure linear HDR inputs to eliminate multi-exposure fusion, achieving state-of-the-art reconstruction quality with superior detail recovery.
- The pipeline integrates differentiable Gaussian splatting and advanced loss formulations to enable fast, physically-correct rendering of complex scenes.
Native High Dynamic Range 3D Gaussian Splatting (NH-3DGS) directly enables photorealistic 3D scene reconstruction from native high dynamic range (HDR) camera inputs, without reliance on multi-exposure merging or tone-mapping. NH-3DGS maintains the full, linear dynamic range throughout the entire pipeline—from raw HDR capture, through differentiable splatting and optimization, to final physically-correct rendering. The core technical contribution is a novel luminance–chromaticity decomposition of the per-Gaussian color representation, allowing direct and stable optimization in the high-dynamic-range regime. This approach achieves state-of-the-art HDR reconstruction quality on both synthetic and real datasets, with significant improvements in detail recovery and dynamic range preservation relative to previous LDR-based and multi-exposure HDR scene reconstruction methods (Zhang et al., 17 Nov 2025).
1. Pipeline for Native HDR 3D Gaussian Splatting
NH-3DGS operates on single-exposure linear HDR images, completely bypassing the multi-exposure or synthetic supervision constraints inherent to earlier HDR NeRF or tone-mapped 3DGS frameworks. The high-level sequence is as follows (Zhang et al., 17 Nov 2025):
- Capture: Acquire posed HDR frames using a native HDR imaging sensor or RAW camera, operating directly in the linear radiance domain. No multi-exposure fusion or post-capture tone-mapping is performed.
- Preprocessing: If the camera employs a nonlinear encoding, invert the response to recover linear radiance. For RAW inputs, the pipeline operates directly on Bayer-patterned sensor data, and pose estimation for all frames is performed via structure-from-motion/multi-view stereo (e.g., COLMAP).
- Gaussian Initialization: Distribute 3D Gaussians in the scene either by Poisson-disk sampling on a coarse mesh or via density clustering over a voxel grid. Each Gaussian is parameterized by (a minimal parameter-layout sketch appears after this list):
  - Mean position
  - 3×3 covariance (typically encoded as anisotropic scales and Euler rotations)
  - Scalar luminance $L_i$
  - Chromaticity spherical harmonic coefficients defining the view-dependent chromaticity $\boldsymbol{\chi}_i(\mathbf{d})$ (Section 2)
- Rendering & Differentiable Splatting: For each training image, ray-cast through pixels; splat each Gaussian's projected ellipse into the image using its mean and covariance. Compute the view-dependent color as discussed in Section 2. Accumulate opacities and colors in depth order using the standard fast gather or raster splatting scheme.
- Radiometric Compression & Loss: Before comparing predicted and ground-truth HDR images, apply a differentiable $\mu$-law operator to both images:
$$\tilde{I} = \frac{\log(1 + \mu I)}{\log(1 + \mu)}$$
Then minimize a convex combination of $\mathcal{L}_1$ and SSIM on the compressed values:
$$\mathcal{L} = (1-\lambda)\,\lVert \tilde{I} - \tilde{I}_{\mathrm{gt}} \rVert_1 + \lambda\,\bigl(1 - \mathrm{SSIM}(\tilde{I}, \tilde{I}_{\mathrm{gt}})\bigr)$$
For RAW camera datasets, simulate Bayer sampling and compute a pattern-aware SSIM.
- Optimization: All Gaussian parameters, and optionally fine camera-pose refinements, are optimized via Adam. Losses are computed in the appropriate radiometric and sensor domains.
- Final Rendering: At test time, render novel views using the optimized splats, accumulating full-range HDR outputs. Tone-mapping is optional and performed only for visualization, not for internal computation.
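For concreteness, the per-Gaussian parameter layout described above can be organized roughly as follows. This is a minimal PyTorch sketch under stated assumptions; the container name, SH degree, opacity parameter, and initialization values are illustrative, not the authors' implementation.

```python
import torch
from torch import nn

def init_gaussian_params(num_gaussians: int, sh_degree: int = 3) -> nn.ParameterDict:
    """Hypothetical parameter container for NH-3DGS-style Gaussians.

    The SH degree and initial values are assumptions; only the parameter groups
    (position, covariance factors, opacity, luminance, chromaticity SH) follow
    the description above.
    """
    n_sh = (sh_degree + 1) ** 2  # number of real SH basis functions
    return nn.ParameterDict({
        "means":      nn.Parameter(torch.zeros(num_gaussians, 3)),        # 3D mean positions
        "log_scales": nn.Parameter(torch.zeros(num_gaussians, 3)),        # anisotropic scales (log-space)
        "rotations":  nn.Parameter(torch.zeros(num_gaussians, 3)),        # Euler angles -> 3x3 covariance
        "opacity":    nn.Parameter(torch.zeros(num_gaussians, 1)),        # pre-activation opacity
        "luminance":  nn.Parameter(torch.ones(num_gaussians, 1)),         # scalar HDR luminance per Gaussian
        "chroma_sh":  nn.Parameter(torch.zeros(num_gaussians, n_sh, 3)),  # chromaticity SH coefficients
    })
```

Storing scales in log-space keeps the optimized covariance positive-definite, mirroring standard 3DGS practice.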
2. 3D Gaussian Splat Representation and Luminance–Chromaticity Decomposition
NH-3DGS departs from classical 3DGS in its decomposition of per-Gaussian color into luminance and chromaticity (Zhang et al., 17 Nov 2025):
- Vanilla 3DGS: Each Gaussian holds spherical harmonics (SH) coefficients for RGB color, combined into a viewing-direction-dependent color as
$$\mathbf{c}_i(\mathbf{d}) = \sum_{\ell=0}^{\ell_{\max}} \sum_{m=-\ell}^{\ell} \mathbf{c}_{i,\ell m}\, Y_{\ell m}(\mathbf{d}),$$
where $Y_{\ell m}$ are real SH basis functions evaluated at viewing direction $\mathbf{d}$.
- NH-3DGS: Instead, per-Gaussian appearance is factored as
$$\mathbf{c}_i(\mathbf{d}) = L_i \, \boldsymbol{\chi}_i(\mathbf{d}),$$
where $L_i$ is the scalar luminance, shared across color channels, and $\boldsymbol{\chi}_i(\mathbf{d})$ is the unit-normalized, view-dependent chromaticity represented by SH coefficients.
This design decouples the representation of overall radiance intensity (luminance) from the directionally varying color (chromaticity), resulting in more stable gradients during optimization and superior handling of scenes with severe variations in dynamic range and lighting (Zhang et al., 17 Nov 2025).
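The factorization can be made concrete with a short sketch. The degree-1 SH basis, the basis-evaluation routine, and the normalization scheme below are assumptions for illustration; only the form $\mathbf{c}_i(\mathbf{d}) = L_i\,\boldsymbol{\chi}_i(\mathbf{d})$ comes from the paper.

```python
import torch

def eval_sh_basis_deg1(dirs: torch.Tensor) -> torch.Tensor:
    """Real SH basis up to degree 1 for unit view directions of shape (N, 3)."""
    x, y, z = dirs.unbind(-1)
    c0 = 0.28209479177387814   # Y_00
    c1 = 0.4886025119029199    # degree-1 bands
    return torch.stack([torch.full_like(x, c0), -c1 * y, c1 * z, -c1 * x], dim=-1)  # (N, 4)

def gaussian_color(luminance: torch.Tensor, chroma_sh: torch.Tensor, view_dirs: torch.Tensor) -> torch.Tensor:
    """Per-Gaussian HDR color c_i(d) = L_i * chi_i(d).

    luminance : (N, 1)    scalar HDR luminance per Gaussian
    chroma_sh : (N, 4, 3) chromaticity SH coefficients (degree 1 here, for brevity)
    view_dirs : (N, 3)    unit viewing directions
    """
    basis = eval_sh_basis_deg1(view_dirs)                     # (N, 4)
    chroma = torch.einsum("nk,nkc->nc", basis, chroma_sh)     # (N, 3) raw chromaticity
    chroma = chroma.clamp_min(1e-6)
    chroma = chroma / chroma.sum(dim=-1, keepdim=True)        # one possible unit-normalization
    return luminance * chroma                                 # (N, 3) linear HDR color
```

Because the chromaticity is renormalized per Gaussian, gradients on overall intensity flow only through the luminance scalar, which is the stabilization effect described above.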
3. Loss Formulation and Optimization Strategy
The wide dynamic range of both the ground-truth (camera) images and the predictions necessitates special treatment for numerical stability and perceptual relevance.
- $\mu$-Law Compression: The $\mu$-law transformation compresses the wide range of radiance values (from deep shadow to bright highlight) for both target and prediction:
$$\tilde{I} = \frac{\log(1 + \mu I)}{\log(1 + \mu)}$$
- Composite Losses: The learning objective is a sum of compressed photometric and SSIM structural-similarity losses (a minimal implementation sketch follows this list):
$$\mathcal{L} = (1-\lambda)\,\lVert \tilde{I} - \tilde{I}_{\mathrm{gt}} \rVert_1 + \lambda\,\bigl(1 - \mathrm{SSIM}(\tilde{I}, \tilde{I}_{\mathrm{gt}})\bigr)$$
- RAW Sensor Handling: For RAW Bayer data, predicted RGB is converted to the RGGB pattern before loss evaluation, and a pattern-aware SSIM is employed (Zhang et al., 17 Nov 2025).
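Below is a minimal sketch of the compressed loss and the Bayer mosaicking step, assuming a $\mu$ value, SSIM weight $\lambda$, and RGGB layout that are not specified in this excerpt; `ssim` stands in for any differentiable SSIM implementation.

```python
import math
import torch

MU = 5000.0      # assumed compression constant; not specified here
LAMBDA = 0.2     # assumed SSIM weight in the convex combination

def mu_law(x: torch.Tensor, mu: float = MU) -> torch.Tensor:
    """Differentiable mu-law compression of linear HDR radiance (elementwise)."""
    return torch.log1p(mu * x) / math.log(1.0 + mu)

def hdr_loss(pred_hdr: torch.Tensor, gt_hdr: torch.Tensor, ssim) -> torch.Tensor:
    """Convex combination of L1 and (1 - SSIM) on mu-law-compressed images.

    pred_hdr, gt_hdr: (3, H, W) linear HDR radiance; ssim: any differentiable
    SSIM taking (B, C, H, W) tensors and returning a scalar.
    """
    p, g = mu_law(pred_hdr), mu_law(gt_hdr)
    l1 = (p - g).abs().mean()
    d_ssim = 1.0 - ssim(p.unsqueeze(0), g.unsqueeze(0))
    return (1.0 - LAMBDA) * l1 + LAMBDA * d_ssim

def to_rggb(rgb: torch.Tensor) -> torch.Tensor:
    """Mosaic a (3, H, W) RGB prediction into a single-channel RGGB Bayer image
    for comparison against RAW sensor data (the layout is an assumption)."""
    r, g, b = rgb
    mosaic = torch.empty_like(r)
    mosaic[0::2, 0::2] = r[0::2, 0::2]
    mosaic[0::2, 1::2] = g[0::2, 1::2]
    mosaic[1::2, 0::2] = g[1::2, 0::2]
    mosaic[1::2, 1::2] = b[1::2, 1::2]
    return mosaic
```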
NH-3DGS uses full-batch optimization, progressive ray oversampling, and Adam with learning rates set distinctly for luminance (0.05) and for the remaining parameters, converging in 20k–50k steps depending on scene complexity.
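One possible way to realize the distinct learning rates is with Adam parameter groups; only the luminance rate (0.05) is stated above, so the rate for the remaining parameters below is a placeholder.

```python
import torch

def build_optimizer(params) -> torch.optim.Adam:
    """params: the nn.ParameterDict sketched in Section 1.

    Only the luminance learning rate (0.05) is given in the text; the rate
    for all other parameters is a placeholder.
    """
    other = [p for name, p in params.items() if name != "luminance"]
    return torch.optim.Adam([
        {"params": [params["luminance"]], "lr": 0.05},
        {"params": other, "lr": 1e-3},
    ])
```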
4. Quantitative and Qualitative Performance
NH-3DGS achieves substantial improvements in HDR 3D reconstruction quality over prior methods:
| Dataset / Metric | NH-3DGS | HDR-GS / Mono-HDR-GS | HDR-NeRF | Vanilla 3DGS (HDR) |
|---|---|---|---|---|
| Syn-8S PSNR (dB) | Best; +1.5 to +6.6 over baselines | — | +3.4 | +6.6 |
| Syn-8S SSIM | Highest | — | +0.04 | +0.06 |
| RAW-4S PSNR (dB) | 34.98 | — | 18.94 | — |
| Inference speed (fps) | 233 | 126 / 137 | 0.12 | — |

Entries prefixed with "+" in a baseline column denote NH-3DGS's margin over that baseline; the remaining entries are absolute values.
NH-3DGS yields the highest fidelity on both synthetic and real RAW scenes, especially in challenging regions such as deep shadows and bright specular highlights. Qualitative results demonstrate preservation of shadow detail and absence of color casts in neutrals and highlights, which are not matched by prior multi-exposure or LDR-based HDR methods (Zhang et al., 17 Nov 2025).
5. Limitations and Future Extensions
NH-3DGS is constrained to static scenes with fixed illumination and requires accurate training-time camera pose initialization. Its current chromaticity formulation, while effective for typical illumination, may not capture highly anisotropic BRDFs or strong diffraction/specular effects—these would require either higher-order SHs or the integration of explicit microfacet or neural BRDF models (Zhang et al., 17 Nov 2025).
Potential research extensions include:
- Joint pose refinement within the NH-3DGS optimization
- Hybrid Gaussians + MLP branches for extreme view-dependent reflectance
- Adaptive sampling/densification to allocate splats in complex regions (e.g., shadow boundaries)
- Extensions to time-varying (dynamic) scenes and temporally consistent HDR splatting
- Handling mixed HDR/LDR capture via spatially-varying camera response estimation
6. Implementation Details and Comparative Context
NH-3DGS uses a modular PyTorch/CUDA implementation that extends the standard 3DGS codebase. Datasets are stored in .EXR or RAW format, with associated COLMAP camera poses and auxiliary scripts. A single NVIDIA RTX 4090 trains a full scene in 2–6 hours (scene dependent) and renders novel views at 200–250 fps.
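As a point of reference, linear HDR .EXR frames can be loaded into float tensors roughly as follows; `imageio` with an EXR-capable backend is one common option, and the authors' actual data loader is not described in this excerpt.

```python
import imageio.v3 as iio
import numpy as np
import torch

def load_hdr_exr(path: str) -> torch.Tensor:
    """Read a linear-radiance .EXR frame as a float32 (3, H, W) tensor, with no tone-mapping."""
    img = iio.imread(path).astype(np.float32)   # (H, W, 3), linear radiance
    return torch.from_numpy(img).permute(2, 0, 1).contiguous()
```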
Relative to alternative HDR 3DGS methods (HDR-GS (Wu et al., 2024), HDRSplat (Singh et al., 2024), and CasualHDRSplat (Gong et al., 24 Apr 2025)), NH-3DGS is the first to provide a direct, native linear HDR pipeline with an explicit luminance–chromaticity split, specialized for data from single-exposure HDR cameras rather than LDR or fused multi-exposure sequences (Zhang et al., 17 Nov 2025). Its approach to color representation and gradient stabilization is uniquely effective in full-range HDR conditions and remains competitive on both synthetic benchmarks and real-world scenes.