
Native HDR 3D Gaussian Splatting

Updated 18 November 2025
  • The paper introduces a novel luminance–chromaticity decomposition that decouples intensity from color for stable optimization in HDR 3D scene reconstruction.
  • It leverages single-exposure linear HDR inputs to eliminate multi-exposure fusion, achieving state-of-the-art reconstruction quality with superior detail recovery.
  • The pipeline integrates differentiable Gaussian splatting and advanced loss formulations to enable fast, physically-correct rendering of complex scenes.

Native High Dynamic Range 3D Gaussian Splatting (NH-3DGS) directly enables photorealistic 3D scene reconstruction from native high dynamic range (HDR) camera inputs, without reliance on multi-exposure merging or tone-mapping. NH-3DGS maintains the full, linear dynamic range throughout the entire pipeline—from raw HDR capture, through differentiable splatting and optimization, to final physically-correct rendering. The core technical contribution is a novel luminance–chromaticity decomposition of the per-Gaussian color representation, allowing direct and stable optimization in the high-dynamic-range regime. This approach achieves state-of-the-art HDR reconstruction quality on both synthetic and real datasets, with significant improvements in detail recovery and dynamic range preservation relative to previous LDR-based and multi-exposure HDR scene reconstruction methods (Zhang et al., 17 Nov 2025).

1. Pipeline for Native HDR 3D Gaussian Splatting

NH-3DGS operates on single-exposure linear HDR images, completely bypassing the multi-exposure or synthetic supervision constraints inherent to earlier HDR NeRF or tone-mapped 3DGS frameworks. The high-level sequence is as follows (Zhang et al., 17 Nov 2025):

  1. Capture: Acquire $N$ posed HDR frames $\{I_1, \dots, I_N\}$ using a native HDR imaging sensor or RAW camera, operating directly in the linear radiance domain. No multi-exposure fusion or post-capture tone-mapping is performed.
  2. Preprocessing: If the camera employs a nonlinear encoding, invert the response to recover linear radiance. For RAW inputs, the pipeline operates directly on Bayer-patterned sensor data, and pose estimation for all frames is performed via structure-from-motion/multi-view stereo (e.g., COLMAP).
  3. Gaussian Initialization: Distribute $M$ 3D Gaussians in the scene either by Poisson-disk sampling on a coarse mesh or via density clustering over a voxel grid. Each Gaussian $i$ is parameterized by:
    • Mean position $\mu_i \in \mathbb{R}^3$
    • $3 \times 3$ covariance $\Sigma_i$ (typically encoded as anisotropic scales and Euler rotations)
    • Scalar luminance $L_i > 0$
    • Chromaticity spherical-harmonic coefficients $\{b_{i,l}^m\}$ for $0 \leq l \leq L$, $-l \leq m \leq l$
  4. Rendering & Differentiable Splatting: For each training image, ray-cast through pixels; splat each Gaussian's projected ellipse into the image using its mean and covariance. Compute the view-dependent color ci(d)c_i(d) as discussed in Section 3. Accumulate opacities and colors in depth order using the standard fast gather or raster splatting scheme.
  5. Radiometric Compression & Loss: Before comparing predicted and ground-truth HDR images, apply a differentiable $\mu$-law operator to both images:

$$\tilde{I} = \frac{\log (1 + \mu \cdot I)}{\log (1+\mu)}, \quad \mu \approx 5000,$$

then minimize a convex combination of $L_1$ and SSIM losses on the compressed values:

$$\mathcal{L} = \lambda \, \mathcal{L}_1(\tilde{I}_\text{pred}, \tilde{I}_\text{gt}) + (1-\lambda) \, \mathcal{L}_\text{SSIM}(\tilde{I}_\text{pred}, \tilde{I}_\text{gt}), \quad \lambda \approx 0.2.$$

For RAW camera datasets, simulate Bayer sampling and compute a pattern-aware SSIM (a sketch of this mosaic step follows the list).

  6. Optimization: All parameters $\{\mu_i, \Sigma_i, L_i, b_{i,l}^m\}$ and, optionally, fine camera poses are optimized via Adam. Losses are computed in the appropriate radiometric and sensor domains.
  7. Final Rendering: At test time, render novel views using the optimized splats, accumulating full-range HDR outputs. Tone-mapping is optional and performed only for visualization, not for internal computation.
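Returning to the RAW branch of step 5, the following is a minimal PyTorch sketch of the Bayer simulation, assuming the RGGB layout named in Section 3 and a channels-first float image; the helper name and tensor shapes are illustrative, not the paper's code:

```python
import torch

def rgb_to_rggb_mosaic(rgb: torch.Tensor) -> torch.Tensor:
    """Resample a predicted linear RGB image (3, H, W) onto an RGGB Bayer
    mosaic (H, W) so it can be compared against raw sensor data.
    Assumes even H and W, with R at (0,0), G at (0,1) and (1,0), and
    B at (1,1) inside each 2x2 tile."""
    _, H, W = rgb.shape
    mosaic = torch.empty(H, W, dtype=rgb.dtype, device=rgb.device)
    mosaic[0::2, 0::2] = rgb[0, 0::2, 0::2]  # red sites
    mosaic[0::2, 1::2] = rgb[1, 0::2, 1::2]  # green sites, even rows
    mosaic[1::2, 0::2] = rgb[1, 1::2, 0::2]  # green sites, odd rows
    mosaic[1::2, 1::2] = rgb[2, 1::2, 1::2]  # blue sites
    return mosaic
```

The slice assignments are differentiable, so gradients flow from a mosaic-domain loss back to the rendered RGB prediction.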

2. 3D Gaussian Splat Representation and Luminance–Chromaticity Decomposition

NH-3DGS departs from classical 3DGS in its decomposition of per-Gaussian color into luminance and chromaticity (Zhang et al., 17 Nov 2025):

  • Vanilla 3DGS: Each Gaussian holds spherical harmonics (SH) coefficients $k_{i,l}^m \in \mathbb{R}^3$ for RGB color, combined into a viewing-direction-dependent color as

$$c_i(d) = \sum_{l=0}^{L} \sum_{m=-l}^{l} k_{i,l}^m \, Y_l^m(d),$$

where $Y_l^m$ are real SH basis functions evaluated at viewing direction $d$.

  • NH-3DGS: Instead, per-Gaussian appearance is factored as

$$c_i(d) = L_i \cdot f_i(d), \qquad f_i(d) = \sum_{l,m} b_{i,l}^m \, Y_l^m(d),$$

where $L_i \in \mathbb{R}^{+}$ is a scalar luminance shared across color channels and $f_i(d) \in [0,1]^3$ is the unit-normalized chromaticity.
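As a concrete illustration, here is a minimal PyTorch sketch of this factored color evaluation, truncated at SH degree $l = 1$ for brevity; squashing $f_i(d)$ into $[0,1]^3$ with a sigmoid is an assumption, since the paper's exact normalization is not specified here:

```python
import torch

SH_C0 = 0.28209479177387814  # constant for Y_0^0
SH_C1 = 0.4886025119029199   # constant magnitude for Y_1^{-1}, Y_1^0, Y_1^1

def factored_color(lum: torch.Tensor, b_sh: torch.Tensor,
                   dirs: torch.Tensor) -> torch.Tensor:
    """c_i(d) = L_i * f_i(d) for a batch of Gaussians.
    lum:  (M, 1) positive scalar luminances L_i
    b_sh: (M, 4, 3) chromaticity SH coefficients b_{i,l}^m up to l = 1
    dirs: (M, 3) unit viewing directions d
    Returns (M, 3) linear HDR colors."""
    x, y, z = dirs[:, 0:1], dirs[:, 1:2], dirs[:, 2:3]
    f = (SH_C0 * b_sh[:, 0]
         - SH_C1 * y * b_sh[:, 1]
         + SH_C1 * z * b_sh[:, 2]
         - SH_C1 * x * b_sh[:, 3])      # real SH evaluation, degree <= 1
    f = torch.sigmoid(f)                # keep chromaticity in [0,1]^3 (assumed)
    return lum * f                      # scalar luminance carries the dynamic range
```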

This design decouples the representation of overall radiance intensity (luminance) from the directionally varying color (chromaticity), resulting in more stable gradients during optimization and superior handling of scenes with severe variations in dynamic range and lighting (Zhang et al., 17 Nov 2025).

3. Loss Formulation and Optimization Strategy

The wide dynamic range of both the ground-truth (camera) images and the predictions necessitates special treatment for numerical stability and perceptual relevance.

  • $\mu$-Law Compression: The $\mu$-law transformation compresses the wide range of radiance values (from deep shadow to bright highlight) for both target and prediction:

$$\tilde{I} = \frac{\log (1 + \mu \cdot I)}{\log (1+\mu)},$$

  • Composite Loss: The learning objective is a convex combination of the compressed $L_1$ photometric loss and an SSIM structural-similarity loss:

$$\mathcal{L} = \lambda \, \mathcal{L}_1 + (1-\lambda) \, \mathcal{L}_\text{SSIM}, \quad \lambda \approx 0.2.$$

  • RAW Sensor Handling: For RAW Bayer data, predicted RGB is converted to the RGGB pattern (as sketched in Section 1) before loss evaluation, and a pattern-aware SSIM is employed (Zhang et al., 17 Nov 2025).
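A minimal sketch of the compressed objective, interpreting $\mathcal{L}_\text{SSIM}$ as $1 - \text{SSIM}$ (the usual convention) and borrowing the `pytorch_msssim` package for the SSIM term; images are assumed to be (N, 3, H, W) float tensors:

```python
import math
import torch
from pytorch_msssim import ssim  # assumed SSIM implementation; any differentiable SSIM works

def mu_law(img: torch.Tensor, mu: float = 5000.0) -> torch.Tensor:
    """Differentiable mu-law compression of linear HDR radiance into [0, 1]."""
    return torch.log1p(mu * img.clamp(min=0)) / math.log1p(mu)

def hdr_loss(pred: torch.Tensor, gt: torch.Tensor,
             lam: float = 0.2, mu: float = 5000.0) -> torch.Tensor:
    """lam * L1 + (1 - lam) * (1 - SSIM), both on mu-law-compressed images."""
    p, g = mu_law(pred, mu), mu_law(gt, mu)
    l1 = torch.abs(p - g).mean()
    l_ssim = 1.0 - ssim(p, g, data_range=1.0)  # compressed values lie in [0, 1]
    return lam * l1 + (1.0 - lam) * l_ssim
```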

NH-3DGS uses full-batch optimization, progressive ray oversampling, and Adam with learning rates set separately for luminance (0.05) and all other parameters ($10^{-2}$), converging in 20k–50k steps depending on scene complexity.
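A minimal sketch of the corresponding Adam setup with the two learning-rate groups quoted above; the parameter tensors, their shapes, and their initializations are illustrative assumptions:

```python
import torch

# Illustrative per-Gaussian parameter tensors (M Gaussians, SH degree 1).
M = 100_000
means      = torch.zeros(M, 3, requires_grad=True)     # mu_i
log_scales = torch.zeros(M, 3, requires_grad=True)     # anisotropic scales of Sigma_i
rotations  = torch.zeros(M, 3, requires_grad=True)     # Euler angles of Sigma_i
luminance  = torch.ones(M, 1, requires_grad=True)      # L_i
chroma_sh  = torch.zeros(M, 4, 3, requires_grad=True)  # b_{i,l}^m

optimizer = torch.optim.Adam([
    {"params": [luminance], "lr": 0.05},  # dedicated luminance learning rate
    {"params": [means, log_scales, rotations, chroma_sh], "lr": 1e-2},
])
```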

4. Quantitative and Qualitative Performance

NH-3DGS achieves substantial improvements in HDR 3D reconstruction quality over prior methods:

| Task / Dataset | NH-3DGS | HDR-GS / Mono-HDR-GS | HDR-NeRF | Vanilla 3DGS (HDR) |
|---|---|---|---|---|
| Syn-8S PSNR (dB) | Best (+1.5–6.6 margin) | +3.4 | +6.6 | — |
| Syn-8S SSIM | Highest | +0.04 | +0.06 | — |
| RAW-4S PSNR (dB) | 34.98 | 18.94 | — | — |
| Inference speed | 233 fps | 126 / 137 fps | 0.12 fps | — |

For the Syn-8S rows, baseline-column entries give NH-3DGS's margin over that method; RAW-4S entries are absolute PSNR; "—" marks values without a reported figure.

NH-3DGS yields the highest fidelity for both synthetic and real RAW scenes, especially in challenging regions such as deep shadows and bright specular highlights. Qualitative results demonstrate preservation of shadow detail and the absence of color casts in neutrals and highlights, which prior multi-exposure and LDR-based HDR methods do not match (Zhang et al., 17 Nov 2025).

5. Limitations and Future Extensions

NH-3DGS is constrained to static scenes with fixed illumination and requires accurate training-time camera pose initialization. Its current chromaticity formulation, while effective for typical illumination, may not capture highly anisotropic BRDFs or strong diffraction/specular effects—these would require either higher-order SHs or the integration of explicit microfacet or neural BRDF models (Zhang et al., 17 Nov 2025).

Potential research extensions include:

  • Joint pose refinement within the NH-3DGS optimization
  • Hybrid Gaussians + MLP branches for extreme view-dependent reflectance
  • Adaptive sampling/densification to allocate splats in complex regions (e.g., shadow boundaries)
  • Extensions to time-varying (dynamic) scenes and temporally consistent HDR splatting
  • Handling mixed HDR/LDR capture via spatially-varying camera response estimation

6. Implementation Details and Comparative Context

NH-3DGS uses a modular PyTorch/CUDA implementation, extending the standard 3DGS codebase. Datasets are stored in EXR or RAW format, with associated COLMAP camera poses and auxiliary scripts. A single NVIDIA RTX 4090 trains a full scene in 2–6 hours (scene dependent) and renders novel views at 200–250 fps.
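As one concrete way of ingesting such data, here is a sketch of loading an EXR frame as linear radiance via OpenCV; the environment flag is required by recent OpenCV builds, and the file path is a placeholder:

```python
import os
os.environ["OPENCV_IO_ENABLE_OPENEXR"] = "1"  # must be set before importing cv2

import cv2
import torch

def load_linear_hdr(path: str) -> torch.Tensor:
    """Read an EXR file as float32 linear radiance, returned as a (3, H, W)
    RGB tensor. OpenCV loads BGR, so channels are reversed."""
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED)  # float32 BGR, full range
    if img is None:
        raise FileNotFoundError(path)
    rgb = img[:, :, ::-1].copy()                   # BGR -> RGB
    return torch.from_numpy(rgb).permute(2, 0, 1)  # (3, H, W)

# Example (placeholder path):
# frame = load_linear_hdr("scene/frame_0001.exr")
```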

Relative to alternative HDR 3DGS methods (HDR-GS (Wu et al., 2024), HDRSplat (Singh et al., 2024), and CasualHDRSplat (Gong et al., 24 Apr 2025)), NH-3DGS is the first to provide a direct, native linear HDR pipeline with an explicit luminance–chromaticity split, specialized for data from single-exposure HDR cameras rather than LDR or fused multi-exposure sequences (Zhang et al., 17 Nov 2025). Its approach to color representation and gradient stabilization is uniquely effective in full-range HDR conditions and remains competitive on both synthetic benchmarks and real-world scenes.
