Grid-Refined HDR Method
- Grid-refined HDR methods are frameworks that reconstruct HDR 3D radiance fields from multi-exposure LDR images using structured, grid-based representations.
- They employ a coarse-to-fine optimization strategy that integrates 3D Gaussian Splatting with a learnable asymmetric grid for effective tone mapping.
- Empirical evaluations demonstrate substantial improvements in PSNR, SSIM, and real-time rendering performance compared to traditional HDR reconstruction techniques.
Grid-refined HDR methods represent a class of frameworks for reconstructing high dynamic range (HDR) 3D radiance fields from multi-exposure low dynamic range (LDR) images, using structured, grid-based representations to bridge the gap between LDR supervision and HDR scene reproduction. The grid-refined approach is central to High Dynamic Range Gaussian Splatting (HDR-GS), which builds on the 3D Gaussian Splatting (3DGS) paradigm to achieve real-time, high-fidelity scene capture by leveraging a learned, asymmetric grid for tone mapping and a coarse-to-fine optimization schedule. By explicitly modeling the tone-mapping process via a non-uniformly sampled grid and integrating exposure-time scaling, this method achieves robust and accurate mapping from log-irradiance to observed LDR values across varying exposures, overcoming key limitations of both implicit-MLP and traditional grid-based methods (Wu et al., 13 Aug 2024).
1. Gaussian Splatting Foundation and HDR Parameterization
HDR-GS employs the 3D Gaussian Splatting representation as its foundational basis, wherein a scene is modeled as a sparse set of parameterized 3D Gaussians. Each Gaussian is described by:
- Mean $\mu \in \mathbb{R}^3$
- Covariance $\Sigma$, enforced positive semi-definite via $\Sigma = R S S^\top R^\top$ (with rotation $R$ and scale $S$)
- Radiance $c$, directly encoding luminance instead of traditional RGB spherical harmonics
- Opacity $\alpha$ (the projected 2D opacity is derived from the 3D Gaussian's covariance under projection)
Projection to image space is accomplished via camera extrinsics/intrinsics, where the per-pixel HDR irradiance is computed by front-to-back alpha blending over the depth-sorted Gaussians:

$$E = \sum_{i} c_i\,\alpha_i \prod_{j=1}^{i-1} \left(1 - \alpha_j\right)$$
Initialization leverages a Structure-from-Motion (SfM) point cloud, with Gaussian means and covariances seeded as in prior 3DGS work and initial radiance values set to align the rendered log-irradiance exposure with LDR observations under a fixed tone-mapping curve (Wu et al., 13 Aug 2024).
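The per-pixel irradiance accumulation above can be sketched as a simple front-to-back compositing loop (a minimal NumPy illustration, not the paper's CUDA rasterizer):

```python
import numpy as np

def composite_irradiance(radiances, opacities):
    """Front-to-back alpha compositing of per-pixel Gaussian contributions.

    radiances: (N,) luminance values c_i of the Gaussians covering this pixel,
               sorted front to back.
    opacities: (N,) projected 2D opacities alpha_i in [0, 1].
    Returns E = sum_i c_i * alpha_i * prod_{j<i} (1 - alpha_j).
    """
    transmittance = 1.0
    irradiance = 0.0
    for c, a in zip(radiances, opacities):
        irradiance += c * a * transmittance
        transmittance *= (1.0 - a)
    return irradiance

# A single fully opaque Gaussian contributes its radiance directly:
composite_irradiance(np.array([2.0]), np.array([1.0]))  # -> 2.0
```

Because $c$ here carries luminance rather than spherical-harmonic RGB, the accumulated value is interpreted as HDR irradiance rather than a display-ready color.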
2. Asymmetric Grid Construction for Tone Mapping
To relate physical imaging to the final LDR output, HDR-GS formulates pixel values as
$$I = f(E \,\Delta t),$$
where $f$ is the (invertible, monotonic) camera response function and $\Delta t$ is the exposure time. Working in the log domain, this can be rewritten as
$$I = g(\ln E + \ln \Delta t), \qquad g(x) := f(e^x).$$
A learnable 1D lookup table ("asymmetric grid") parameterizes $g$ over a bounded domain, using dense sampling (128 nodes/unit) near typical values of $\ln E + \ln \Delta t$ and coarser sampling (64 nodes/unit) in the tails to reduce overfitting. For differentiable handling of out-of-domain queries, a "leaky" grid extension is employed: queries beyond the domain boundaries are extrapolated linearly with small fixed slopes, so they still receive gradient signal.
To account for disparate exposure times and the resulting "holes" in $\ln \Delta t$-space, the exposure offsets are remapped so that all exposures occupy a contiguous, grid-supported region during optimization.
The final rendering equation becomes
$$\hat{I} = g(\hat{L} + \ln \Delta t),$$
where $\hat{L}$ is the learned per-pixel log-irradiance. Post-training, true HDR irradiance is recovered as
$$E = \exp(\hat{L}).$$
3. Coarse-to-Fine Training Strategy
Directly optimizing Gaussian parameters and discrete grid entries jointly from the start can become trapped in poor local minima. To address this, HDR-GS uses a two-phase optimization schedule:
Phase 1 (Coarse):
- Tone-mapping grid is fixed to a sigmoid $g_0(x) = 1/(1 + e^{-x})$.
- Only Gaussian parameters are optimized.
- Typically run for 6,000 to 14,000 iterations (50 s on A100 GPU).
Phase 2 (Fine):
- Switch to learnable asymmetric grid .
- Joint optimization of grid values and Gaussians proceeds for 17,000 to 30,000 iterations (4–8 min).
This staged approach expedites convergence and increases robustness to viewpoint sparsity and exposure-range extremes, while mitigating suboptimal local convergence (Wu et al., 13 Aug 2024).
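The two-phase schedule amounts to a simple switch on the iteration counter; the sketch below uses illustrative iteration counts within the reported ranges:

```python
def training_schedule(total_iters=30000, coarse_end=6000):
    """Yield (iteration, phase) pairs for the coarse-to-fine schedule.

    'coarse': tone mapping fixed to the sigmoid g0, only Gaussian
    parameters are optimised.
    'fine':   the asymmetric grid becomes learnable and is optimised
    jointly with the Gaussians.
    Iteration counts are illustrative (paper reports 6k-14k coarse
    iterations within 17k-30k total).
    """
    for it in range(1, total_iters + 1):
        yield it, ("coarse" if it <= coarse_end else "fine")

# First 4 of 10 iterations run coarse, the rest fine:
phases = dict(training_schedule(total_iters=10, coarse_end=4))
```

The key design choice is that the fine phase inherits Gaussians already consistent with a plausible response curve, so the grid only needs to deviate from the sigmoid where the data demands it.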
4. Loss Formulation and Optimization
The total loss combines three main terms:
- Reconstruction loss: an L1 photometric term blended with a D-SSIM term, as in standard 3DGS,
$$\mathcal{L}_{\mathrm{rec}} = (1 - \lambda)\,\mathcal{L}_1 + \lambda\,\mathcal{L}_{\mathrm{D\text{-}SSIM}},$$
ensuring photometric and structural accuracy.
- Smoothness loss (for the tone-mapping CRF grid): a penalty on the second differences of adjacent grid entries,
$$\mathcal{L}_{\mathrm{smooth}} = \sum_i \left(g_{i+1} - 2 g_i + g_{i-1}\right)^2,$$
to enforce smoothness of the learned response curve.
- Unit exposure loss: a constraint pinning the curve's output at unit exposure time ($\ln \Delta t = 0$) to a fixed value, which anchors the overall tone-mapping scale to a consistent physical interpretation.
The aggregate objective is the weighted sum
$$\mathcal{L} = \mathcal{L}_{\mathrm{rec}} + \lambda_s \mathcal{L}_{\mathrm{smooth}} + \lambda_u \mathcal{L}_{\mathrm{unit}},$$
with recommended weight settings that differ between synthetic and real scenes. The Adam optimizer is used, with the grid learning rate exponentially decayed over training.
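A minimal sketch of how the three terms combine. The weights and the unit-exposure target are hypothetical, and the reconstruction term is reduced to plain L1 for brevity (the paper additionally uses D-SSIM):

```python
import numpy as np

def hdr_gs_loss(pred, target, grid_values,
                lam_smooth=1e-3, lam_unit=1e-1, unit_target=0.5):
    """Sketch of the aggregate objective (weights hypothetical).

    - reconstruction: plain L1 between rendered and observed LDR images
    - smoothness: squared second differences of the tone-mapping grid
    - unit exposure: anchors the grid entry nearest x = 0 to a fixed value
    """
    l_rec = np.abs(pred - target).mean()
    second_diff = grid_values[2:] - 2 * grid_values[1:-1] + grid_values[:-2]
    l_smooth = (second_diff ** 2).sum()
    mid = grid_values[len(grid_values) // 2]   # node nearest x = 0 (assumed)
    l_unit = (mid - unit_target) ** 2
    return l_rec + lam_smooth * l_smooth + lam_unit * l_unit
```

A perfectly reconstructed image with a flat, correctly anchored grid incurs zero loss, which is a quick sanity check on the implementation.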
5. Empirical Evaluation and Performance Metrics
HDR-GS and its grid-refined HDR method are evaluated on both synthetic and real-world datasets. Key metrics include:
- LDR (novel view): PSNR (dB), SSIM, LPIPS, reported separately for exposures seen during training (LDR-OE) and for held-out novel exposures (LDR-NE)
- HDR perceptual: HDR-VDP Q-score (0–10), PU-PSNR, PU-SSIM (using PU21 encoder)
- Performance: Training time (coarse: 50 s + fine: 4–8 min on an A100, <5 GB GPU RAM), rendering speed (210–390 FPS)
A comparative summary for LDR-OE metrics across 8 synthetic scenes is:
| Method | PSNR (dB) | SSIM | LPIPS | Time | FPS |
|---|---|---|---|---|---|
| HDR-NeRF | 36.72 | 0.951 | 0.044 | 9.4 h | <1 |
| 3DGS | 11.24 | 0.395 | 0.534 | 12.7 min | 352 |
| HDR-GS (Ours) | 39.16 | 0.974 | 0.012 | 6.8 min | 248.8 |
For HDR metrics (LDR-OE), HDR-GS attains HDR-VDP 9.70, PU-PSNR 23.67 dB, and PU-SSIM 0.835, outperforming HDR-NeRF (VDP 6.56, PU-PSNR 20.33 dB, PU-SSIM 0.527) (Wu et al., 13 Aug 2024). On real data, HDR-GS yields PSNR 33.34 dB, SSIM 0.967, LPIPS 0.023 in 8.5 min versus HDR-NeRF’s 31.88/0.950/0.067 in 8.2 h.
6. Implementation Considerations and Limitations
HDR-GS applies a pruning strategy (as in Niemeyer et al., 2024) to remove Gaussians with low ray-contribution scores (below 0.02, checked every 200 iterations), which aids memory efficiency and speed. The method is computationally efficient, training fully on a single A100 GPU (<5 GB, <10 min) and supporting real-time rendering (>200 FPS).
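The contribution-based pruning step reduces to a boolean mask over per-Gaussian scores (a NumPy illustration; the 0.02 threshold and 200-iteration cadence follow the figures above, while the array layout is an assumption):

```python
import numpy as np

def prune_gaussians(params, contribution_scores, threshold=0.02):
    """Drop Gaussians whose accumulated ray-contribution score falls
    below the threshold (applied every 200 iterations).

    params:              (N, D) per-Gaussian parameters (layout assumed).
    contribution_scores: (N,) scores accumulated since the last pruning.
    Returns the surviving parameters and the boolean keep-mask.
    """
    keep = contribution_scores >= threshold
    return params[keep], keep

params = np.arange(12.0).reshape(4, 3)
scores = np.array([0.5, 0.01, 0.3, 0.0])
pruned, keep = prune_gaussians(params, scores)  # keeps rows 0 and 2
```

Scores would be reset after each pruning pass so that the mask reflects only recent contributions.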
Key limitations include omission of aperture size and ISO gain in the imaging model, which could impact fidelity in certain scenarios. The 3DGS basis also struggles with accurately modeling transparent surfaces.
This suggests that while grid-refined HDR methods successfully address many challenges of HDR scene reconstruction from multi-exposure LDR data, further refinement may be needed for scenarios involving complex light transport or physically-based camera models (Wu et al., 13 Aug 2024).