Grid-Refined HDR Method
- Grid-refined HDR methods are frameworks that reconstruct HDR 3D radiance fields from multi-exposure LDR images using structured, grid-based representations.
- They employ a coarse-to-fine optimization strategy that integrates 3D Gaussian Splatting with a learnable asymmetric grid for effective tone mapping.
- Empirical evaluations demonstrate substantial improvements in PSNR, SSIM, and real-time rendering performance compared to traditional HDR reconstruction techniques.
Grid-refined HDR methods represent a class of frameworks for reconstructing high dynamic range (HDR) 3D radiance fields from multi-exposure low dynamic range (LDR) images, using structured, grid-based representations to bridge the gap between LDR supervision and HDR scene reproduction. The grid-refined approach is central to High Dynamic Range Gaussian Splatting (HDR-GS), which builds on the 3D Gaussian Splatting (3DGS) paradigm to achieve real-time, high-fidelity scene capture by leveraging a learned, asymmetric grid for tone mapping and a coarse-to-fine optimization schedule. By explicitly modeling the tone-mapping process via a non-uniformly sampled grid and integrating exposure-time scaling, this method achieves robust and accurate mapping from log-irradiance to observed LDR values across varying exposures, overcoming key limitations of both implicit-MLP and traditional grid-based methods (Wu et al., 13 Aug 2024).
1. Gaussian Splatting Foundation and HDR Parameterization
HDR-GS employs the 3D Gaussian Splatting representation as its foundational basis, wherein a scene is modeled as a sparse set of parameterized 3D Gaussians. Each Gaussian is described by:
- Mean $\mu \in \mathbb{R}^3$
- Covariance $\Sigma$, enforced positive semi-definite via $\Sigma = R S S^\top R^\top$ (with rotation $R$ and scale $S$)
- Radiance $c$, directly encoding luminance instead of traditional RGB spherical harmonics
- Opacity $\alpha$ (the projected 2D opacity is derived from the 3D Gaussian's covariance under projection)
Projection to image space is accomplished via camera extrinsics/intrinsics, where the per-pixel HDR irradiance is computed by front-to-back alpha blending over the depth-sorted Gaussians:

$$E = \sum_{i} c_i\,\alpha_i \prod_{j=1}^{i-1} \left(1 - \alpha_j\right)$$
Initialization leverages a Structure-from-Motion (SfM) point cloud, with Gaussian means and covariances seeded as in prior 3DGS work and initial radiance values set to align the rendered log-irradiance exposure with LDR observations under a fixed tone-mapping curve (Wu et al., 13 Aug 2024).
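The per-pixel irradiance accumulation above can be sketched as a simple front-to-back compositing loop (a minimal NumPy illustration, not the paper's CUDA rasterizer):

```python
import numpy as np

def composite_irradiance(radiances, opacities):
    """Front-to-back alpha compositing of per-pixel Gaussian contributions.

    radiances: (N,) luminance values c_i of the Gaussians covering this pixel,
               sorted front to back.
    opacities: (N,) projected 2D opacities alpha_i in [0, 1].
    Returns E = sum_i c_i * alpha_i * prod_{j<i} (1 - alpha_j).
    """
    transmittance = 1.0
    irradiance = 0.0
    for c, a in zip(radiances, opacities):
        irradiance += c * a * transmittance
        transmittance *= (1.0 - a)
    return irradiance

# A single fully opaque Gaussian contributes its radiance directly:
composite_irradiance(np.array([2.0]), np.array([1.0]))  # -> 2.0
```

Because $c$ here carries luminance rather than spherical-harmonic RGB, the accumulated value is interpreted as HDR irradiance rather than a display-ready color.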
2. Asymmetric Grid Construction for Tone Mapping
To relate physical imaging to the final LDR output, HDR-GS formulates pixel values as
$$I = f(E \,\Delta t),$$
where $f$ is the (invertible, monotonic) camera response function and $\Delta t$ is the exposure time. Working in the log domain, this can be rewritten as
$$I = g(\ln E + \ln \Delta t), \qquad g(x) := f(e^x).$$
A learnable 1D lookup table ("asymmetric grid") parameterizes $g$ over a bounded domain, using dense sampling (128 nodes/unit) near typical values of $\ln E + \ln \Delta t$ and coarser sampling (64 nodes/unit) in the tails to reduce overfitting. For differentiable handling of out-of-domain queries, a "leaky" grid extension is employed: queries beyond the domain boundaries are extrapolated linearly with small fixed slopes, so they still receive gradient signal.
To account for disparate exposure times and the resulting "holes" in $\ln \Delta t$-space, the exposure offsets are remapped so that all exposures occupy a contiguous, grid-supported region during optimization.
The final rendering equation becomes
$$\hat{I} = g(\hat{L} + \ln \Delta t),$$
where $\hat{L}$ is the learned per-pixel log-irradiance. Post-training, true HDR irradiance is recovered as
$$E = \exp(\hat{L}).$$
3. Coarse-to-Fine Training Strategy
Directly optimizing Gaussian parameters and discrete grid entries jointly from the start can become trapped in poor local minima. To address this, HDR-GS uses a two-phase optimization schedule:
Phase 1 (Coarse):
- Tone-mapping grid is fixed to a sigmoid $g_0(x) = 1/(1 + e^{-x})$.
- Only Gaussian parameters are optimized.
- Typically run for 6,000 to 14,000 iterations (50 s on A100 GPU).
Phase 2 (Fine):
- Switch to learnable asymmetric grid .
- Joint optimization of grid values and Gaussians proceeds for 17,000 to 30,000 iterations (4–8 min).
This staged approach expedites convergence and increases robustness to viewpoint sparsity and exposure-range extremes, while mitigating suboptimal local convergence (Wu et al., 13 Aug 2024).
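The two-phase schedule amounts to a simple switch on the iteration counter; the sketch below uses illustrative iteration counts within the reported ranges:

```python
def training_schedule(total_iters=30000, coarse_end=6000):
    """Yield (iteration, phase) pairs for the coarse-to-fine schedule.

    'coarse': tone mapping fixed to the sigmoid g0, only Gaussian
    parameters are optimised.
    'fine':   the asymmetric grid becomes learnable and is optimised
    jointly with the Gaussians.
    Iteration counts are illustrative (paper reports 6k-14k coarse
    iterations within 17k-30k total).
    """
    for it in range(1, total_iters + 1):
        yield it, ("coarse" if it <= coarse_end else "fine")

# First 4 of 10 iterations run coarse, the rest fine:
phases = dict(training_schedule(total_iters=10, coarse_end=4))
```

The key design choice is that the fine phase inherits Gaussians already consistent with a plausible response curve, so the grid only needs to deviate from the sigmoid where the data demands it.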
4. Loss Formulation and Optimization
The total loss combines three main terms:
- Reconstruction loss: an L1 photometric term blended with a D-SSIM term, as in standard 3DGS,
$$\mathcal{L}_{\mathrm{rec}} = (1 - \lambda)\,\mathcal{L}_1 + \lambda\,\mathcal{L}_{\mathrm{D\text{-}SSIM}},$$
ensuring photometric and structural accuracy.
- Smoothness loss (for the tone-mapping CRF grid): a penalty on the second differences of adjacent grid entries,
$$\mathcal{L}_{\mathrm{smooth}} = \sum_i \left(g_{i+1} - 2 g_i + g_{i-1}\right)^2,$$
to enforce smoothness of the learned response curve.
- Unit exposure loss: a constraint pinning the curve's output at unit exposure time ($\ln \Delta t = 0$) to a fixed value, which anchors the overall tone-mapping scale to a consistent physical interpretation.
The aggregate objective is the weighted sum
$$\mathcal{L} = \mathcal{L}_{\mathrm{rec}} + \lambda_s \mathcal{L}_{\mathrm{smooth}} + \lambda_u \mathcal{L}_{\mathrm{unit}},$$
with recommended weight settings that differ between synthetic and real scenes. The Adam optimizer is used, with the grid learning rate exponentially decayed over training.
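A minimal sketch of how the three terms combine. The weights and the unit-exposure target are hypothetical, and the reconstruction term is reduced to plain L1 for brevity (the paper additionally uses D-SSIM):

```python
import numpy as np

def hdr_gs_loss(pred, target, grid_values,
                lam_smooth=1e-3, lam_unit=1e-1, unit_target=0.5):
    """Sketch of the aggregate objective (weights hypothetical).

    - reconstruction: plain L1 between rendered and observed LDR images
    - smoothness: squared second differences of the tone-mapping grid
    - unit exposure: anchors the grid entry nearest x = 0 to a fixed value
    """
    l_rec = np.abs(pred - target).mean()
    second_diff = grid_values[2:] - 2 * grid_values[1:-1] + grid_values[:-2]
    l_smooth = (second_diff ** 2).sum()
    mid = grid_values[len(grid_values) // 2]   # node nearest x = 0 (assumed)
    l_unit = (mid - unit_target) ** 2
    return l_rec + lam_smooth * l_smooth + lam_unit * l_unit
```

A perfectly reconstructed image with a flat, correctly anchored grid incurs zero loss, which is a quick sanity check on the implementation.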
5. Empirical Evaluation and Performance Metrics
HDR-GS and its grid-refined HDR method are evaluated on both synthetic and real-world datasets. Key metrics include:
- LDR (novel view): PSNR (dB), SSIM, LPIPS, reported separately for exposures seen during training (LDR-OE) and for held-out novel exposures (LDR-NE)
- HDR perceptual: HDR-VDP Q-score (0–10), PU-PSNR, PU-SSIM (using PU21 encoder)
- Performance: Training time (coarse: 50 s + fine: 4–8 min on an A100, <5 GB GPU RAM), rendering speed (210–390 FPS)
A comparative summary for LDR-OE metrics across 8 synthetic scenes is:
| Method | PSNR (dB) | SSIM | LPIPS | Time | FPS |
|---|---|---|---|---|---|
| HDR-NeRF | 36.72 | 0.951 | 0.044 | 9.4 h | <1 |
| 3DGS | 11.24 | 0.395 | 0.534 | 12.7 min | 352 |
| HDR-GS (Ours) | 39.16 | 0.974 | 0.012 | 6.8 min | 248.8 |
For HDR metrics (LDR-OE), HDR-GS attains HDR-VDP 9.70, PU-PSNR 23.67 dB, and PU-SSIM 0.835, outperforming HDR-NeRF (VDP 6.56, PU-PSNR 20.33 dB, PU-SSIM 0.527) (Wu et al., 13 Aug 2024). On real data, HDR-GS yields PSNR 33.34 dB, SSIM 0.967, LPIPS 0.023 in 8.5 min versus HDR-NeRF’s 31.88/0.950/0.067 in 8.2 h.
6. Implementation Considerations and Limitations
HDR-GS applies a pruning strategy (as in Niemeyer et al., 2024) to remove Gaussians with low ray-contribution scores (below 0.02, checked every 200 iterations), which aids memory efficiency and speed. The method is computationally efficient, training fully on a single A100 GPU (<5 GB, <10 min) and supporting real-time rendering (>200 FPS).
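The contribution-based pruning step reduces to a boolean mask over per-Gaussian scores (a NumPy illustration; the 0.02 threshold and 200-iteration cadence follow the figures above, while the array layout is an assumption):

```python
import numpy as np

def prune_gaussians(params, contribution_scores, threshold=0.02):
    """Drop Gaussians whose accumulated ray-contribution score falls
    below the threshold (applied every 200 iterations).

    params:              (N, D) per-Gaussian parameters (layout assumed).
    contribution_scores: (N,) scores accumulated since the last pruning.
    Returns the surviving parameters and the boolean keep-mask.
    """
    keep = contribution_scores >= threshold
    return params[keep], keep

params = np.arange(12.0).reshape(4, 3)
scores = np.array([0.5, 0.01, 0.3, 0.0])
pruned, keep = prune_gaussians(params, scores)  # keeps rows 0 and 2
```

Scores would be reset after each pruning pass so that the mask reflects only recent contributions.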
Key limitations include omission of aperture size and ISO gain in the imaging model, which could impact fidelity in certain scenarios. The 3DGS basis also struggles with accurately modeling transparent surfaces.
This suggests that while grid-refined HDR methods successfully address many challenges of HDR scene reconstruction from multi-exposure LDR data, further refinement may be needed for scenarios involving complex light transport or physically-based camera models (Wu et al., 13 Aug 2024).