Holographic Radiance Cascades for 2D Global Illumination (2505.02041v1)

Published 4 May 2025 in cs.GR

Abstract: Efficiently calculating global illumination has always been one of the greatest challenges in computer graphics. Algorithms for approximating global illumination have always struggled to run in realtime for fully dynamic scenes, and have had to rely heavily on stochastic raytracing, spatialtemporal denoising, or undersampled representations, resulting in much lower quality of lighting compared to reference solutions. Even though the problem of calculating global illumination in 2D is significantly simpler than that of 3D, most contemporary approaches still struggle to accurately approximate 2D global illumination under realtime constraints. We present Holographic Radiance Cascades: a new single-shot scene-agnostic radiance transfer algorithm for global illumination, which is capable of achieving results visually indistinguishable from the 2D reference solution at realtime framerates. Our method uses a multi-level radiance probe system, and computes rays via combining short ray intervals as a replacement for conventional raytracing. It runs at constant cost for a given scene size, taking 1.85ms for a 512x512 pixel image and 7.67ms for 1024x1024 on an RTX 3080 Laptop.

Summary

The paper introduces Holographic Radiance Cascades (HRC) as a real-time, single-shot method for computing 2D global illumination without needing stochastic ray tracing.
It refines the existing Radiance Cascades framework by modifying probe structures and using a specialized acceleration structure to maintain high spatial resolution along incoming light directions.
Experimental results show lower RMSE and faster computation times compared to path tracing, despite limitations with small light sources and challenges in scaling to 3D.

The paper "Holographic Radiance Cascades for 2D Global Illumination" (2505.02041) introduces a novel, real-time, single-shot algorithm for calculating 2D global illumination (GI) in dynamic scenes. The method, called Holographic Radiance Cascades (HRC), builds upon the Radiance Cascades (RC) framework but modifies the core probe structure and introduces a specialized acceleration structure to improve performance and quality, particularly for complex geometry and volumetric effects, while eliminating the need for stochastic ray tracing or denoising.

Core Concepts

The paper defines the problem as calculating the fluence $F(p)$, the radiance integrated over all incoming directions at a point $p$. Unlike irradiance, fluence involves no surface cosine term, which makes it suitable for computing light within participating media. The authors model surfaces as very dense volumes of such media.
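Written out, and assuming $L(p, \theta)$ denotes the radiance arriving at $p$ from the direction at angle $\theta$ (a symbol introduced here for illustration, not taken from the paper), the 2D fluence described above would read:

$F(p) = \int_0^{2\pi} L(p, \theta) \, d\theta$

The integrand carries no cosine factor, which is the difference from irradiance noted above.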

Key to the method is the concept of a radiance interval $L_T(p \leftarrow q)$ and its associated transmittance $T_T(p \leftarrow q)$, representing the light contributed by the segment from $q$ to $p$ and the fraction of light transmitted over that segment, respectively. These satisfy a merging property analogous to premultiplied alpha blending:

$L_T(p \leftarrow r) = L_T(p \leftarrow q) + T_T(p \leftarrow q) \cdot L_T(q \leftarrow r)$

$T_T(p \leftarrow r) = T_T(p \leftarrow q) \cdot T_T(q \leftarrow r)$

where $p, q, r$ are collinear with $q$ between $p$ and $r$. This allows computing the radiance and transmittance of a long segment by combining shorter ones. The paper defines Trace(p, q) as the pair $(L_T(p \leftarrow q), T_T(p \leftarrow q))$.
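The merging property above can be sketched in a few lines of Python. This is a minimal single-channel illustration, not the authors' implementation; the function name `merge`, the `Interval` alias, and the numeric values are hypothetical.

```python
from typing import Tuple

# (radiance, transmittance) for one colour channel; hypothetical alias.
Interval = Tuple[float, float]


def merge(near: Interval, far: Interval) -> Interval:
    """Combine Trace(p, q) (near) with Trace(q, r) (far) into Trace(p, r)."""
    l_near, t_near = near
    l_far, t_far = far
    # Light from the far segment is attenuated by the near segment.
    radiance = l_near + t_near * l_far
    transmittance = t_near * t_far
    return (radiance, transmittance)


# Made-up example values for three consecutive segments.
a = (0.2, 0.5)
b = (1.0, 0.25)
c = (0.3, 0.8)

merged = merge(a, b)
left_first = merge(merge(a, b), c)
right_first = merge(a, merge(b, c))
```

Because the operation is associative, `left_first` and `right_first` agree up to rounding, so short intervals can be combined in any grouping as long as their order along the ray is preserved.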

Holographic Radiance Cascades (HRC)

The HRC method addresses limitations in standard Radiance Cascades [Sannikov 2023; Osborne and Sannikov 2024] related to resolving penumbras from distant lights. Standard RC uses a multi-resolution grid of probes, where each cascade level has half the spatial resolution and quadruples the angular resolution of the previous one. While efficient for diffuse GI, this can fail to resolve small penumbras if the probe spacing at the relevant cascade level is larger than the penumbra width.

HRC modifies the probe structure to maintain high spatial resolution perpendicular to the direction the probe is sampling from. For probes gathering light from the +x quadrant, for example, the $n$-th cascade level places probes at positions $p = (x \cdot 2^n, y)$ for integers $x, y$. The spatial resolution decreases only in the x-direction across cascades, while staying constant in the y-direction for probes in this quadrant. The angular resolution still increases by a factor of 2 per level. This effectively creates a "holographic" grid that aligns its high-resolution axis with the direction of incoming light being sampled.
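A small sketch can make the trade-off in this layout concrete. It rests on two assumptions that are not spelled out in this summary: level 0 holds one direction per quadrant (inferred from the statement that the summed $R_0$ output covers 4 directions), and the probe count along x is `width // 2**level`, ignoring any boundary probes. The function name `cascade_layout` is hypothetical.

```python
def cascade_layout(width: int, height: int, level: int) -> dict:
    """Approximate layout of one +x-quadrant cascade level (sketch only)."""
    step_x = 2 ** level            # probes sit at x = 0, 2^n, 2 * 2^n, ...
    probes_x = width // step_x     # assumption: boundary probes ignored
    probes_y = height              # full resolution perpendicular to +x
    directions = 2 ** level        # assumption: 1 direction at level 0
    return {
        "level": level,
        "probes_x": probes_x,
        "probes_y": probes_y,
        "directions": directions,
        "entries": probes_x * probes_y * directions,
    }


# Hypothetical 8 x 4 grid: levels 0..3.
layouts = [cascade_layout(8, 4, n) for n in range(4)]
entry_counts = [layout["entries"] for layout in layouts]
```

Under these assumptions every level stores the same number of entries: halving the probe count along x is offset by doubling the number of directions.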

The angular fluence $R_n(p, i)$ for the $n$-th cascade at probe $p$ covering angular direction $i$ is computed recursively from the next higher cascade $R_{n+1}$. This involves splitting the cone into two halves and approximating each half using values from $R_{n+1}$ and the transmittance of the segments. Special handling (interpolation using Eq. 15) is required when the probe's x-coordinate is even to avoid artifacts caused by the differing calculation method for odd x-coordinates (which trace to probes in the next cascade level using Eq. 14). The total fluence at a point is the sum of contributions from four such quadrant-specific cascade systems (for +x, -x, +y, -y).

Acceleration Structure

To efficiently compute the Trace operations required for the recursive definition of $R_n$, HRC introduces a specialized acceleration structure $T_n(p, k)$. This structure stores approximations of Trace(p, p + U_n(k)), where $U_n(k)$ defines offset vectors used for ray segments. $T_{n+1}$ values are computed recursively from $T_n$ using merging operations (Eq. 18 for even $k$) and blending/interpolation operations (Eq. 20 using Eq. 19 for odd $k$), similar to how $R_n$ is computed.

Crucially, this acceleration structure does not rely on empty space skipping. This property is key to HRC's ability to handle detailed, continuous volumetric media efficiently, as performance remains constant regardless of scene content complexity.

Implementation

The core implementation (Algorithm 1) involves two main phases for each of the four quadrants:

Merge Up (Compute T Cascades):
- Initialize the lowest cascade levels ($T_0, T_1, T_2$) by performing traditional ray tracing (Trace) over short segments using an algorithm like DDA. The paper mentions integrating radiance and transmittance analytically within each pixel, treating the pixel as a uniform volume.
- Compute higher cascades of $T$ ($T_3$ up to $T_N$, where $N = \lfloor \log_2(X) \rfloor$) recursively from the lower cascades using the merge and blend rules (Eq. 18, 20).
Merge Down (Compute R Cascades):
- Compute the $R_n$ cascades recursively downwards from $n = N-1$ to $n = 0$, using $R_{n+1}$ and the precomputed $T_n$ and $T_{n+1}$ values for Trace operations within the merging rules (Eq. 14, 15). $R_N$ is assumed to be zero because all lights lie within the grid bounds.
- $R_0$ provides the angular fluence for the base grid.
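The order of the two phases for a single quadrant can be sketched as a schedule. This only lists which cascade is produced at each step and how; it does not implement the paper's merge and blend rules (Eq. 14, 15, 18, 20). The function name `hrc_schedule` and the step labels are hypothetical.

```python
def hrc_schedule(width: int) -> list:
    """List the per-quadrant cascade steps in execution order (sketch only)."""
    # floor(log2(width)) for any width >= 1, without floating point.
    n_top = width.bit_length() - 1
    steps = []
    # Merge Up: T_0..T_2 come from direct tracing, the rest from merging.
    for n in range(0, n_top + 1):
        kind = "trace" if n <= 2 else "merge_up"
        steps.append((kind, "T", n))
    # Merge Down: R_N is taken to be zero, so start at N - 1 and go to 0.
    for n in range(n_top - 1, -1, -1):
        steps.append(("merge_down", "R", n))
    return steps


# Hypothetical 16-pixel-wide grid, so N = 4.
schedule = hrc_schedule(16)
```

For a width of 16 this yields five steps for $T_0$ to $T_4$ followed by four steps for $R_3$ down to $R_0$; the whole schedule would be repeated for each of the four quadrants.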

After computing $R_0$ for all four quadrants, they are summed to get the total fluence at each base grid point. A final 1-pixel cross blur (Eq. 21) is applied to mitigate checkerboard artifacts that arise because probes with odd and even y-coordinates (in the +x quadrant example) do not directly interact during the cascade merging process.
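As an illustration of what a 1-pixel cross blur does to a checkerboard pattern, here is a minimal pure-Python sketch. The weights (one half on the centre, one eighth on each of the four axis neighbours) and the clamped border handling are assumptions made for this example; the paper's Eq. 21 defines the actual kernel. The function name `cross_blur` is hypothetical.

```python
def cross_blur(img, center_weight=0.5):
    """Blend each pixel with its four axis-aligned neighbours (sketch only).

    The weights and the clamped borders are assumptions, not the paper's
    Eq. 21.
    """
    h, w = len(img), len(img[0])
    side = (1.0 - center_weight) / 4.0
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            up = img[max(y - 1, 0)][x]
            down = img[min(y + 1, h - 1)][x]
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, w - 1)]
            out[y][x] = center_weight * img[y][x] + side * (up + down + left + right)
    return out


# A 4 x 4 checkerboard of 0s and 1s, the worst case for this artifact.
checker = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
blurred = cross_blur(checker)
```

With these assumed weights the interior pixels of the checkerboard all land on the same value, while a constant image is left unchanged.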

For multi-bounce GI, the output fluence can be temporally accumulated and fed back into subsequent iterations of the HRC algorithm. However, the standard output $R_0$ only provides low-angular resolution information (4 directions), making it unsuitable for accurate specular reflections. The paper suggests looking up higher cascade $R_n$ values for specular calculations but doesn't detail this.

Practical Implementation Considerations:

Grid Size: The algorithm is defined for a grid of size X x Y, with dimensions typically chosen as powers of two for simplicity (e.g., 1024x1024).
Data Storage: Radiance and transmittance values are stored for each probe in each cascade. Using 16-bit floats is recommended for efficiency. The memory footprint is dominated by the $T$ cascades.
Memory Optimization: Cascades can be processed in layers, requiring only two layers of $R$ and all layers of $T$ to be stored simultaneously. Memory for the four quadrants can be reused if the grid is square. Using alternative acceleration structures like BVH for the base Trace calls could reduce memory, but would lose the constant-time performance benefit for volumes.
Parallelism: The algorithm is highly parallel, suitable for GPU implementation. The authors use LuisaCompute (an abstraction over CUDA). Each probe's value can be computed independently within a cascade level.
Scene Representation: The scene requires a representation that allows querying attenuation ($\sigma_t$) and emission ($L_s$) at any point for the initial Trace calls.
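To show how $\sigma_t$ and $L_s$ could feed a base Trace call, the following sketch integrates one ray segment through a pixel treated as a uniform volume. It assumes a standard emission-absorption model in which emission is scaled by $\sigma_t$, so that a fully opaque pixel shows radiance $L_s$; this summary does not confirm that the paper uses exactly this convention. The function name `trace_uniform_segment` is hypothetical.

```python
import math


def trace_uniform_segment(sigma_t: float, l_s: float, length: float):
    """Radiance and transmittance of a segment through a uniform pixel.

    Assumes an emission-absorption medium where emission is sigma_t * l_s
    (an assumption for this sketch, not confirmed by the source).
    """
    transmittance = math.exp(-sigma_t * length)
    radiance = l_s * (1.0 - transmittance)
    return radiance, transmittance


# Made-up values: a moderately dense emissive pixel crossed over 1 unit.
radiance, transmittance = trace_uniform_segment(sigma_t=2.0, l_s=1.5, length=1.0)

# Empty space: nothing emitted, everything transmitted.
empty = trace_uniform_segment(sigma_t=0.0, l_s=1.5, length=1.0)
```

Under this model, splitting a segment in two and combining the halves with the merging rule from the Core Concepts section reproduces the result for the whole segment.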

Performance

The paper analyzes the theoretical performance for an X x X grid where $X = 2^N$.

Time: O(X² log X). This is derived from computing N cascades of R and N+1 cascades of T, where each cascade computation takes O(X²) time. The initial base Trace calls for $T_0, T_1, T_2$ also contribute but are efficient for short rays. The crucial practical finding is that the runtime is constant for a given grid size, independent of scene complexity (geometry/volumes).
Memory: O(N * X²). Storing all layers of T cascades requires O(N * X²) memory, while R cascades require O(X²) (only two layers needed). This is the main limitation for scaling to 3D.
Ray Count: HRC uses a fixed number of effective "ray intervals" per base probe (52 without the acceleration structure, 19 with the structure as implemented). This replaces variable ray counts in stochastic methods.
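As a rough sanity check, the two timings quoted in the abstract (1.85 ms at 512x512 and 7.67 ms at 1024x1024) can be compared against the O(X² log X) bound. The sketch below ignores constant factors and fixed overheads, so it indicates a trend only; the function name `predicted_ratio` is hypothetical.

```python
import math


def predicted_ratio(x_small: int, x_large: int) -> float:
    """Ratio of X^2 * log2(X) between two grid sizes (constants ignored)."""
    def cost(x: int) -> float:
        return x * x * math.log2(x)
    return cost(x_large) / cost(x_small)


measured = 7.67 / 1.85                  # timings quoted in the abstract
predicted = predicted_ratio(512, 1024)  # 4 * 10 / 9
quadratic_only = (1024 / 512) ** 2      # pure O(X^2) scaling
```

The measured ratio is roughly 4.15, which sits between the 4x expected from pure quadratic scaling and the roughly 4.44x given by the X² log X term.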

Results & Limitations

Experiments on an RTX 3080 Laptop GPU show typical timings of 1.85 ms for 512x512 and 7.67 ms for 1024x1024, achieving significantly lower RMSE than path tracing with an equivalent number of samples (counted as ray intervals per pixel).

HRC effectively handles diffuse GI, complex occluders (improving penumbra quality over standard RC), light passing through small openings, and dense volumetric media (where path tracing slows down significantly). It is single-shot and noiseless, avoiding the need for denoising or temporal accumulation for a single bounce.

Primary limitations identified:

Artifacts with Small Lights: Checkerboard patterns (partially mitigated by blur) and Moiré-like aliasing appear with light sources smaller than about 8 times the base probe spacing.
3D Scaling: The memory complexity scales as O(N * X³) for an XxXxX grid, making it challenging for large 3D volumes.
Specular GI: The low-angular resolution output (4 directions) is insufficient for sharp specular reflections.

Future work suggested includes temporal caching, acceleration structure compression, jittering rays to reduce aliasing, and combining HRC with a glossy GI method.

