Neural Six-way Lightmaps
- Neural Six-way Lightmaps are a rendering technique that fuses six directional lightmaps with deep neural networks to achieve real-time interactive volumetric rendering.
- The method employs a fast ray-marched guiding map and a specialized U-Net with channel adapters to efficiently approximate six directional scattering, transparency, and depth.
- Performance benchmarks demonstrate sub-4 ms neural inference and high visual fidelity (up to 41 dB PSNR), enabling seamless integration into dynamic game engine pipelines.
Neural Six-way Lightmaps (NSLM) are a neural rendering technique designed for real-time depiction of participating media—such as smoke and volumetric light—within interactive virtual environments. NSLM fuses the efficiency of the traditional six-way lightmaps approach with deep neural inference to enable full user interactivity, including dynamic scene changes, camera and light movement, and obstacle interaction, at high visual fidelity and performance compatible with existing game engine pipelines. This method addresses the longstanding challenge of balancing computational efficiency, physical plausibility, and adaptability in volumetric rendering (Li et al., 4 Apr 2026).
1. Background and Traditional Six-Way Lightmaps
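Rendering participating media requires evaluating the volume rendering equation, which in its standard form reads $$L(\mathbf{x}, \omega) = \int_0^{d} T(\mathbf{x}, \mathbf{x}_t)\,\big[\sigma_a(\mathbf{x}_t)\,L_e(\mathbf{x}_t, \omega) + \sigma_s(\mathbf{x}_t)\,L_s(\mathbf{x}_t, \omega)\big]\,dt,$$ where $T$ is transmittance, $\sigma_a$ and $\sigma_s$ are absorption and scattering coefficients, $L_e$ is the emitted radiance, and $L_s$ is the in-scattered radiance.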
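The six-way lightmaps approximation precomputes, per frame and viewpoint, textures encoding transparency and six directional scattering maps $M_d$, one for each axis direction $d \in \{\pm x, \pm y, \pm z\}$. Rendering then computes, for any light direction $\boldsymbol{\ell}$, a weighted sum $L(\boldsymbol{\ell}) \approx \sum_d w_d(\boldsymbol{\ell})\,M_d$, where each weight $w_d$ reflects how closely $\boldsymbol{\ell}$ aligns with direction $d$. These methods offer fast blending but are limited to static, pre-simulated sequences and fixed viewpoints, which introduces artifacts and precludes interactivity (Li et al., 4 Apr 2026).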
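The blending step can be illustrated with a minimal NumPy sketch. The weighting used here (squared light-direction components, with the positive or negative map chosen per axis by sign) is a common convention for six-way lighting and is an assumption; the source does not confirm the exact weights. The function name `six_way_blend` and the axis ordering are hypothetical.

```python
import numpy as np

# Assumed channel order for this sketch: +x, -x, +y, -y, +z, -z.
def six_way_blend(lightmaps, light_dir):
    """Blend six directional maps of shape (6, H, W) for one light direction."""
    l = np.asarray(light_dir, dtype=np.float64)
    l = l / np.linalg.norm(l)
    sq = l * l  # squared components of a unit vector sum to 1
    weights = np.zeros(6)
    for axis in range(3):
        # Select the positive or negative map on this axis from the sign of l.
        idx = 2 * axis + (0 if l[axis] >= 0.0 else 1)
        weights[idx] = sq[axis]
    # Contract the direction dimension: (6,) x (6, H, W) -> (H, W)
    return np.tensordot(weights, lightmaps, axes=1)
```

Because the weights sum to one, a light aligned with a single axis reproduces that axis's map exactly, and intermediate directions interpolate smoothly between at most three maps.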
2. Guiding Map Computation via Ray Marching
NSLM replaces full Monte Carlo integration with a fast guiding map computed by single-pass, large-step ray marching along the camera direction.
This map is derived using three fixed surrogate light directions:
- Front (viewer-facing)
- Top
- Bottom
The approximate in-scattering from each surrogate light is integrated along the camera ray, and the discretization employs large steps. The output channels are the three-light in-scatter approximation, the overall transparency, and the depth to the surface. This guide is intentionally coarse, reducing computational load and channel count for network inference (Li et al., 4 Apr 2026).
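A coarse guide of this kind can be sketched as follows. This is not the paper's implementation: it assumes an orthographic camera looking along axis 0 of a density grid, axis-aligned surrogate lights, an isotropic phase function folded into the albedo, and hypothetical parameter names (`sigma_t`, `albedo`, `step`, `voxel`).

```python
import numpy as np

def guiding_map(density, sigma_t=1.0, albedo=0.9, step=4, voxel=1.0):
    """Coarse guide for a (D, H, W) density grid viewed along axis 0.

    Returns in-scatter (3, H, W) for front/top/bottom lights,
    transparency (H, W), and depth to the first occupied sample (H, W).
    """
    D, H, W = density.shape
    ext = sigma_t * density * voxel  # optical thickness per voxel

    # Optical depth from each surrogate light to every voxel (exclusive sums).
    tau_front = np.cumsum(ext, axis=0) - ext                # enters at depth 0
    tau_top = np.cumsum(ext, axis=1) - ext                  # enters at row 0
    tau_bottom = np.cumsum(ext[:, ::-1], axis=1)[:, ::-1] - ext
    light_T = np.exp(-np.stack([tau_front, tau_top, tau_bottom]))  # (3, D, H, W)

    scatter = np.zeros((3, H, W))
    view_T = np.ones((H, W))          # transmittance along the camera ray
    depth = np.full((H, W), np.inf)
    for z in range(0, D, step):       # large steps: one sample per `step` voxels
        seg = ext[z] * step           # sample z stands in for the whole segment
        interacting = 1.0 - np.exp(-seg)
        scatter += view_T * albedo * interacting * light_T[:, z]
        hit = (density[z] > 0.0) & np.isinf(depth)
        depth[hit] = z * voxel
        view_T *= np.exp(-seg)
    return scatter, view_T, depth
```

The three returned arrays correspond to the guide's channel groups: three in-scatter channels, one transparency channel, and one depth channel. Increasing `step` trades accuracy for speed, which matches the intent of a deliberately coarse guide.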
3. Neural Network Architecture and Training
The NSLM network is a 2D image-to-image U-Net augmented with "channel adapters" to specialize outputs. It maps the guiding map to eight output channels, comprising six directional lightmaps, transparency, and an optional emissive channel.
- Backbone: U-Net, with encoder layers reducing to a bottleneck feature map followed by upsampling and skip connections.
- Channel Adapters: The eight output channels are grouped (front/back, left/right, up/down, {alpha, emissive}); each group is mapped with an adapter stack (2-3 NAFBlocks) for disentanglement.
- Activation: GELU or ReLU for all convs; sigmoidal or linear for the terminal layer.
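The grouping of output channels can be sketched in NumPy as below. Each adapter here is a single per-pixel linear map (a 1x1 convolution) followed by a sigmoid, used purely as a stand-in for the NAFBlock stacks described above; the group names, channel counts per group, and function names are assumptions for illustration.

```python
import numpy as np

# Four groups of two channels each, giving eight outputs in total.
GROUPS = {"front_back": 2, "left_right": 2, "up_down": 2, "alpha_emissive": 2}

def make_adapters(in_ch, rng):
    """One small per-pixel linear head per group (stand-in for NAFBlock stacks)."""
    return {name: rng.standard_normal((out_ch, in_ch)) * 0.1
            for name, out_ch in GROUPS.items()}

def apply_adapters(features, adapters):
    """Map shared backbone features (C, H, W) to grouped outputs (8, H, W)."""
    outs = []
    for name in GROUPS:  # fixed group order keeps the channel layout stable
        w = adapters[name]                          # (out_ch, C)
        y = np.einsum("oc,chw->ohw", w, features)   # 1x1 convolution
        outs.append(1.0 / (1.0 + np.exp(-y)))       # sigmoid terminal layer
    return np.concatenate(outs, axis=0)
```

Giving each group its own head lets opposing directions share parameters while keeping, for example, the alpha and emissive outputs from interfering with the directional maps.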
Training details:
- Data generated via Lattice Boltzmann Method (LBM) simulations for 14 smoke+obstacle scenarios, each with 200 frames and 9 camera angles (14 × 200 × 9 = 25,200 frames in total).
- Ground truth: Houdini Karma volume path tracer (512², 512 spp, single scatter, Henyey–Greenstein phase function).
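- Loss: weighted combination of MSE (sRGB), VGG-perceptual, and FlowNet-based temporal consistency terms:
$$\mathcal{L} = \lambda_{\text{MSE}}\,\mathcal{L}_{\text{MSE}} + \lambda_{\text{VGG}}\,\mathcal{L}_{\text{VGG}} + \lambda_{\text{temp}}\,\mathcal{L}_{\text{temp}}$$
with typical weights $\lambda_{\text{MSE}} = 1.0$, $\lambda_{\text{VGG}} = 0.1$, and $\lambda_{\text{temp}} = 0.5$, respectively.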
- Optimization via Adam, batch size 12, about 200 epochs, about 60 GPU hours, minimal regularization (Li et al., 4 Apr 2026).
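The weighted loss listed above can be sketched as a plain function. The perceptual term is passed in as a callable because a real VGG feature loss needs a pretrained network, and the temporal term here is a simplified frame-difference stand-in rather than the FlowNet-based term from the source; both choices, and all names, are assumptions.

```python
import numpy as np

def combined_loss(pred, target, prev_pred, prev_target, perceptual_fn,
                  w_mse=1.0, w_perc=0.1, w_temp=0.5):
    """Weighted sum of reconstruction, perceptual, and temporal terms."""
    mse = np.mean((pred - target) ** 2)
    perc = perceptual_fn(pred, target)
    # Stand-in temporal term: the frame-to-frame change of the prediction
    # should match the frame-to-frame change of the reference (no flow warp).
    temp = np.mean(((pred - prev_pred) - (target - prev_target)) ** 2)
    return w_mse * mse + w_perc * perc + w_temp * temp
```

The default weights follow the typical values quoted in the training details (1.0, 0.1, 0.5).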
4. Pipeline Integration and Real-Time Shading
At inference, the network generates two RGBA 512² textures:
- Tex0: (R,G,B,A) = (three directional lightmaps, transparency)
- Tex1: (R,G,B,A) = (the remaining three directional lightmaps, emissive)
Runtime shading per fragment:
- Sample the six directional channels.
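- For the incident light direction, compute the blended scattering as in the six-way weighted sum.
- Compute screen-space shadowing by comparing the smoke depth with the light-view shadow map. Occluded samples have their scattering contribution zeroed.
- Final color: the shadowed scattering term plus the emissive contribution, composited over the background weighted by transparency.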
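The per-fragment steps above can be sketched on the CPU as follows. A production version would live in a fragment shader; this Python version only illustrates the order of operations. The squared-component weighting, the depth bias, the channel order, and every parameter name are assumptions not taken from the source.

```python
import numpy as np

def shade_fragment(dir_maps, alpha, emissive, background, light_dir,
                   light_color, smoke_depth_light, shadow_depth, bias=1e-3):
    """Shade one fragment. dir_maps holds six scalars: +x, -x, +y, -y, +z, -z."""
    l = np.asarray(light_dir, dtype=np.float64)
    l = l / np.linalg.norm(l)
    scatter = 0.0
    for axis in range(3):
        # Pick the map facing the light on each axis, weight by l[axis]^2.
        idx = 2 * axis + (0 if l[axis] >= 0.0 else 1)
        scatter += l[axis] ** 2 * dir_maps[idx]
    # Shadow test: occluded if an occluder sits closer to the light.
    if smoke_depth_light > shadow_depth + bias:
        scatter = 0.0
    lit = scatter * np.asarray(light_color, dtype=np.float64)
    # alpha is treated as transparency: a value of 1 shows only the background.
    return lit + emissive + alpha * np.asarray(background, dtype=np.float64)
```

For several lights, the scattering and shadow test would be repeated per light and summed before the emissive and background terms are added once.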
No geometric changes are required; standard billboard assets suffice, with updated textures and shader logic. This enables seamless integration into established pipelines (Li et al., 4 Apr 2026).
5. Performance Benchmarks and Comparative Analysis
Experiments were conducted on an AMD Threadripper 3970X and an NVIDIA GPU (16,384 cores, 24 GB) at 512² resolution. Key timings per frame:
| Stage | Time (ms) |
|---|---|
| Fluid Simulation (100³ vel) | 4.0 |
| Smoke Advection (400³ dens.) | 11.5 |
| Guiding Ray March (512²) | 2.0 |
| Network Inference | 1.2 |
| Shadow Test | 0.8 |
| Shader (blending) | 0.3 |
| Total (sim+render) | 19.8 |
Comparative rendering metrics:
| Method | PSNR↑ | MSE↓ | Time (ms) |
|---|---|---|---|
| Reference Path-Tracer | — | — | ~20,000 |
| ReSTIR (1 spp) | 26.3 | 0.00109 | 10.4 |
| ReSTIR + denoiser | 34.8 | 0.00032 | 12.6 |
| MRPNN (3D neural) | 36.1 | 0.00030 | 4 + 180* |
| Neural 6-Way LM | 40.7 | 0.00008 | 3.9 |
*MRPNN: per-frame 3D feature precompute ≈ 180ms.
NSLM achieves significantly higher PSNR (38-41 dB on held-out sequences), lowest MSE, and sub-4 ms neural inference. The total rendering pipeline (simulation + rendering) remains under 20 ms per frame, supporting real-time workflows (Li et al., 4 Apr 2026).
Memory requirements are low: two RGBA textures (8 MB, uncompressed) and network weights (about 15 MB).
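The texture figure can be checked with a short calculation. The 8 MB value is consistent with 32-bit float channels; that storage format is an assumption, since the source only states that the textures are uncompressed.

```python
def texture_bytes(count, width, height, channels, bytes_per_channel):
    """Raw size in bytes of `count` uncompressed textures."""
    return count * width * height * channels * bytes_per_channel

full_float = texture_bytes(2, 512, 512, 4, 4)  # 32-bit float per channel
half_float = texture_bytes(2, 512, 512, 4, 2)  # 16-bit float per channel
print(full_float / 2**20, half_float / 2**20)  # 8.0 4.0 (MiB)
```

Storing the same textures at half precision would halve the footprint, at the cost of reduced dynamic range in the lightmaps.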
6. Qualitative Outcomes and Applications
Visual and perceptual benchmarks demonstrate:
- High temporal coherence (negligible flicker due to the temporal consistency loss).
- Robustness over a 0.5×–2.0× range of density scaling (PSNR above 30 dB).
- Real-time support for dynamic scenarios, including smoke-geometry interactions, explosions with emissive components, and unrestricted camera or lighting changes in environments such as Unreal Engine.
Comparisons reveal that NSLM avoids the traditional billboard artifacts, surpasses ReSTIR in denoised quality, and outperforms volumetric 3D neural renderers in both fidelity and speed. A noted limitation is that screen-space shadowing, while efficient, can fail with complex internal occlusions (Li et al., 4 Apr 2026).
7. Strengths, Limitations, and Prospects
Strengths:
- Full interactivity for smoke, obstacles, lighting, and camera movement.
- Real-time inference (under 4 ms network, under 20 ms end-to-end sim+render).
- High visual fidelity, capturing clear multiple-scatter effects.
- Plug-and-play compatibility with game engines.
Current limitations include reliance on screen-space approximations for shadowing (leading to artifacts for intricate occlusions), potential quality degradation on out-of-distribution density fields or highly complex geometries, and effective capture of only single-bounce scattering.
Potential future advances involve learning more physically accurate volumetric shadowing (e.g., via screen-space virtual point lights), augmenting guiding maps with local gradients or additional cues for higher-order scattering, adaptive ray-marching for detail-sensitive regions, and direct inclusion of scene geometry in network conditioning for improved obstacle interactions. Further generalization to heterogeneous or Brownian media is also anticipated (Li et al., 4 Apr 2026).