Weighted Sum Rendering (WSR) Overview
- Weighted Sum Rendering (WSR) is a compositing technique that replaces traditional non-commutative alpha blending with a commutative weighted sum to enable real-time rendering.
- It introduces depth-dependent weights to eliminate the need for per-pixel splat sorting, thereby removing temporal popping artifacts and improving rendering efficiency.
- Extensions like Duplex-GS leverage hierarchical proxy cells and cell-level transmittance to accurately handle occlusion while further boosting performance on mobile and desktop devices.
Weighted Sum Rendering (WSR) defines a family of sort-free, order-independent compositing algorithms for 3D Gaussian Splatting (3DGS), enabling real-time, high-fidelity view synthesis by replacing traditional, non-commutative alpha blending with commutative weighted sums. WSR achieves this by introducing depth- or transmittance-dependent weights into the blending process, eliminating the need for costly per-pixel splat sorting and associated "popping" artifacts. Modern frameworks such as Duplex-GS generalize WSR with proxy-cell hierarchies and physically-motivated weights, providing both efficiency and correct occlusion handling for photorealistic scene rendering on desktops and resource-constrained devices (Liu et al., 5 Aug 2025, Hou et al., 2024).
1. Mathematical Formulation and Derivation
WSR originates as a relaxation of classic front-to-back alpha blending, which requires strict depth sorting because the OVER operator is non-commutative. Given 3D Gaussian splats sorted front to back along the ray, the standard approach computes opacity and composited color per pixel as:
$$\alpha_i = o_i\,G_i(p), \qquad T_i = \prod_{j<i} (1 - \alpha_j), \qquad C(p) = \sum_i c_i\,\alpha_i\,T_i$$
Here $o_i$ is the opacity of splat $i$, $G_i(p)$ is its projected kernel at pixel $p$, $T_i$ is the accumulated transmittance, and $c_i$ its color. Sorting overhead scales with the number of contributing Gaussians per ray and compromises temporal coherence, causing popping artifacts.
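The order dependence of the OVER operator can be seen on a single pixel. The following is a minimal sketch, assuming two half-transparent splats with made-up colors; the function name `alpha_blend` and the tuple layout are illustrative choices, not taken from either cited paper.

```python
def alpha_blend(ordered_splats):
    """Front-to-back OVER compositing for one pixel.

    ordered_splats: list of (alpha, (r, g, b)) tuples, nearest splat first.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # T_i: fraction of light not yet blocked
    for alpha, c in ordered_splats:
        for k in range(3):
            color[k] += c[k] * alpha * transmittance
        transmittance *= 1.0 - alpha
    return color

# Hypothetical splats: same opacity, different colors.
red = (0.5, (1.0, 0.0, 0.0))
blue = (0.5, (0.0, 0.0, 1.0))

front_red = alpha_blend([red, blue])   # red is nearest
front_blue = alpha_blend([blue, red])  # blue is nearest
print(front_red, front_blue)           # the two results differ
```

Swapping the two splats changes the pixel color, which is why sort-based pipelines must re-sort whenever the view changes.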
WSR discards the non-commutative OVER, replacing it with a commutative weighted sum. The color is expressed as:
$$C = \frac{c_B\,w_B + \sum_i c_i\,\alpha_i\,w(d_i)}{w_B + \sum_i \alpha_i\,w(d_i)}$$
where $c_B$, $w_B$ denote background color/weight, and $w(d_i)$ is a depth-dependent blend weight for splat $i$ at depth $d_i$. Both numerator and denominator are purely accumulative sums, so splats can be composited in any order and the result is temporally stable.
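The weighted sum can be sketched for one pixel as follows. This is an illustrative implementation under stated assumptions: the function name `wsr_pixel`, the default background weight, and the example weight `1 / (1 + d)` are all hypothetical and chosen only to make the example runnable.

```python
def wsr_pixel(splats, weight_fn, bg_color=(0.0, 0.0, 0.0), bg_weight=1e-3):
    """Normalized weighted-sum color for one pixel.

    splats: list of (depth, alpha, (r, g, b)) tuples in ANY order.
    weight_fn: maps a depth to a non-negative blend weight w(d).
    """
    numerator = [bg_weight * c for c in bg_color]
    denominator = bg_weight
    for depth, alpha, color in splats:
        a = alpha * weight_fn(depth)
        for k in range(3):
            numerator[k] += a * color[k]
        denominator += a
    return [n / denominator for n in numerator]

# Hypothetical weight that decays with depth.
weight = lambda d: 1.0 / (1.0 + d)

splats = [(1.0, 0.5, (1.0, 0.0, 0.0)), (2.0, 0.5, (0.0, 0.0, 1.0))]
forward = wsr_pixel(splats, weight)
backward = wsr_pixel(list(reversed(splats)), weight)
# forward and backward agree up to floating-point rounding.
```

A strictly positive background weight also keeps the denominator non-zero for pixels that no splat covers.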
Notable instantiations of $w(d_i)$:
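- Direct: $w(d_i) = 1$ (DIR-WSR).
- Exponential: $w(d_i) = \exp(-\sigma d_i^{\beta})$ with learned $\sigma$, $\beta$ (EXP-WSR).
- Linear-Corrected: $w(d_i) = \max(0,\, 1 - d_i/\sigma)\, v_i$ for learned $\sigma$ and per-splat $v_i$ (LC-WSR).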
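The three weight families can be written as small functions. The closed forms below are assumptions: a constant weight for the direct variant, an exponential falloff `exp(-sigma * d**beta)`, and a clamped linear falloff scaled by a per-splat factor `v`. The parameter names `sigma`, `beta`, and `v` are placeholders, and the forms should be checked against Hou et al. (2024) before being relied on.

```python
import math

def w_direct(depth):
    """DIR-WSR (assumed form): weight does not depend on depth."""
    return 1.0

def w_exponential(depth, sigma, beta):
    """EXP-WSR (assumed form): exponential falloff with learned sigma, beta.

    depth must be non-negative when beta is fractional.
    """
    return math.exp(-sigma * depth ** beta)

def w_linear_corrected(depth, sigma, v):
    """LC-WSR (assumed form): clamped linear falloff times a per-splat factor v."""
    return max(0.0, 1.0 - depth / sigma) * v

# Hypothetical parameter values, for illustration only.
print(w_direct(3.0))
print(w_exponential(2.0, sigma=0.5, beta=1.0))
print(w_linear_corrected(5.0, sigma=10.0, v=2.0))
```

In this sketch the linear variant drops to exactly zero beyond depth `sigma`, while the exponential variant only approaches zero.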
WSR admits extensions such as the physically-inspired kernel of Duplex-GS, which introduces proxy-cell groups and cell-level transmittance to re-enable early ray termination and restore monotonic occlusion (Liu et al., 5 Aug 2025).
2. Order-Independence and Differentiability
Additivity in both numerator and denominator of the WSR equation guarantees commutativity and thus order-independence; any per-pixel splat accumulation order yields the same result. Temporal popping is eliminated, as even small changes in active splat lists do not yield visible blend discontinuities across frames.
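Popping can be illustrated with two splats whose depth order flips between consecutive frames. The sketch below is a toy setup, not taken from the cited papers: the depths, colors, and the weight `exp(-d)` are assumptions, and the background term is omitted (`w_B = 0`) for brevity.

```python
import math

def alpha_blend(splats):
    """Sort by depth, then composite front to back with the OVER operator."""
    color, transmittance = [0.0, 0.0, 0.0], 1.0
    for _, alpha, c in sorted(splats, key=lambda s: s[0]):
        for k in range(3):
            color[k] += c[k] * alpha * transmittance
        transmittance *= 1.0 - alpha
    return color

def wsr(splats, weight_fn):
    """Normalized weighted sum without a background term (assumes >= 1 splat)."""
    num, den = [0.0, 0.0, 0.0], 0.0
    for depth, alpha, c in splats:
        a = alpha * weight_fn(depth)
        for k in range(3):
            num[k] += a * c[k]
        den += a
    return [n / den for n in num]

RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
# Between the two frames the splats swap depth order by a tiny margin.
frame_a = [(2.00, 0.5, RED), (2.01, 0.5, BLUE)]
frame_b = [(2.01, 0.5, RED), (2.00, 0.5, BLUE)]

weight = lambda d: math.exp(-d)  # hypothetical smooth depth weight
alpha_jump = abs(alpha_blend(frame_a)[0] - alpha_blend(frame_b)[0])
wsr_jump = abs(wsr(frame_a, weight)[0] - wsr(frame_b, weight)[0])
print(alpha_jump, wsr_jump)
```

With sorted alpha blending the red channel jumps by a quarter of the full range when the order flips, whereas the weighted sum changes only in proportion to the small depth change.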
WSR is fully differentiable. Its parameter set includes Gaussian means, covariances, opacity parameters, color bases (often SH), cell-proxy weights, and the learnable parameters of the weight function. In practice, end-to-end learning is performed using a sum of pixel error and multi-scale DSSIM losses, optimizing global and local parameters jointly via custom CUDA or Vulkan kernels (Hou et al., 2024).
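The differentiability claim can be made concrete with a short derivation. This is a sketch assuming the normalized weighted-sum form, with the shorthand $a_i = \alpha_i\,w(d_i)$, numerator $N = c_B w_B + \sum_i a_i c_i$, and denominator $W = w_B + \sum_i a_i$, so that $C = N / W$. The shorthand symbols $a_i$, $N$, and $W$ are introduced here only for illustration.

$$\frac{\partial C}{\partial a_k} = \frac{c_k W - N}{W^2} = \frac{c_k - C}{W}, \qquad \frac{\partial C}{\partial c_k} = \frac{a_k}{W}$$

Gradients with respect to opacity and weight-function parameters then follow from the chain rule through $a_k$, for example $\partial a_k / \partial \alpha_k = w(d_k)$. Each per-splat gradient depends only on that splat's own terms and on the two accumulated sums, so under these assumptions the backward pass needs no ordering either.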
3. Hierarchical Proxies and Cell-Based WSR
Duplex-GS generalizes WSR by grouping Gaussians into proxy "cells," each defined by a bounding ellipsoid and carrying a feature vector from which the Gaussians inside it are decoded. Cell-level sorting replaces per-splat sorting: only the cells are depth-sorted, and within each cell the Gaussians are composited without internal ordering. Per-cell transmittance is computed in front-to-back order, and early termination is triggered when the accumulated transmittance drops below a small threshold.
In the cell-proxy extension, color is composited with a blend weight defined from a per-cell scalar and the sorted cell transmittance (Liu et al., 5 Aug 2025).
Cell search rasterization is employed: only visible proxy cells are rasterized, reducing memory and sorting overhead by 50–90%. Each visible cell is dynamically decoded, and all contained Gaussians contribute only as needed, enabling efficient real-time performance.
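The general idea of sorting only cells while keeping splats inside a cell unordered can be sketched as follows. This is not the Duplex-GS formulation: how a cell's color and opacity are aggregated here (an alpha-weighted mean and `1 - prod(1 - alpha)`) is a hypothetical choice made for illustration, as are the function name and the threshold `eps`.

```python
def composite_cells(cells, eps=1e-3):
    """Composite depth-sorted cells; splats inside a cell need no ordering.

    cells: list of (cell_depth, splats), splats = list of (alpha, (r, g, b)).
    Returns the pixel color and the number of cells actually visited.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    visited = 0
    for _, splats in sorted(cells, key=lambda c: c[0]):  # sort cells only
        if transmittance < eps:
            break  # early ray termination: remaining cells are occluded
        visited += 1
        weight_sum = sum(a for a, _ in splats)
        if weight_sum == 0.0:
            continue
        # Order-free accumulation inside the cell (sums and products commute).
        cell_color = [sum(a * c[k] for a, c in splats) / weight_sum
                      for k in range(3)]
        keep = 1.0
        for a, _ in splats:
            keep *= 1.0 - a
        cell_alpha = 1.0 - keep
        for k in range(3):
            color[k] += transmittance * cell_alpha * cell_color[k]
        transmittance *= 1.0 - cell_alpha
    return color, visited

RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
near_cell = (1.0, [(1.0, RED), (0.5, RED)])  # contains a fully opaque splat
far_cell = (5.0, [(0.8, BLUE)])
pixel, cells_visited = composite_cells([far_cell, near_cell])
print(pixel, cells_visited)
```

In this toy setup the opaque near cell drives the transmittance to zero, so the far cell is never visited and contributes no color.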
4. Algorithmic Implementation and Mobile Pipelines
On the GPU, WSR is implemented to exploit order-independence for both desktop and mobile hardware. A typical mobile pipeline proceeds as follows (Hou et al., 2024):
- Project 3D Gaussians or proxy cells to screen space.
- Evaluate per-splat color and opacity (often in a vertex or compute shader).
- Omit sorting passes; pass view depth as an attribute.
- In the fragment shader, accumulate per-fragment color and weight:
    accumColor  += α_i w(d_i) c_i
    accumWeight += α_i w(d_i)
- After rasterization, normalize the final color per pixel as accumColor / accumWeight.
- For cell-based WSR, proxy cells are sorted, and early ray termination is supported via per-cell transmittance tracking (Liu et al., 5 Aug 2025).
This structure leverages built-in additive blending and supports single instanced draw calls, yielding high throughput. On Snapdragon 8 Gen 3, WSR achieves a render speedup and a 63% memory footprint relative to sort-based 3DGS (Hou et al., 2024).
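The accumulate-then-normalize structure of the pipeline above can be mimicked on the CPU. The following is a toy software rasterizer, not the mobile implementation: it assumes isotropic screen-space Gaussians, and the dictionary keys, the function name `render_wsr`, and the example weight are all hypothetical.

```python
import math

def render_wsr(width, height, splats, weight_fn,
               bg_color=(0.0, 0.0, 0.0), bg_weight=1e-3):
    """Accumulate color and weight buffers additively, then normalize."""
    accum_color = [[[bg_weight * c for c in bg_color] for _ in range(width)]
                   for _ in range(height)]
    accum_weight = [[bg_weight for _ in range(width)] for _ in range(height)]

    for s in splats:  # any order: there is no sorting pass
        w = weight_fn(s["depth"])
        for y in range(height):
            for x in range(width):
                r2 = (x - s["x"]) ** 2 + (y - s["y"]) ** 2
                alpha = s["opacity"] * math.exp(-0.5 * r2 / s["radius"] ** 2)
                a = alpha * w
                accum_weight[y][x] += a  # stands in for additive blending
                for k in range(3):
                    accum_color[y][x][k] += a * s["color"][k]

    # Resolve pass: per-pixel normalization.
    return [[[ch / accum_weight[y][x] for ch in accum_color[y][x]]
             for x in range(width)] for y in range(height)]

splats = [
    {"x": 1, "y": 1, "depth": 2.0, "opacity": 0.8, "radius": 1.0,
     "color": (1.0, 0.0, 0.0)},
    {"x": 2, "y": 2, "depth": 3.0, "opacity": 0.6, "radius": 1.0,
     "color": (0.0, 0.0, 1.0)},
]
weight = lambda d: 1.0 / (1.0 + d)  # hypothetical depth weight
image = render_wsr(4, 4, splats, weight)
image_rev = render_wsr(4, 4, splats[::-1], weight)
```

Because every write is a plain addition, the two buffers map naturally onto render targets with additive blending enabled, with the division done in a final full-screen pass.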
5. Artifact Elimination and Physical Limitations
Classic alpha blending incurs "popping" artifacts (discrete color jumps) when the splat order changes slightly, because the OVER operator is non-commutative. WSR, being a commutative sum, is temporally stable and immune to popping. However, vanilla WSR variants (including LC-WSR) lack physical early ray termination: all splats contribute regardless of occlusion, which can lead to transparency artifacts in which the background "bleeds" through dense or opaque regions.
The proxy-cell and transmittance-controlled WSR in Duplex-GS addresses this by:
- Introducing cell-level transmittance and enabling early termination at the proxy-cell level.
- Guaranteeing monotonic occlusion such that rays do not traverse through fully opaque foreground, eliminating "see-through" ghosts (Liu et al., 5 Aug 2025).
This distinction is pivotal for achieving both realism and computational efficiency in large, complex scenes.
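The bleed-through limitation of vanilla WSR discussed in this section can be reproduced with a fully opaque foreground splat. The sketch below uses made-up splats and omits the background term; the constant weight stands in for a direct variant and `exp(-d)` for a depth-dependent one, both chosen here as assumptions.

```python
import math

def alpha_blend(splats):
    """Sorted front-to-back OVER compositing for one pixel."""
    color, transmittance = [0.0, 0.0, 0.0], 1.0
    for _, alpha, c in sorted(splats, key=lambda s: s[0]):
        for k in range(3):
            color[k] += c[k] * alpha * transmittance
        transmittance *= 1.0 - alpha
    return color

def wsr(splats, weight_fn):
    """Normalized weighted sum without a background term (assumes >= 1 splat)."""
    num, den = [0.0, 0.0, 0.0], 0.0
    for depth, alpha, c in splats:
        a = alpha * weight_fn(depth)
        for k in range(3):
            num[k] += a * c[k]
        den += a
    return [n / den for n in num]

RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
splats = [(1.0, 1.0, RED),   # opaque foreground
          (5.0, 0.8, BLUE)]  # should be fully hidden

reference = alpha_blend(splats)                        # blue is occluded
bleed_direct = wsr(splats, lambda d: 1.0)[2]           # constant weight
bleed_depth = wsr(splats, lambda d: math.exp(-d))[2]   # depth-dependent weight
print(reference, bleed_direct, bleed_depth)
```

A depth-dependent weight shrinks the hidden splat's contribution but does not remove it, which is the gap that cell-level transmittance and early termination are meant to close.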
6. Quantitative Performance and Limitations
Empirical evaluations on datasets such as Mip-NeRF360, Tanks&Temples, DeepBlending, and BungeeNeRF demonstrate:
| Dataset | FPS (Duplex-GS) | FPS (LC-WSR) | Speedup | Sort Reduction (%) | Memory Reduction (%) |
|---|---|---|---|---|---|
| Mip-NeRF360 | 184 | 77 | 2.4× | 54 | 53 |
| Tanks&Temples | 147 | 89 | 1.65× | | |
| DeepBlending | 232 | 114 | 2.0× | | |
| BungeeNeRF | 124 | 31 | 4.0× | 87 | 87 |
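The Speedup column is the ratio of the two FPS columns, which a few lines of arithmetic confirm. The FPS values below are copied from the table; the variable names are arbitrary.

```python
# (Duplex-GS FPS, LC-WSR FPS) per dataset, as listed in the table.
fps = {
    "Mip-NeRF360": (184, 77),
    "Tanks&Temples": (147, 89),
    "DeepBlending": (232, 114),
    "BungeeNeRF": (124, 31),
}

speedup = {name: duplex / lc for name, (duplex, lc) in fps.items()}
for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
```

The computed ratios match the tabulated speedups after rounding.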
LC-WSR (WSR with linear depth correction) achieves competitive or better quality relative to classic 3DGS by PSNR, SSIM, and LPIPS metrics, with exemplary temporal stability and sharp edge reproduction (Hou et al., 2024). Proxy-cell WSR maintains or improves perceptual quality while reducing radix-sort overhead by 52–87% and achieving O(1)-sorting complexity for most rays (Liu et al., 5 Aug 2025).
Limitations include reliance on learned weight functions to mimic physical occlusion and the lack of exact physical correctness in the vanilla (non-proxy) WSR approach. Although per-fragment weight computation introduces some overhead, the aggregate throughput remains higher than traditional sort-dependent methods, especially on parallel hardware.
7. Significance and Prospective Developments
Weighted Sum Rendering constitutes a paradigm shift in differentiable graphics and 3D neural scene representations. By enabling sort-free, temporally coherent, and hardware-efficient view synthesis, WSR unlocks interactive performance for scalable 3DGS applications on both high-end and mobile devices.
The introduction of hierarchical proxy-based compositing (as in Duplex-GS) further bridges the trade-off between physical fidelity and rendering throughput, suggesting a generalizable methodology for future real-time neural rendering and graphics systems. Ongoing development focuses on enhancing physically-accurate occlusion, supporting adaptive cell layouts, and integrating more expressive learned weighting schemes (Liu et al., 5 Aug 2025, Hou et al., 2024).