
RT-Splatting: Joint Reflection-Transmission Modeling with Gaussian Splatting

Published 18 May 2026 in cs.CV | (2605.18263v1)

Abstract: 3D Gaussian Splatting (3DGS) enables real-time novel view synthesis with high visual quality. However, existing methods struggle with semi-transparent specular surfaces that exhibit both complex reflections and clear transmission, often producing blurry reflections or overly occluded transmission. To address this, we present RT-Splatting, a framework that disentangles each Gaussian's geometric occupancy from its optical opacity. This factorization yields a unified surface-volume scene representation with a single set of Gaussian primitives. Our hybrid renderer interprets this representation both as a surface to capture high-frequency reflections and as a volume to preserve clear transmission. To mitigate the ambiguity in jointly optimizing reflection and transmission, we introduce Specular-Aware Gradient Gating, which suppresses misleading gradients from highly specular regions into the transmission branch, effectively reducing distracting floaters. Experiments on challenging semi-transparent scenes show that RT-Splatting achieves state-of-the-art performance, delivering high-fidelity reflections and clear transmission with real-time rendering. Moreover, our factorization naturally enables flexible scene editing. The project page is available at https://sjj118.github.io/RT-Splatting.

Summary

  • The paper introduces RT-Splatting, achieving real-time modeling of semi-transparent surfaces with accurate reflection and transmission.
  • It utilizes occupancy-opacity factorization to decouple geometric presence from optical properties, enhancing fidelity in photorealistic renders.
  • RT-Splatting outperforms baselines such as Ref-GS, reaching a PSNR of 39.77 in transparent regions while maintaining real-time performance.

RT-Splatting: Unified Reflection-Transmission Modeling with Gaussian Splatting

Motivation and Problem Statement

The challenge of photorealistic view synthesis in scenes containing thin semi-transparent surfaces, such as glass or plastic, lies in simultaneously modeling both high-frequency specular reflections and clear background transmission. Traditional 3D Gaussian Splatting (3DGS) and its extensions yield real-time, high-fidelity results for opaque and reflective surfaces, but they conflate geometric presence with optical opacity, resulting in artifacts such as blurry reflections and occluded transmissions in semi-transparent regions. Existing solutions that treat reflection and transmission separately (e.g., multi-stage pipelines, planar assumptions) suffer from limited applicability or poor generalization in complex scenes.

Methodological Innovations

RT-Splatting introduces a unified approach for real-time, high-fidelity modeling of scenes with thin semi-transparent surfaces, by factorizing each Gaussian's contribution into geometric occupancy and optical opacity. This factorization decouples the surface participation of a primitive (necessary for reflection) from its light attenuation properties (necessary for transmission), enabling explicit, physically grounded handling of both modalities with a unified set of primitives.

Occupancy-Opacity Factorization

Each Gaussian primitive is defined by:

  • Geometric occupancy (o): Encodes the probability of surface intersection for a given ray.
  • Optical opacity (α): Defines the conditional probability of light absorption or scattering upon intersection.

The effective opacity for volumetric compositing is their product, o·α, preserving background visibility through transparent surfaces while supporting the compositional needs of deferred shading for specular highlights.
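
To make the role of the product o·α concrete, here is a minimal sketch of front-to-back alpha blending along a single ray, assuming scalar radiance values and already depth-sorted primitives. The function name `composite_front_to_back` and the example numbers are illustrative assumptions, not taken from the paper.

```python
def composite_front_to_back(colors, occupancy, opacity, background):
    """Alpha-blend scalar radiance values that are sorted front to back.

    Assumption: each primitive's blending weight is its effective opacity
    o * alpha, as the text describes; names and values are illustrative.
    """
    transmittance = 1.0  # fraction of light that still reaches the camera
    pixel = 0.0
    for c, o, a in zip(colors, occupancy, opacity):
        effective = o * a  # effective opacity used for compositing
        pixel += transmittance * effective * c
        transmittance *= 1.0 - effective
    # Whatever light is left over comes from the background.
    return pixel + transmittance * background


# A glass-like primitive: certainly on the surface (o = 1) but nearly clear (alpha = 0.1).
glass = composite_front_to_back([0.2], [1.0], [0.1], background=1.0)
# The same primitive with alpha = 1 blocks the background completely.
opaque = composite_front_to_back([0.2], [1.0], [1.0], background=1.0)
```

In this toy case the glass-like primitive keeps full occupancy, so it can still take part in surface extraction for reflections, while most of the background radiance passes through it.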

Hybrid Surface-Volume Rendering Architecture

The pipeline comprises two synergistic branches:

  • Deferred Reflection: Aggregates first-hit surface attributes into G-buffers using the probabilistic surface extraction enabled by the occupancy factorization, followed by a learned specular shading network to recover view-dependent reflections.
  • Volumetric Transmission: Accumulates radiance from background and within-material scattering using the effective opacity, supporting accurate modeling of both clear and colored transmission.

Additionally, learnable attributes such as material roughness, scattered color, and transmissivity are incorporated to support realistic glass or plastic materials, and a learnable attenuation parameter models the masking of background details by strong reflections, offering intuitive perceptual control over blending.
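
The per-pixel combination of the two branches might be sketched as follows. The subsurface mixture Csub = T·Ctrans + (1−T)·Cscatter is the one quoted later in this summary. How the specular term is added, and the convention that a smaller β means stronger dimming of the subsurface term, are assumptions made here for illustration only. All function names are hypothetical.

```python
def subsurface_color(c_trans, c_scatter, transmissivity):
    """Mixture quoted in the summary: Csub = T*Ctrans + (1 - T)*Cscatter."""
    return transmissivity * c_trans + (1.0 - transmissivity) * c_scatter


def blend_pixel(c_spec, c_trans, c_scatter, transmissivity, beta):
    """Combine the reflection and subsurface terms for one pixel.

    Assumption (not confirmed by the source): the final color is
    Cspec + beta * Csub, with the attenuation beta in [0, 1].
    """
    c_sub = subsurface_color(c_trans, c_scatter, transmissivity)
    return c_spec + beta * c_sub


# Scalar example: a fairly clear pane (T = 0.9) under a moderately strong highlight.
pixel = blend_pixel(c_spec=0.3, c_trans=0.8, c_scatter=0.2,
                    transmissivity=0.9, beta=0.5)
```

Keeping the attenuation as a separate scalar is what would let an editor dim or restore the see-through component without touching the reflection term.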

Specular-Aware Gradient Gating

To address ambiguous optimization at pixels dominated by high-frequency specular highlights, which often induce spurious gradients and floating artifacts in the transmission pathway, RT-Splatting introduces a specular-aware gradient gating mechanism:

  • Local variance of the specular component is used to detect regions of high reflection complexity.
  • Gating weights modulate gradients flowing into the transmission branch during backpropagation, thereby suppressing erroneous supervision and reducing hallucinated volumetric artifacts behind reflective surfaces.

This design leads to substantially improved decomposition, as confirmed in ablation and qualitative results.
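
A minimal sketch of the gating idea, in plain Python on single-channel images stored as nested lists. The summary states only that local variance of the specular component drives the gate. The exponential mapping, the `strength` parameter, the border handling, and all function names are assumptions for illustration.

```python
import math


def local_variance(image, radius=1):
    """Per-pixel variance over a (2*radius+1)^2 window, clipped at the borders."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            patch = [image[j][i]
                     for j in range(max(0, y - radius), min(h, y + radius + 1))
                     for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(patch) / len(patch)
            out[y][x] = sum((v - mean) ** 2 for v in patch) / len(patch)
    return out


def gate_weights(spec, strength=50.0):
    """Hypothetical gate: exp(-strength * variance), so busy patches get small weights."""
    return [[math.exp(-strength * v) for v in row] for row in local_variance(spec)]


def gate_gradient(grad_trans, spec, strength=50.0):
    """Scale the transmission-branch gradient by the per-pixel gate."""
    gates = gate_weights(spec, strength)
    return [[g * w for g, w in zip(g_row, w_row)]
            for g_row, w_row in zip(grad_trans, gates)]


flat = [[0.5] * 4 for _ in range(4)]                            # smooth reflection
busy = [[float((x + y) % 2) for x in range(4)] for y in range(4)]  # checkerboard highlight
ones = [[1.0] * 4 for _ in range(4)]                            # incoming gradient
grad_flat = gate_gradient(ones, flat)
grad_busy = gate_gradient(ones, busy)
```

In an autodiff framework the same weights would be applied only in the backward pass, leaving the forward rendering unchanged. Here the gradient is scaled explicitly to keep the example dependency-free.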

Optimization and Regularization

A transparent mask loss, derived from a semantic segmentation prior, regularizes optical opacity to eliminate "ghost" geometries that would otherwise be unconstrained. All scene, material, and shading parameters are optimized jointly, unlike prior segment-and-stitch approaches, making the system applicable to arbitrary complex scenes where backgrounds may only be visible through transparency.
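
The mask regularizer could look roughly like the sketch below, which applies a binary cross-entropy between a rendered opacity map and a target map. How the target is derived from the segmentation mask is not stated in this summary. Using `1 - mask` (low opacity inside transparent regions) is an assumption, as are the function name and the example values.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two same-sized 2D maps with values in [0, 1]."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count


mask = [[1.0, 0.0]]  # 1 = pixel flagged as transparent by the segmenter
# Assumed target: rendered opacity should be low where the mask is on.
target = [[1.0 - m for m in row] for row in mask]

good = bce_loss([[0.05, 0.95]], target)  # clear glass, opaque wall
bad = bce_loss([[0.90, 0.95]], target)   # glass rendered as nearly opaque
```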

Experimental Results

RT-Splatting delivers state-of-the-art performance across standard public datasets (Ref-Real, NeRF-Casting, EnvGS, T&T) and custom-captured scenes, especially in metrics over transparent regions:

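Method              | PSNR (transp.) | SSIM (transp.) | LPIPS (transp.) | FPS   | Training Time
--------------------|----------------|----------------|-----------------|-------|--------------
3DGS-DR [43]        | 37.89          | 0.990          | 0.012           | 119.6 | 0.8h
Ref-GS [50]         | 37.76          | 0.989          | 0.013           | 38.4  | 0.8h
EnvGS [41]          | 37.95          | 0.990          | 0.012           | 18.3  | 2.9h
RT-Splatting (Ours) | 39.77          | 0.992          | 0.010           | 33.3  | 0.9h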

Notably, RT-Splatting outperforms all baselines in both fidelity and perceptual quality in transparent regions: its PSNR is about 1.8 dB above the strongest baseline (EnvGS, 37.95), with the best SSIM and LPIPS. At 33.3 FPS it remains real-time, slower than 3DGS-DR and Ref-GS but faster than EnvGS. The improvement is especially pronounced on challenging views where only transmitted backgrounds are visible through semi-transparent surfaces.
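
For readers unfamiliar with region-restricted metrics, the sketch below computes PSNR over masked pixels only. It assumes the "transp." columns are obtained by restricting the error to a transparent-region mask. The exact evaluation protocol is not given in this summary, and the function name and sample values are illustrative.

```python
import math


def masked_psnr(pred, target, mask, max_value=1.0):
    """PSNR in dB over the pixels where mask is truthy (flat lists of intensities)."""
    sq_err, count = 0.0, 0
    for p, t, m in zip(pred, target, mask):
        if m:
            sq_err += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    mse = sq_err / count
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# The third pixel has a large error but lies outside the mask, so it is ignored.
psnr = masked_psnr(pred=[0.5, 0.9, 0.1],
                   target=[0.6, 0.9, 0.9],
                   mask=[1, 1, 0])
```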

Qualitatively, RT-Splatting preserves sharp reflections and clear, correctly colorized background transmission, while prior methods systematically trade off one modality for the other or introduce blending artifacts ("floaters").

Ablation and Analysis

Component ablations confirm the necessity of each innovation:

  • Removal of occupancy-opacity factorization leads to mutually destructive conflicts between reflection and transmission, severely degrading rendering.
  • Omitting joint optimization precludes reconstructing backgrounds only visible through transparency.
  • Disabling specular-aware gradient gating reintroduces artifacts and ambiguous transmission.
  • Excluding material attributes for internal scattering/transmissivity or attenuation reduces realism and perceptual fidelity.

The system is robust to hyperparameter variations, especially in gating strength, and uses efficient density/churn control designed for the decoupled representation.

Applications and Implications

The explicit reflection-transmission decomposition and unified scene representation enable advanced editing capabilities: real-time control over surface roughness, transparency, specular strength, or color tinting can be performed independently, supporting practical use in interactive graphics, AR/VR, and scientific visualization.

Theoretically, RT-Splatting establishes a new paradigm for physically based, unified surface-volume modeling in neural rendering, particularly highlighting the value of disentangling geometry from material appearance in complex environments.

Limitations and Future Directions

The current framework is designed for thin, minimally refractive semi-transparent surfaces. It does not explicitly model refraction or multi-bounce light transport (e.g., in water or solid glass volumes). Extending the occupancy-opacity approach to fully support thick refractive media, and integrating physically-based multi-bounce ray tracing with efficient splatting-based pipelines, represent promising avenues for further research.

Conclusion

RT-Splatting (2605.18263) constitutes a significant step toward unified, real-time rendering of scenes with strongly coupled reflection and transmission within the Gaussian Splatting paradigm. Its architectural and algorithmic advances resolve longstanding ambiguities inherent in semi-transparent surface modeling, with direct benefits to scene reconstruction, editing, and high-fidelity visualization.

Explain it Like I'm 14

Overview

This paper introduces RT-Splatting, a new way to make computer-generated 3D scenes look realistic when you have thin, see-through but shiny things in them, like windows or clear plastic. The method lets you see sharp reflections on the surface and also see clearly through the surface at the same time, and it runs fast enough for real-time use.

What questions does the paper ask?

  • How can we render scenes with glass-like surfaces so that reflections look sharp but the background seen through the glass stays clear?
  • How can we avoid common visual mistakes, like blurry reflections or fake "floaters" (made-up blobs) that appear behind the glass?
  • Can we do all of this quickly (in real time) and with one unified scene representation, instead of stitching multiple models together?

How does it work? (Simple explanation)

First, some background: modern "Gaussian Splatting" represents a 3D scene using lots of soft, fuzzy 3D dots (Gaussians). When viewed from a camera, these dots are blended to make an image. This is fast and usually looks great, but it struggles with semi-transparent, shiny surfaces where you need both reflection (what's bouncing off the surface) and transmission (what you see through it).

RT-Splatting has three core ideas:

  • Split "being there" from "blocking light"
    • Think of each fuzzy dot as doing two jobs:
    • Geometric occupancy: does the surface exist here so it can reflect light? (Like a window's surface being in the way for reflections)
    • Optical opacity: how much does it actually block or absorb light passing through? (Most clear glass blocks very little)
    • By learning these two properties separately, the system can treat glass as a real surface for reflections, while still letting background light pass through.
  • A hybrid two-step render, like "sketch then color"
    • Step 1 (Surface/Reflection pass): The system first figures out where the camera would "touch" the surface and gathers surface details (like its direction/normal and shininess/roughness). This is similar to sketching the outline and important notes into special image layers called G-buffers. It then computes the shiny reflection using a learned shading function (so mirror-like highlights look crisp).
    • Step 2 (Volume/Transmission pass): In parallel, it adds up the light coming from the background behind the glass, making sure the glass doesn't wrongly block it (thanks to the split between occupancy and opacity).
    • Finally, it mixes the reflection and the see-through background. A learned "attenuation" factor dims the see-through part more when reflections are strong, matching how we perceive glass in real life.
  • Smarter learning in tricky shiny regions
    • When training the model, shiny spots are hard to get perfect. The remaining mistakes can mislead the "see-through" part, causing it to invent floaters behind the glass.
    • RT-Splatting adds Specular-Aware Gradient Gating, which is like a teacher saying: "In very shiny, complicated areas, don't let the see-through part overreact to errors." It measures how complex the reflection is in a small patch; if it's very complex, it turns down the learning signal for the transmission branch there. This cuts down fake floaters and keeps the background crisp.

There's also a light-touch helper: a transparency mask from a pre-trained segmenter. It gently guides the learning so the system doesn't create "ghost" surfaces that don't affect the image but would confuse the model.

What did they find?

The authors tested RT-Splatting on real scenes with car windows, plastic films, and other thin transparent surfaces. Compared to other fast methods:

  • Reflections are sharper and more realistic (no smearing).
  • The background seen through glass is clearer (fewer fake floaters and less unwanted blocking).
  • It works in tough situations where the only way to see part of the scene (like a car's interior) is through glass.
  • It still runs in real time and trains in a reasonable amount of time.
  • Each part of their design matters: removing the occupancy–opacity split, the reflection/transmission mixing, the gating, or the material scattering pieces makes results worse in measurable ways.

A bonus: because the method separates reflection and transmission, you can edit scenes easily (make glass more or less shiny, change its tint, reduce reflections, or adjust roughness) without breaking the rest of the image.

Why does it matter?

This work makes it much easier to render everyday scenes with windows, screens, or clear plastics in a realistic and fast way. That's important for:

  • AR/VR and games: believable glass and shiny surfaces at real-time speeds.
  • Film and virtual production: reliable, editable reflections and see-through details without heavy manual tricks.
  • Robotics and autonomous systems: clearer views through windows or screens while still understanding reflective cues.

Limitations and future steps: RT-Splatting focuses on thin, nearly flat transparent surfaces where light mostly goes straight through (like typical window glass). It doesn't yet handle strong bending of light (refraction) in thicker materials like solid glass sculptures or water, or multiple internal bounces. Future work could extend it to those harder cases.

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Below is a focused list of concrete gaps and open problems that remain unresolved and could guide follow-up research.

Modeling and physical fidelity

  • No refraction or multi-bounce light transport: extend the model to thick refractive media (e.g., glass blocks, water) with ray bending, internal reflections, and multi-bounce effects; quantify performance vs. refractive index and surface curvature.
  • Single first-hit surface assumption: support multiple stacked or coated layers (e.g., double-pane windows, clear coat + paint) via multi-layer G-buffers and multi-hit deferred shading.
  • Heuristic blend instead of Fresnel: the learned attenuation β replaces physically based Fresnel blending to accommodate tone mapping; investigate training in linear HDR/RAW space, explicit camera response modeling, or hybrid physics-learned blending to restore physical correctness and energy conservation.
  • Subsurface transport simplification: the mixture Csub = T·Ctrans + (1−T)·Cscatter ignores path length, thickness, and angle dependence; incorporate thickness estimates, Beer–Lambert absorption, angle-dependent transmittance, and wavelength-dependent tinting/dispersion.
  • Participating media not modeled: extend the forward pass to heterogeneous volumetric media (fog/smoke) and mixed surface–volume scenes with semi-transparent interfaces.
  • Material coherence: Cscatter and T are per-Gaussian and may overfit; add material-space priors, spatial coherence, or shared material embeddings with constraints (e.g., energy conservation, roughness/IOR consistency).

Optimization and learning dynamics

  • Identifiability of occupancy–opacity factorization: product o·α is underdetermined; develop mask-free regularizers or learned priors (e.g., sparsity, smoothness across surfels, depth-aware constraints) to avoid "ghost" geometries without external masks.
  • Dependence on external masks (SAM2): quantify robustness to mask noise, prompts, and failure cases; explore end-to-end, self-supervised transparent-region discovery to remove reliance on external segmentation.
  • Gradient gating design: gating uses local variance of predicted Cspec over 3×3 patches; evaluate alternative complexity signals (image gradients, frequency-domain measures, roughness/normal variance, predictive uncertainty) and schedules; analyze early-training misgating when Cspec is inaccurate.
  • Scope of gating: gradients are gated only for Ctrans; study gating for occupancy/opacity, normals, and material attributes to prevent leakage into other branches; provide convergence analyses and diagnostics for when gating harms background learning.
  • Normal sensitivity: reflections rely on accurate surfel normals; characterize sensitivity and integrate normal priors/supervision or robust normal refinement to mitigate artifacts from noisy surfel orientations.

Scope, generality, and failure modes

  • Thin-surface assumption: determine failure boundaries for curved or moderately refractive thin surfaces (e.g., car windshields); create controlled benchmarks varying curvature/IOR to profile degradation.
  • Near-field reflections: the method uses a learned shading network rather than explicit reflection tracing; investigate lightweight ray-traced or environment-Gaussian hybrids that capture near-field reflection paths while remaining real-time and compatible with transparency.
  • Mirrors and opaque reflectors: study the degenerate case with zero transmission (pure mirrors) and mixed mirror–glass regions; ensure stable behavior and evaluate against mirror-specific baselines.
  • Dynamic scenes: extend and evaluate on moving reflectors and moving backgrounds behind glass; enforce temporal consistency and examine gating behavior under motion and rolling shutter.
  • Photometric variability: robustness to auto-exposure, white balance, tone mapping, and saturated highlights is unexamined; explore radiometric calibration, exposure-invariant losses, and HDR pipelines.

Evaluation and benchmarking

  • Limited and domain-specific datasets: curate benchmarks with controlled reflection/transmission ground truth (synthetic and real), including thick refractive objects, layered panes, near-field reflectors, and participating media.
  • Metrics for decomposition quality: beyond PSNR/SSIM/LPIPS, define reflection- and transmission-specific metrics (e.g., reflection fidelity vs. a relit reference, transmission sharpness/contrast behind glass, leakage/bleed-through measures).
  • Baseline coverage: include transparent/refractive baselines (e.g., TransparentGS, refractive NeRF variants) in regimes where they apply to contextualize gains and limitations.

Efficiency and system design

  • Scalability and resource use: characterize memory/compute as scene complexity and number/extent of semi-transparent surfaces grow; investigate hierarchical/streamed splats, adaptive multi-res G-buffers, and tile-based deferred shading for higher resolutions.
  • Pruning policy: pruning by occupancy risks removing visually important low-occupancy elements; design saliency-aware or uncertainty-aware pruning to preserve critical transparent structures.
  • Representation portability: assess how occupancy–opacity factorization extends to 3DGS or alternative primitives and whether deferred/forward hybrid rendering remains stable and efficient.

Editing and applications

  • Physically grounded parameter editing: current edits (roughness, transparency, tint, "remove specular") lack guarantees of realism; estimate interpretable BRDF/BSDF parameters (roughness, IOR, absorption coefficients) to enable consistent, physically based edits.
  • Cross-view consistency of layers: quantify and enforce consistency of reflection/transmission decompositions across viewpoints; add cross-view layer-consistency losses or cycle constraints to reduce layer leakage.

Practical Applications

Immediate Applications

Below are deployable use cases that leverage RT-Splatting's unified surface–volume Gaussian representation, hybrid deferred–forward rendering, and specular-aware gradient gating to handle semi-transparent, reflective surfaces in real time.

  • Glass-aware 3D capture and viewing for built environments
    • Sectors: architecture, real estate, cultural heritage (museums, galleries), digital twins
    • What it enables: High-fidelity scans where windows, partitions, and display cases retain sharp reflections while remaining see-through; reliable reconstruction of content visible only through glass (e.g., interiors behind windows)
    • Workflow: Capture ~200–300 calibrated views; train RT-Splatting (~0.9h on a 4090); deploy real-time viewer (WebGL/desktop) with reflection/transmission toggles and material editing (tint/roughness/transparency)
    • Dependencies/assumptions: Thin, semi-transparent surfaces (negligible refraction); static scene; multi-view coverage; GPU for training/inference; SAM2-based mask regularization (optional but recommended)
  • VFX and virtual production: capture-through-glass and compositing
    • Sectors: film/TV, advertising, post-production
    • What it enables: On-set scans with reflective glass (cars, storefronts, office interiors) that preserve reflections without blocking transmission; independent reflection/transmission layers for downstream compositing
    • Tools/products/workflows: RT-Splatting ingest in Blender/Unreal; export reflection-only and transmission-only passes; per-pixel attenuation for art-directable balance; scene editing of tint and roughness
    • Dependencies/assumptions: Multi-view footage; thin glass; tone mapping may necessitate color management; GPU resources
  • Automotive visualization and digital showrooms
    • Sectors: automotive, retail, marketing
    • What it enables: Realistic real-time car scans with readable interiors through windows; configurable tint/roughness; sales configurators that keep believable reflections without hiding interiors
    • Tools/products/workflows: Web 3D viewer with "reflection strength" and "glass tint" sliders; dealership capture kits
    • Dependencies/assumptions: Static vehicle; multi-view capture; thin glass modeling
  • AR occlusion and realism near glass
    • Sectors: AR/VR, retail, navigation
    • What it enables: Consistent reflections and see-through behavior for AR content placed near windows and glass displays; improved occlusion where transmission should remain visible
    • Tools/products/workflows: Integrations with AR SDKs (ARKit/ARCore/OpenXR); layer-wise blending using RT-Splatting's reflection/transmission decomposition
    • Dependencies/assumptions: Environment pre-scan; static glass; device-side or edge rendering
  • Robotics data generation in glass-heavy environments
    • Sectors: robotics, warehouse/logistics, service robots
    • What it enables: Realistic training assets for perception in spaces with glass partitions/cabinets; accurate supervision for background geometry visible only through glass without "floater" artifacts
    • Tools/products/workflows: Synthetic-to-real pipelines using RT-Splatting reconstructions; generation of reflection-only/transmission-only supervisory signals
    • Dependencies/assumptions: Static capture scene; thin glass; multi-view training set
  • Inspection and monitoring through enclosures
    • Sectors: industrial, pharma/biotech, energy
    • What it enables: Digital twins of equipment behind safety glass or acrylic (control panels, gauges) with legible transmission and truthful reflections
    • Tools/products/workflows: Periodic scans for change detection; reflection attenuation to enhance readability during review
    • Dependencies/assumptions: Thin transparent cover; multi-view access; static or quasi-static targets
  • E-commerce capture of packaged goods
    • Sectors: retail/e-commerce, CPG
    • What it enables: Real-time product viewers for blister packs and clear cases that separate the product (transmission) from protective reflections; adjustable glare for marketing assets
    • Tools/products/workflows: RT-Splatting-based capture kit; web viewer with "remove reflections" toggle; batch rendering of reflection-free thumbnails
    • Dependencies/assumptions: Thin packaging; controlled capture; GPU inference for batch pipelines
  • Forensic and security review enhancement
    • Sectors: security, insurance
    • What it enables: Reflection/transmission decomposition from multi-view evidence to reduce glare and reveal content behind glass for analysis; consistent layer export for audit trails
    • Tools/products/workflows: "Deglare" viewer using transmission component; configurable attenuation to preserve evidentiary integrity
    • Dependencies/assumptions: Multi-view recordings; thin glass; ethical/legal compliance; static or re-enactable scenes
  • Photogrammetry through glass for mapping and cultural heritage
    • Sectors: GIS/mapping, heritage digitization
    • What it enables: Robust reconstructions of exhibits and interiors seen only through display cases or windows; fewer manual masks and less cleanup
    • Tools/products/workflows: Replace hand-crafted transparent-object segmentation with mask regularization and gradient gating; publish to web viewers
    • Dependencies/assumptions: Multi-view capture; thin covers; static scenes; GPU training
  • Photography/post-processing of glare
    • Sectors: prosumer photography, media
    • What it enables: From a short handheld capture, export reflection-removed renders of subjects behind glass; retain optional reflection layer for stylization
    • Tools/products/workflows: Mobile/desktop app offering "reflection-free" and "reflection-only" rerenders from a brief sweep of images
    • Dependencies/assumptions: Requires multi-view (not single image); thin glass; device or cloud compute

Long-Term Applications

These applications are feasible with further research and engineering (e.g., modeling refraction/multi-bounce, handling dynamics, mobile deployment, or large-scale operations).

  • Thick refractive media and multi-bounce transport
    • Sectors: underwater inspection, optics, medical imaging, product design
    • What it could enable: Accurate rendering/reconstruction of solid glass objects, water tanks, lenses, and curved acrylic; support for refraction and internal scattering beyond thin-surface approximation
    • Dependencies/assumptions: Extend RT-Splatting to refractive paths and multi-bounce light; stable optimization with added ambiguity
  • Live, on-device AR capture and adaptation
    • Sectors: AR/VR, mobile
    • What it could enable: On-the-fly reconstruction around glass with reflection/transmission handling directly on phones/headsets
    • Dependencies/assumptions: Model compression, hardware acceleration (mobile NPUs/GPUs), fast incremental training/updates
  • Dynamic scenes with changing reflections and moving actors
    • Sectors: events, sports broadcasting, retail
    • What it could enable: Real-time updates as people or lighting move behind/around glass; temporally consistent reflection/transmission layers
    • Dependencies/assumptions: Deformable/dynamic GS or hybrid video radiance fields; temporal priors; streaming training
  • Autonomous driving: interior understanding through windows
    • Sectors: automotive autonomy, ADAS
    • What it could enable: Better scene priors for occupants/objects visible through car windows; improved hazard prediction and intent understanding
    • Dependencies/assumptions: Robustness to motion, weather, and polarization; fusion with LiDAR/radar; safety and privacy compliance
  • Single-image or sparse-view reflection removal via distillation
    • Sectors: consumer imaging, journalism, medical imaging through viewports
    • What it could enable: Train supervised or distilled models from RT-Splatting decompositions to perform deglare from minimal inputs
    • Dependencies/assumptions: Large curated datasets of decomposed pairs; generalization beyond the training capture settings
  • Standardized transparency-aware digital twin pipelines
    • Sectors: AEC/BIM, smart buildings, manufacturing
    • What it could enable: Native support for glass-aware capture/edit/render in CAD/BIM software and facility twins; material-aware editing at scale
    • Dependencies/assumptions: SDKs/APIs for RT-Splatting integration; asset standards for storing reflection/transmission layers and factorized opacities
  • Advanced robotic manipulation of transparent/reflective objects
    • Sectors: logistics, lab automation, household robotics
    • What it could enable: Perception stacks trained with transparency-aware renders and extended physics models for grasping glassware or glossy items
    • Dependencies/assumptions: Incorporate refraction and contact shading; tactile/vision fusion; domain randomization with transparency controls
  • Cloud streaming and edge rendering for large venues
    • Sectors: tourism, retail, entertainment
    • What it could enable: Interactive, glass-heavy venues streamed with accurate reflections/transmission to lightweight clients
    • Dependencies/assumptions: Server-side GPU pools; content delivery for 30–60 FPS; memory- and bandwidth-aware splat representations
  • Governance and ethics for โ€œsee-throughโ€ reconstructions
    • Sectors: policy, compliance, privacy
    • What it could enable: Guidelines for scanning private interiors visible through windows; watermarking and disclosure when reflection/transmission are manipulated
    • Dependencies/assumptions: Legal frameworks; provenance tooling; user-consent capture workflows

Notes on Feasibility, Assumptions, and Dependencies (cross-cutting)

  • Thin-surface approximation: Current method assumes semi-transparent thin surfaces with negligible refraction; not suitable for thick glass, water, or complex internal optics without extensions.
  • Data requirements: Multi-view calibrated images; scenes should be mostly static during capture. Background exclusively visible through glass is supported.
  • Compute: Training reported ~0.9h on an RTX 4090; real-time rendering ~33 FPS on desktop-class GPUs. Mobile/edge requires optimization.
  • Stability aids: Specular-aware gradient gating reduces floaters; SAM2 masks used only as regularization (not hard segmentation).
  • Integration: Implemented in PyTorch atop 2DGS; deferred shading pipeline; exportable reflection/transmission layers and material controls (roughness, tint, transmissivity).
  • Photometric considerations: Nonlinear camera pipelines (tone-mapping) can affect physically based blends; the learned attenuation term helps match perceptual suppression of transmission under strong highlights.
  • Legal/ethical: Applications that "see through" glass (e.g., interiors) must follow privacy laws and consent protocols.

Glossary

  • 2D Gaussian Splatting (2DGS): A surface-aligned scene representation using 2D Gaussian "surfels" for accurate geometry and real-time rendering. "2DGS models the scene as a set of 2D Gaussian surfels embedded in 3D space."
  • 3D Gaussian Splatting (3DGS): A real-time radiance field method that represents scenes with 3D Gaussian primitives and renders them via rasterization. "3D Gaussian Splatting (3DGS) [18] has revolutionized the field of novel view synthesis"
  • alpha blending: A compositing technique that accumulates colors along a ray using per-primitive opacities in front-to-back order. "The final color C for a pixel is then computed by alpha blending the Gaussians in front-to-back order"
  • anisotropic 3D Gaussian: A Gaussian with a full covariance (direction-dependent spread) used to model oriented, elongated primitives. "a collection of anisotropic 3D Gaussian primitives"
  • attenuation factor: A learned scalar that modulates transmitted/subsurface light based on reflection strength to match perceptual suppression. "output an attenuation factor β ∈ [0,1] that directly modulates the subsurface-transport component."
  • backpropagation: The gradient-based procedure for training parameters by propagating losses through the rendering pipeline. "During backpropagation, gradients induced by these residuals can be erroneously routed into the transmission branch"
  • binary cross-entropy (BCE): A loss function for supervising binary predictions such as masks or opacities. "We then supervise this opacity map with a binary cross-entropy (BCE) loss"
  • cone tracing: A rendering approximation that traces cones instead of rays to aggregate reflected features over a region. "NeRF-Casting [36] performs cone tracing along reflection paths"
  • deferred shading: A two-pass rendering pipeline that first writes surface attributes to G-buffers and then shades per pixel. "Deferred shading is a two-pass rendering technique that decouples geometry processing from lighting and material computations."
  • effective opacity: The product of geometric occupancy and optical opacity used for volumetric compositing. "defines the effective opacity used for volumetric compositing"
  • environment map: An image-based representation of distant illumination used for efficient specular shading. "shades with a learnable environment map"
  • first-surface extraction: Identifying the first surface hit along a ray to aggregate correct per-pixel attributes. "our factorization naturally yields a probabilistic formulation for first-surface extraction"
  • floaters: Spurious, behind-surface artifacts introduced during optimization that appear as floating geometry. "hallucinates 'floaters' behind the surface."
  • Fresnel equations: Physical laws describing angle-dependent reflection/transmission at interfaces; used here as a reference for blending. "A purely physics-based blend using Fresnel equations is often broken in practice"
  • G-buffer: Per-pixel buffers storing geometry/material attributes (e.g., normals, albedo) for deferred shading. "collectively called G-buffers."
  • geometric occupancy: The learned probability that a ray interacts with a Gaussian as a surface element. "The geometric occupancy o ∈ [0,1] encodes the probability that a ray interacts with the substance of the Gaussian."
  • LPIPS: A learned perceptual image similarity metric used for evaluating reconstruction quality. "We report PSNR, SSIM [39], and LPIPS [48]"
  • normal consistency loss: A regularizer that aligns rendered normals with depth-derived gradients to stabilize geometry. "we minimize the normal consistency loss L_n to enforce geometric alignment"
  • occupancy-opacity factorization: Splitting a Gaussian's role into surface presence (occupancy) and true light attenuation (optical opacity). "Our occupancy-opacity factorization introduces a specific ambiguity"
  • optical opacity: The conditional probability that light is absorbed or scattered once a surface interaction occurs. "The optical opacity a ∈ [0, 1] then specifies the conditional probability that the ray is absorbed or scattered once such an interaction occurs."
  • rasterization: Projecting and accumulating Gaussian primitives efficiently onto the image plane during rendering. "via rasterization."
  • rendering equation: The integral equation governing light transport used in physically based shading. "explicitly evaluates the rendering equation"
  • Spherical Harmonics (SH): A basis for compactly representing view-dependent color on Gaussian primitives. "color represented by Spherical Harmonics (SH)."
  • Specular-Aware Gradient Gating: A training mechanism that down-weights transmission gradients in highly specular regions to prevent floaters. "we introduce Specular-Aware Gradient Gating"
  • stop-gradient: An operator that blocks gradient flow through a tensor during backpropagation to control optimization paths. "Let sg(·) denote the stop-gradient operator."
  • subsurface transport: The combined transmitted and internally scattered light component within the material. "into a subsurface-transport component"
  • surfel: A surface element (disk-like primitive) representing local surface geometry in 2DGS. "2D Gaussian surfels embedded in 3D space."
  • tone-mapping: Nonlinear camera or display mapping that affects physically based blending assumptions. "tone-mapping and other nonlinear camera responses"
  • transmissivity: A material property controlling the proportion of light transmitted through a surface. "T dictates the material's transmissivity"
  • volumetric rendering: Accumulating radiance and opacity along rays through a volume to form pixel colors. "like standard volumetric rendering"
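
Several glossary entries (alpha blending, effective opacity, geometric occupancy, optical opacity, first-surface extraction) fit together in one compositing loop. The following is a minimal single-ray sketch, not the paper's renderer. It assumes the Gaussians are already sorted front to back, ignores each Gaussian's per-pixel falloff, and takes the first-surface probability to be the occupancy times the probability that no earlier Gaussian was hit. The function name `composite_ray` and this particular first-surface form are illustrative assumptions.

```python
def composite_ray(colors, occupancy, optical_opacity):
    """Front-to-back alpha blending along one ray.

    colors:          list of (r, g, b) tuples, nearest Gaussian first
    occupancy:       geometric occupancy o in [0, 1] per Gaussian
    optical_opacity: optical opacity a in [0, 1] per Gaussian
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0   # fraction of light not yet absorbed or scattered
    not_hit_yet = 1.0     # probability that no surface has been hit so far
    first_surface = []    # assumed first-surface probability per Gaussian

    for color, o, a in zip(colors, occupancy, optical_opacity):
        # Effective opacity: product of occupancy and optical opacity.
        alpha_eff = o * a
        weight = transmittance * alpha_eff
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha_eff

        # Surface interpretation uses occupancy only (assumption).
        first_surface.append(not_hit_yet * o)
        not_hit_yet *= 1.0 - o

    return pixel, transmittance, first_surface


# Hypothetical ray: a grey glass pane in front of an opaque red wall.
pixel, remaining, first_surface = composite_ray(
    colors=[(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)],
    occupancy=[1.0, 1.0],
    optical_opacity=[0.2, 1.0],
)
print(pixel, remaining, first_surface)
```

In this example the pane is fully occupied but only weakly opaque, so it receives all of the first-surface probability while the wall behind it still contributes most of the volumetric color (roughly 0.9 red). This is the separation between surface presence and light attenuation that the factorization describes.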
