SmokeGS-R: Physics-Guided Pseudo-Clean 3DGS for Real-World Multi-View Smoke Restoration

Published 7 Apr 2026 in cs.CV | (2604.05301v1)

Abstract: Real-world smoke simultaneously attenuates scene radiance, adds airlight, and destabilizes multi-view appearance consistency, making robust 3D reconstruction particularly difficult. We present SmokeGS-R, a practical pipeline developed for the NTIRE 2026 3D Restoration and Reconstruction Track 2 challenge. The key idea is to decouple geometry recovery from appearance correction: we generate physics-guided pseudo-clean supervision with a refined dark channel prior and guided filtering, train a sharp clean-only 3D Gaussian Splatting source model, and then harmonize its renderings with a donor ensemble using geometric-mean reference aggregation, LAB-space Reinhard transfer, and light Gaussian smoothing. On the official challenge testing leaderboard, the final submission achieved PSNR = 15.217 and SSIM = 0.666. After the public release of RealX3D, we re-evaluated the same frozen result on the seven released challenge scenes without retraining and obtained PSNR = 15.209, SSIM = 0.644, and LPIPS = 0.551, outperforming the strongest official baseline average on the same scenes by +3.68 dB PSNR. These results suggest that a geometry-first reconstruction strategy combined with stable post-render appearance harmonization is an effective recipe for real-world multi-view smoke restoration. The code is available at https://github.com/windrise/3drr_Track2_SmokeGS-R.


Authors (2)

Summary

The paper presents a physics-guided pseudo-clean 3DGS approach that decouples geometry recovery from appearance harmonization for superior smoke removal.
It employs a refined Dark Channel Prior and guided filtering to generate pseudo-clean targets that ensure accurate geometry with sharp detail.
Empirical results on the RealX3D benchmark show a +3.68 dB PSNR improvement over the strongest official baseline, along with enhanced structural fidelity and reduced smoke-induced artifacts.

Detailed Technical Essay on "SmokeGS-R: Physics-Guided Pseudo-Clean 3DGS for Real-World Multi-View Smoke Restoration" (2604.05301)

Problem Formulation and Motivation

Smoke fundamentally degrades multi-view scene capture through attenuation of radiance, additive airlight, and spatially inconsistent, depth-varying view-dependent perturbations. Existing neural rendering pipelines—including those equipped with explicit scattering models—exhibit significant limitations on real-world smoke scenes, as evidenced by the RealX3D benchmark. The primary failure mode arises from an entanglement of geometry inference and medium appearance, where methods either overfit to smoke appearance at the expense of geometric fidelity or, conversely, recover structure but inherit strong color or contrast biases.

The NTIRE 2026 3D Restoration and Reconstruction Track 2 challenge explicitly targets this task: removing smoke from multi-view observations for high-fidelity, sharp, and geometrically robust novel view synthesis under physically plausible conditions.

Methodological Framework

The key insight of SmokeGS-R is a strict decoupling of geometry recovery from appearance harmonization, leveraging physics-based pseudo-clean supervision, a geometry-first 3DGS source branch, and controlled LAB-space harmonization from a donor ensemble.

Figure 1: Overview of SmokeGS-R with DCP-based pseudo-clean supervision, a 3DGS geometry branch, reference aggregation, and LAB harmonization finalized by Gaussian smoothing.

Physics-Guided Pseudo-Clean Generation

Pseudo-clean targets are synthesized from smoky inputs via a refined Dark Channel Prior (DCP) followed by guided filtering. DCP estimates the global airlight and an initial transmission map from local minima over the RGB channels. Guided filtering then refines the transmission map, smoothing it spatially while preserving edges. Pseudo-clean images are reconstructed by inverting the atmospheric scattering model and applying gamma enhancement for improved contrast. Importantly, these images serve only as robust geometric anchors, not as absolute ground truth.
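The pseudo-clean step can be illustrated with a minimal NumPy sketch of the classic dark channel prior with guided-filter refinement, based on the scattering model I = J * t + A * (1 - t). This is not the authors' refined variant: the function names, the patch radius, omega, the transmission floor, the guided-filter settings, and the gamma value are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(x, r):
    """Return all (2r+1)x(2r+1) windows of a 2-D array, using edge padding."""
    padded = np.pad(x, r, mode="edge")
    return sliding_window_view(padded, (2 * r + 1, 2 * r + 1))


def min_filter(x, r):
    return _windows(x, r).min(axis=(-2, -1))


def box_filter(x, r):
    return _windows(x, r).mean(axis=(-2, -1))


def estimate_airlight(img, dark, top=0.001):
    """Average the input colors at the brightest dark-channel pixels."""
    n = max(1, int(dark.size * top))
    idx = np.argsort(dark.ravel())[-n:]
    return img.reshape(-1, 3)[idx].mean(axis=0)


def guided_filter(guide, src, r, eps):
    """Edge-preserving smoothing of src, steered by a grayscale guide."""
    mean_g, mean_s = box_filter(guide, r), box_filter(src, r)
    cov_gs = box_filter(guide * src, r) - mean_g * mean_s
    var_g = box_filter(guide * guide, r) - mean_g ** 2
    a = cov_gs / (var_g + eps)
    b = mean_s - a * mean_g
    return box_filter(a, r) * guide + box_filter(b, r)


def pseudo_clean(img, patch_r=7, omega=0.95, t_min=0.1,
                 gf_r=20, gf_eps=1e-3, gamma=0.9):
    """img: float RGB in [0, 1], shape (H, W, 3). All defaults are illustrative."""
    dark = min_filter(img.min(axis=2), patch_r)
    A = np.maximum(estimate_airlight(img, dark), 1e-6)
    # Initial transmission from the dark channel of the airlight-normalized image.
    t_raw = 1.0 - omega * min_filter((img / A).min(axis=2), patch_r)
    # Refine with the guided filter, using the gray input as the guide.
    t = guided_filter(img.mean(axis=2), t_raw, gf_r, gf_eps)
    t = np.clip(t, t_min, 1.0)[..., None]
    # Invert I = J * t + A * (1 - t) for the scene radiance J.
    J = np.clip((img - A) / t + A, 0.0, 1.0)
    return J ** gamma, t[..., 0], A
```

The windowed filters here favor brevity over speed; a practical implementation would likely use an optimized minimum filter and box filter.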

Geometry-First 3DGS Source Model

A clean-only 3D Gaussian Splatting (3DGS) source model is trained exclusively on the refined pseudo-clean images, with photometric supervision combining L1 and SSIM losses and purposefully excluding heavy appearance-oriented regularization or adversarial priors. This minimizes artifact propagation from smoky rendering and ensures accurate geometry and structure recovery.
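As a rough illustration of a combined L1 and SSIM photometric objective, the sketch below uses the weighting from the original 3DGS formulation, (1 - lam) * L1 + lam * (1 - SSIM) with lam = 0.2. Whether SmokeGS-R keeps that weight is an assumption, and the SSIM here uses uniform windows rather than the usual Gaussian windows to stay short. All function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_mean(x, win):
    """Mean over every win x win window (valid region only) of a 2-D array."""
    return sliding_window_view(x, (win, win)).mean(axis=(-2, -1))


def ssim(x, y, win=7, c1=0.01 ** 2, c2=0.03 ** 2):
    """Mean SSIM of two grayscale images in [0, 1], using uniform windows."""
    mx, my = local_mean(x, win), local_mean(y, win)
    vx = local_mean(x * x, win) - mx ** 2
    vy = local_mean(y * y, win) - my ** 2
    cxy = local_mean(x * y, win) - mx * my
    num = (2 * mx * my + c1) * (2 * cxy + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float((num / den).mean())


def photometric_loss(render, target, lam=0.2):
    """(1 - lam) * L1 + lam * (1 - SSIM), with SSIM averaged over RGB channels."""
    l1 = np.abs(render - target).mean()
    s = np.mean([ssim(render[..., c], target[..., c]) for c in range(3)])
    return (1 - lam) * l1 + lam * (1 - s)
```

In training, `target` would be a pseudo-clean image and `render` the 3DGS rendering from the same camera; a differentiable framework would replace NumPy there.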

Donor Ensemble for Appearance Priors

A parallel set of four donor branches is trained, each specializing in different priors (ensemble-spatial, dual-depth, VGGT-based, etc.), constructed to act solely as appearance statistics pools—never influencing geometry. Their diversity captures a broad range of appearance statistics under varying smoke densities and chromatic biases, supplying robust appearance references without introducing geometric artifacts.

LAB-Space Multi-Reference Harmonization

At inference, the geometry-first source is rendered and then harmonized against a reference built from the donor outputs. The reference is their per-pixel geometric mean, computed in log space for robust aggregation. LAB-space Reinhard transfer is then performed: for each channel (L, a, b), the mean and standard deviation of the source are matched to those of the ensemble reference, shifting the color distribution to remove smoke-induced bias while preserving high-frequency structure. Finally, a light Gaussian blur suppresses residual splatting artifacts.
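The three harmonization stages can be sketched in NumPy as follows. This is an illustrative reading of the description, not the released code: the function names, the epsilon guards, the use of global (whole-image) statistics, and the blur sigma are assumptions, and the sRGB/LAB conversion uses the standard D65 matrices.

```python
import numpy as np

# sRGB (D65) to XYZ matrix; its row sums give the XYZ of sRGB white.
_M = np.array([[0.4124564, 0.3575761, 0.1804375],
               [0.2126729, 0.7151522, 0.0721750],
               [0.0193339, 0.1191920, 0.9503041]])
_WHITE = _M.sum(axis=1)
_D = 6.0 / 29.0


def rgb_to_lab(rgb):
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    xyz = lin @ _M.T / _WHITE
    f = np.where(xyz > _D ** 3, np.cbrt(xyz), xyz / (3 * _D ** 2) + 4.0 / 29.0)
    return np.stack([116 * f[..., 1] - 16,
                     500 * (f[..., 0] - f[..., 1]),
                     200 * (f[..., 1] - f[..., 2])], axis=-1)


def lab_to_rgb(lab):
    fy = (lab[..., 0] + 16) / 116
    f = np.stack([fy + lab[..., 1] / 500, fy, fy - lab[..., 2] / 200], axis=-1)
    xyz = np.where(f > _D, f ** 3, 3 * _D ** 2 * (f - 4.0 / 29.0)) * _WHITE
    lin = np.clip(xyz @ np.linalg.inv(_M).T, 0.0, 1.0)
    return np.where(lin <= 0.0031308, 12.92 * lin, 1.055 * lin ** (1 / 2.4) - 0.055)


def geometric_mean(donors, eps=1e-6):
    """Per-pixel geometric mean of donor renders, computed in log space."""
    logs = np.log(np.clip(np.stack(donors), eps, 1.0))
    return np.exp(logs.mean(axis=0))


def reinhard_lab(source, reference, eps=1e-6):
    """Match per-channel LAB mean and std of source to those of reference."""
    src, ref = rgb_to_lab(source), rgb_to_lab(reference)
    mu_s, sd_s = src.mean(axis=(0, 1)), src.std(axis=(0, 1))
    mu_r, sd_r = ref.mean(axis=(0, 1)), ref.std(axis=(0, 1))
    return lab_to_rgb((src - mu_s) * (sd_r / (sd_s + eps)) + mu_r)


def gaussian_blur(img, sigma=0.5):
    """Separable Gaussian blur with edge padding on an (H, W, 3) image."""
    r = max(1, int(np.ceil(3 * sigma)))
    k = np.exp(-0.5 * (np.arange(-r, r + 1) / sigma) ** 2)
    k /= k.sum()
    h, w = img.shape[:2]
    pad = np.pad(img, ((r, r), (r, r), (0, 0)), mode="edge")
    tmp = sum(wt * pad[i:i + h] for i, wt in enumerate(k))       # vertical pass
    return sum(wt * tmp[:, i:i + w] for i, wt in enumerate(k))   # horizontal pass


def harmonize(source, donors, sigma=0.5):
    """source and donors: float RGB renders in [0, 1] of the same view."""
    reference = geometric_mean(donors)
    return np.clip(gaussian_blur(reinhard_lab(source, reference), sigma), 0.0, 1.0)
```

Because only first- and second-order LAB statistics are transferred, the source render's spatial structure is untouched until the final blur, which is consistent with the geometry-first intent described above.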

Empirical Results and Analysis

On the RealX3D smoke benchmark, SmokeGS-R achieves a PSNR of 15.209, SSIM of 0.644, and LPIPS of 0.551 on the seven publicly released challenge scenes, outperforming the strongest official baseline average (plain 3DGS at 11.530 PSNR) by +3.68 dB PSNR. The method also consistently surpasses physically motivated neural rendering pipelines such as SeaThru-NeRF, WaterSplatting, and I2-NeRF, all of which show strongly scene-dependent and often suboptimal performance on real, as opposed to synthetic, smoke.
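To put the reported gap in perspective, the short calculation below recomputes the PSNR difference from the two quoted averages and converts it into an equivalent mean-squared-error ratio. The conversion is only indicative: it treats each average PSNR as if it came from a single MSE value, which is an assumption, since averaging PSNR over scenes is not the same as computing PSNR from an averaged MSE.

```python
import math


def psnr(mse, peak=1.0):
    """PSNR in dB for a given mean squared error and peak signal value."""
    return 10.0 * math.log10(peak ** 2 / mse)


smokegs_r, baseline_3dgs = 15.209, 11.530  # average PSNR values quoted above
gain_db = smokegs_r - baseline_3dgs
# A gain of g dB corresponds to dividing the MSE by 10 ** (g / 10).
mse_ratio = 10 ** (gain_db / 10)
print(f"gain = {gain_db:.3f} dB, baseline MSE / SmokeGS-R MSE = {mse_ratio:.2f}")
```

Under that simplification, the gap corresponds to the baseline having a bit more than twice the squared error of SmokeGS-R.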

Figure 2: Scene-wise PSNR disaggregated across released challenge scenes, showing robust improvement by SmokeGS-R over baselines.

Qualitative evaluations (Figure 3) further support the numerical outcomes. When compared with the strongest official baselines, SmokeGS-R maintains sharper object contours, more faithful backgrounds, and substantially greater removal of residual veiling and airlight, especially in scenes with high smoke density. Failures in the baselines manifest as either oversmoothed detail or persistent color cast and haze, which SmokeGS-R mitigates through its separate harmonization step.

Figure 3: Qualitative comparison of rendered test views among reference, 3DGS, SeaThru-NeRF, and SmokeGS-R; SmokeGS-R demonstrates superior veil removal and geometry preservation.

Design Rationale and Ablation Insights

Experience during the challenge showed that further entangling physics-based smoke modeling (e.g., via explicit internal radiance field decomposition) decreased pipeline stability, leading to distorted geometry, hallucinated regions, or unreliable appearance transfer. Dividing the pipeline into physics-guided geometry supervision and decoupled appearance correction allowed the frozen result to be re-evaluated on the public test data without retraining, with closely matching scores. The spatially varying, non-uniform nature of real smoke highlighted by RealX3D makes such decoupling particularly advantageous over monolithic, learn-everything approaches.

Theoretical and Practical Implications

The work suggests that, contrary to many recent trends, physically detailed internal models are not uniformly beneficial in real-world participating media reconstruction. Instead, careful modularization, with domain priors guiding geometry and appearance correction conducted post hoc, can deliver stronger, more stable results. The frozen result's consistent scores on the official leaderboard and on the public RealX3D release suggest that a geometry-first paradigm with controlled test-time harmonization is practical for real-world deployment, where retraining and heavy model engineering are infeasible.

Future Research Directions

Future extensions may explore:

Adaptive harmonization: Scene-aware or patch-wise harmonization instead of global LAB transfer, potentially using local statistics or learned blending;
Generalization to complex participating media: Application and re-benchmarking on underwater, fog, and other multi-modal settings;
End-to-end differentiable harmonization: Incorporation of differentiable statistics or cross-modal consistency losses into harmonization for further performance gains;
Scalability and real-time deployment: Acceleration, quantization, and optimization for real-world robotic and immersive capture applications.

Conclusion

SmokeGS-R indicates that geometry-first, physics-guided supervision coupled with modular, stable appearance harmonization constitutes an effective and reproducible pipeline for real-world multi-view smoke restoration. Its clear margin over the best official RealX3D baselines and its consistent results across the leaderboard and public evaluation protocols underline the value of strategic decoupling over monolithic modeling in challenging, real-world degradation settings. The approach offers a reference point for future advances, especially concerning modularity, scene-adaptivity, and practical deployability.
