SurgiATM: Surgical Smoke Removal Technique
- SurgiATM is a physics-guided module that combines atmospheric scattering models with deep neural architectures for effective surgical smoke removal.
- It leverages the dark channel prior and a mixture-of-experts method to improve accuracy, stability, and generalizability without adding extra trainable parameters.
- The approach uses only two hyperparameters and minimal computational overhead, enabling plug-and-play integration for real-time intraoperative applications.
The Surgical Atmospheric Model (SurgiATM) is a physics-guided, parameter-efficient module for surgical smoke removal that bridges the strengths of physics-based scattering models and deep neural architectures. Designed as a lightweight, plug-and-play wrapper, SurgiATM enhances the accuracy, stability, and generalizability of existing deep learning–based desmoking methods without introducing extra trainable weights or significant computational overhead. SurgiATM leverages the dark channel prior from classic dehazing, reframes model outputs via a statistically principled mixture-of-experts approach, and operates with only two hyperparameters, thereby facilitating real-time intraoperative deployment and broad compatibility across network architectures and surgical procedures (Sheng et al., 7 Nov 2025).
1. Theoretical Foundation: Physics-Based Atmospheric Scattering
SurgiATM is grounded in the classic Atmospheric Scattering Model (ASM), which characterizes a smoky or hazy image as a convex combination of the direct scene radiance, attenuated by the medium, and an additive atmospheric veil. The image $I$ at pixel $x$ and color channel $c$ is modeled as:
$$I_c(x) = J_c(x)\,t(x) + A\,\bigl(1 - t(x)\bigr),$$
where $J_c(x)$ is the true scene radiance, $t(x) = e^{-\beta d(x)}$ is the transmission determined by the scattering coefficient $\beta$ and scene depth $d(x)$, and $A$ is the smoke veil intensity. The normalized radiance is $\rho_c(x) = J_c(x)/A$, such that $\rho_c(x) \in [0, 1]$.
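As an illustration of the scattering model, the following minimal sketch renders a smoky image from a clean one, assuming the standard ASM form $I = J\,t + A\,(1 - t)$ with $t = e^{-\beta d}$. The function name `apply_smoke`, the default values of `beta` and `A`, and the toy depth map are hypothetical choices for this example and are not taken from the source.

```python
import numpy as np

def apply_smoke(J, depth, beta=1.0, A=0.9):
    """Render a smoky image from clean radiance J (H, W, 3) with values in [0, 1].

    Assumes the standard atmospheric scattering model; beta and A are
    illustrative values, not parameters reported by the source.
    """
    t = np.exp(-beta * depth)        # transmission in (0, 1], shape (H, W)
    t3 = t[..., None]                # broadcast over the color channels
    I = J * t3 + A * (1.0 - t3)      # convex combination of radiance and veil
    return I, t

rng = np.random.default_rng(0)
J = rng.uniform(0.0, 0.8, size=(4, 4, 3))          # toy clean frame
depth = np.linspace(0.0, 2.0, 16).reshape(4, 4)    # toy depth map
I, t = apply_smoke(J, depth)
```

Where the depth is zero the transmission is 1 and the pixel is unchanged; as depth grows, the pixel is pulled toward the veil value `A`.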
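To estimate $t(x)$, SurgiATM employs the Dark Channel Prior (DCP):
$$\hat{t}(x) = 1 - \min_{y \in \Omega(x)} \min_{c} \frac{I_c(y)}{A},$$
with $\Omega(x)$ the local window centered at $x$, and scene restoration via:
$$\hat{J}_c(x) = \frac{I_c(x) - A\,\bigl(1 - \hat{t}(x)\bigr)}{\hat{t}(x)}.$$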
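The DCP step can be sketched in a few lines of NumPy, assuming the classic formulation $\hat{t} = 1 - \min_{\Omega}\min_c I_c / A$ followed by inversion of the scattering model. The helper names, the edge padding, and the lower bound `t_min` (borrowed from classic dehazing practice to avoid division by very small values) are assumptions of this sketch rather than details confirmed by the source.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dark_channel(I, k=15):
    """Min over color channels, then min over a k x k window (k odd, edge-padded)."""
    assert k % 2 == 1, "this sketch assumes an odd window size"
    per_pixel_min = I.min(axis=2)
    padded = np.pad(per_pixel_min, k // 2, mode="edge")
    windows = sliding_window_view(padded, (k, k))   # shape (H, W, k, k)
    return windows.min(axis=(2, 3))

def dcp_restore(I, A=1.0, k=15, t_min=0.1):
    """Estimate transmission with the DCP and invert the scattering model."""
    t_hat = 1.0 - dark_channel(I, k) / A
    t_hat = np.clip(t_hat, t_min, 1.0)[..., None]   # t_min is an assumed safeguard
    J_hat = (I - A * (1.0 - t_hat)) / t_hat
    return np.clip(J_hat, 0.0, 1.0), t_hat[..., 0]

# Toy check: a scene whose red channel is zero everywhere satisfies the prior exactly.
rng = np.random.default_rng(0)
J = rng.uniform(0.0, 1.0, size=(6, 6, 3))
J[..., 0] = 0.0
I = J * 0.6 + 1.0 * (1.0 - 0.6)     # uniform transmission 0.6, A = 1
J_hat, t_hat = dcp_restore(I, A=1.0, k=3)
```

On this toy input the prior holds exactly, so the transmission and the clean scene are both recovered. Real frames violate the prior in bright regions, which is one reason the DCP is not used on its own.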
2. Mixture-of-Experts: Statistical Bridge Between Physics and Deep Learning
Rather than relying exclusively on the DCP or a deep learning (DNN) prediction , SurgiATM treats both as "experts" in a mixture-of-experts (MoE) framework. Each predictor's error (denoted , ) is modeled as a zero-mean Laplacian, conditioned on . An optimal pixelwise gate is derived by minimizing the variance and squared bias of the mixture error:
with the closed-form optimal gate:
where , are mean and scale parameters for DCP and DNN, respectively.
Empirically, is strongly anti-correlated with across architectures and datasets, motivating the practical approximation:
This removes the need for explicit veil estimation and simplifies computation.
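To make the mixture-of-experts idea concrete, the sketch below uses a deliberately simplified setting: two independent experts with zero-mean Laplacian errors and a single global gate. In that setting the variance-minimizing weight is the familiar inverse-variance rule, and a Laplacian with scale $b$ has variance $2b^2$. This is not the pixelwise gate derived in the source, which also accounts for bias; the scale values `b_dcp` and `b_dnn` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
truth = rng.uniform(0.0, 1.0, size=100_000)

# Hypothetical Laplacian error scales for the two experts.
b_dcp, b_dnn = 0.10, 0.05
pred_dcp = truth + rng.laplace(0.0, b_dcp, size=truth.shape)
pred_dnn = truth + rng.laplace(0.0, b_dnn, size=truth.shape)

# Inverse-variance weight on the DCP expert (the factor 2 in 2*b^2 cancels).
g = b_dnn**2 / (b_dcp**2 + b_dnn**2)
mixed = g * pred_dcp + (1.0 - g) * pred_dnn

def mse(pred):
    """Mean squared error against the known ground truth."""
    return float(np.mean((pred - truth) ** 2))

print(f"gate={g:.2f}  DCP={mse(pred_dcp):.4f}  DNN={mse(pred_dnn):.4f}  mix={mse(mixed):.4f}")
```

Even though the DCP expert is the noisier of the two here, giving it a small weight lowers the error below that of the DNN expert alone, which is the basic motivation for gating rather than discarding the physics-based estimate.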
3. Restoration Formula and Plug-in Integration
Substituting the approximated gate and leveraging yields the SurgiATM restoration formula:
which can be re-expressed as:
Defining the "denormalized dark channel" , computable by a single min-filter operation, the update becomes:
In practice, given a pretrained or trainable desmoking network that estimates (typically via sigmoid activation), SurgiATM is appended as a post-processing arithmetic layer. The loss function becomes:
No extra trainable weights or modifications to network topology are introduced. The only added work is the min-filter computation and a pointwise multiply-subtract operation, so the latency overhead is negligible even in real-time clinical settings.
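A minimal sketch of such a post-processing layer is given below. It assumes the update takes the form $\hat{J} = I - D \odot (1 - \hat{\rho})$, where $D$ is the denormalized dark channel of the input and $\hat{\rho}$ is the network's normalized output. This form follows from the scattering model with $J = A\rho$ and $D = A(1 - t)$ and matches the "min filter plus multiply-subtract" description, but it is an assumption of this example and not necessarily the source's exact formula. The function names are hypothetical, and the true $\rho$ stands in for a network prediction.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def denormalized_dark_channel(I, k=15):
    """Single min filter: min over channels, then over a k x k window (k odd)."""
    per_pixel_min = I.min(axis=2)
    padded = np.pad(per_pixel_min, k // 2, mode="edge")
    return sliding_window_view(padded, (k, k)).min(axis=(2, 3))

def surgiatm_like_wrapper(I, rho_hat, k=15):
    """Assumed update J_hat = I - D * (1 - rho_hat); adds no trainable weights."""
    D = denormalized_dark_channel(I, k)[..., None]   # broadcast over channels
    return I - D * (1.0 - rho_hat)

# Toy check with A = 1 and uniform transmission 0.6, so D should equal 0.4.
rng = np.random.default_rng(0)
J = rng.uniform(0.0, 1.0, size=(6, 6, 3))
J[..., 0] = 0.0                      # zero red channel makes the prior exact
I = J * 0.6 + 0.4
rho_true = J                         # stand-in for a network output (A = 1)
J_hat = surgiatm_like_wrapper(I, rho_true, k=3)
```

In a training setup the same arithmetic would be expressed with the framework's differentiable tensor operations so that gradients flow back to the network through $\hat{\rho}$.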
4. Hyperparameters: Window Size and Smoothing
SurgiATM introduces two hyperparameters:
- Window size: Governs the local patch size used by the min filter when computing the dark channel and the quantities derived from it. The default is $15 \times 15$ (as in He et al.'s DCP). Plausibly, larger windows may better smooth over noise or small variations for some architectures, while smaller windows can enhance localization of smoke for others.
- Smoothing factor : Prevents zero gradients through redefinition:
As , and the restoration formula reduces to a residual network. Empirical ablation over on VASST-desmoke indicates that yields the lowest RMSE.
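The role of the smoothing factor can be illustrated with a small sketch. Assuming an update of the form $\hat{J} = I - \tilde{D} \odot (1 - \hat{\rho})$, the gradient of $\hat{J}$ with respect to $\hat{\rho}$ is $\tilde{D}$ itself, so pixels with a zero dark channel would pass no gradient. The blend below is one hypothetical smoothing that avoids this and tends to 1 at its upper limit; the source's exact redefinition is not reproduced here.

```python
import numpy as np

def smooth_dark_channel(D, eps):
    """Hypothetical smoothing: blend D toward 1 so it is never exactly 0 for eps > 0."""
    return eps + (1.0 - eps) * D

D = np.array([0.0, 0.2, 0.8])          # toy dark-channel values
for eps in (0.0, 0.1, 1.0):
    D_tilde = smooth_dark_channel(D, eps)
    # Under the assumed update, d(J_hat)/d(rho_hat) equals D_tilde pixelwise.
    print(f"eps={eps}: D_tilde={D_tilde}")
```

With `eps = 0` the smoke-free pixel keeps a zero gradient, any positive `eps` lifts it, and at `eps = 1` the update becomes $\hat{J} = I - (1 - \hat{\rho})$, a plain residual connection.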
5. Experimental Validation and Impact
SurgiATM's efficacy is demonstrated across three public surgical datasets:
- Cholec80: Synthetic smoke blended with clean frames from laparoscopic cholecystectomy (500 frames per video).
- VASST-desmoke: Real smoke, covering cholecystectomy, partial nephrectomy, and diaphragm dissection.
- Hamlyn: External, unpaired test set with diverse lighting and procedure conditions.
Ten baseline desmoking methods were evaluated, encompassing CNNs, U-Nets, cGANs, CycleGANs, GANs with perceptual/structural losses (e.g., SSIM-PAN), and Transformer-based models. Training used five-fold cross-validation with the Adam optimizer for 50 epochs per split.
Quantitative results on VASST-desmoke show systematic improvements when applying SurgiATM as a postprocessing wrapper:
| Method | CIEDE2000↓ | PSNR↑ | RMSE↓ | SSIM↑ |
|---|---|---|---|---|
| AODNet | 10.35→6.33 | 17.38→21.99 | 0.147→0.090 | 0.693→0.789 |
| GCANet | 7.89→5.37 | 21.09→23.91 | 0.095→0.073 | 0.729→0.806 |
| De-smokeGCN | 9.58→7.35 | 19.78→22.13 | 0.115→0.094 | 0.550→0.717 |
| RSTN | 5.39→4.98 | 23.26→24.31 | 0.074→0.071 | 0.750→0.824 |
Performance improvements also hold under synthetic-to-real (Cholec80→VASST) transfer: CIEDE2000 decreases and PSNR increases for all architectures, indicating enhanced generalizability.
Non-reference dehazing metrics (FADE) improve on Hamlyn, while BRISQUE/NIQE show mixed trends, plausibly influenced by domain gap with natural images.
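For reference, two of the full-reference metrics reported above can be computed as in the sketch below, assuming images scaled to $[0, 1]$ so that the peak value is 1. The function names are illustrative; CIEDE2000 and SSIM are more involved and are omitted here.

```python
import numpy as np

def rmse(pred, target):
    """Root-mean-square error over all pixels and channels."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = rmse(pred, target)
    return float("inf") if err == 0.0 else float(20.0 * np.log10(peak / err))

target = np.zeros((8, 8, 3))
pred = np.full((8, 8, 3), 0.1)       # constant error of 0.1 per pixel
print(rmse(pred, target), psnr(pred, target))
```

For a single image PSNR is a fixed function of RMSE, but dataset-level averages of the two need not satisfy that relation exactly, since PSNR is averaged on a logarithmic scale.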
6. Qualitative Observations and Operational Considerations
SurgiATM yields tangible improvements in stability and perceptual quality: grid artifacts in U-Net and Transformer models are suppressed, color casts in CNN architectures are corrected, and GAN-based outputs exhibit less residual smoke. The experiments indicate that these gains generalize across synthetic and real smoke domains, procedures, and institutions.
The min-filter and elementwise operations required by SurgiATM introduce negligible GPU latency, thus supporting deployment in time-sensitive, intraoperative applications. No architectural modifications to existing backbone networks are necessary.
7. Limitations and Future Directions
SurgiATM's gating mechanism constitutes a global approximation. Method-specific or learned gating functions may further improve restoration accuracy, provided they avoid significant increases in parameter count or computational demand. The min filter used to estimate the dark channel is inherently coarse; incorporating finer spatial priors (e.g., via guided filtering or learned spatial attention) could yield more accurate smoke localization and tighter coupling with DNN reflectance predictions.
Clinical evaluation in live surgical workflows and integration of temporal consistency for video remain open research areas.
SurgiATM establishes a generalizable, computationally lightweight approach for surgical smoke removal, systematically reducing restoration error (e.g., PSNR gains of 4–6 dB, RMSE reductions of 30–40%, and SSIM gains of 0.05–0.10) while remaining compatible with diverse deep learning desmoking paradigms and surgical imaging conditions.