
SurgiATM: Surgical Smoke Removal Technique

Updated 15 November 2025
  • SurgiATM is a physics-guided module that combines atmospheric scattering models with deep neural architectures for effective surgical smoke removal.
  • It leverages the dark channel prior and a mixture-of-experts method to improve accuracy, stability, and generalizability without adding extra trainable parameters.
  • The approach uses only two hyperparameters and minimal computational overhead, enabling plug-and-play integration for real-time intraoperative applications.

The Surgical Atmospheric Model (SurgiATM) is a physics-guided, parameter-efficient module for surgical smoke removal that bridges the strengths of physics-based scattering models and deep neural architectures. Designed as a lightweight, plug-and-play wrapper, SurgiATM enhances the accuracy, stability, and generalizability of existing deep learning–based desmoking methods without introducing extra trainable weights or significant computational overhead. SurgiATM leverages the dark channel prior from classic dehazing, reframes model outputs via a statistically principled mixture-of-experts approach, and operates with only two hyperparameters, thereby facilitating real-time intraoperative deployment and broad compatibility across network architectures and surgical procedures (Sheng et al., 7 Nov 2025).

1. Theoretical Foundation: Physics-Based Atmospheric Scattering

SurgiATM is grounded in the classic Atmospheric Scattering Model (ASM), which analytically characterizes the formation of smoky or hazy images as a convex combination of direct scene radiance attenuated by the medium and an additive atmospheric veil. The image $I(\mathbf{x},c)$ at pixel $\mathbf{x}$ and color channel $c$ is modeled as:

$$I(\mathbf{x},c) = J(\mathbf{x},c) \cdot t(\mathbf{x}) + A(c) \cdot (1 - t(\mathbf{x}))$$

where $J(\mathbf{x},c)$ is the true scene radiance, $t(\mathbf{x}) = \exp(-\beta(\lambda) \cdot d(\mathbf{x}))$ is the transmission determined by the scattering coefficient $\beta(\lambda)$ and scene depth $d(\mathbf{x})$, and $A(c)$ is the smoke veil intensity. The normalized radiance $\rho(\mathbf{x},c)$ is such that $J(\mathbf{x},c) = A(c) \cdot \rho(\mathbf{x},c)$.
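The forward model above can be sketched in a few lines of NumPy. This is only an illustration: the function name `synthesize_smoke`, the scalar `beta` (wavelength dependence ignored), and the toy depth map are assumptions made for the example, not part of the source.

```python
import numpy as np

def synthesize_smoke(J, depth, beta, A):
    """Apply the atmospheric scattering model to a clean image.

    J     : (H, W, 3) clean scene radiance in [0, 1]
    depth : (H, W) scene depth d(x), arbitrary units (assumed for this sketch)
    beta  : scalar scattering coefficient (wavelength dependence ignored here)
    A     : (3,) veil intensity per color channel
    """
    t = np.exp(-beta * depth)          # transmission t(x), in (0, 1]
    t3 = t[..., None]                  # add a channel axis for broadcasting
    I = J * t3 + A * (1.0 - t3)        # per-pixel convex combination of J and A
    return I, t

# Toy inputs (hypothetical values, chosen only to exercise the formula).
rng = np.random.default_rng(0)
J = rng.uniform(0.0, 1.0, size=(4, 4, 3))
depth = np.linspace(0.0, 2.0, 16).reshape(4, 4)
A = np.array([0.9, 0.9, 0.9])
I, t = synthesize_smoke(J, depth, beta=1.0, A=A)
```

Where the depth is zero the transmission is 1 and the observed pixel equals the clean pixel; as depth grows, the pixel drifts toward the veil color.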

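To estimate $t(\mathbf{x})$, SurgiATM employs the Dark Channel Prior (DCP):

$$t(\mathbf{x}) \approx 1 - \min_{c} \min_{\mathbf{y} \in \Omega(\mathbf{x})} \frac{I(\mathbf{y},c)}{A(c)}$$

with $\Omega(\mathbf{x})$ a local window centered at $\mathbf{x}$, and scene restoration via:

$$J(\mathbf{x},c) = \frac{I(\mathbf{x},c) - A(c) \cdot (1 - t(\mathbf{x}))}{t(\mathbf{x})}$$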
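A minimal sketch of a DCP-style transmission estimate followed by inversion of the ASM is given below. It assumes the standard dark-channel form from the dehazing literature (minimum over channels and over a local window of the veil-normalized image); the helper names, the edge padding, and the `t_min` floor are choices made for this example rather than details taken from the source.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def min_filter(img2d, k):
    """k x k minimum filter with edge padding (k is assumed odd)."""
    pad = k // 2
    padded = np.pad(img2d, pad, mode="edge")
    windows = sliding_window_view(padded, (k, k))   # shape (H, W, k, k)
    return windows.min(axis=(-2, -1))

def dcp_transmission(I, A, k=15):
    """Estimate t(x) as 1 minus the dark channel of I / A."""
    normalized = I / A                               # (H, W, 3)
    dark = min_filter(normalized.min(axis=2), k)     # min over channels, then window
    return 1.0 - dark

def restore(I, A, t, t_min=0.1):
    """Invert I = J t + A (1 - t); the floor t_min is an assumption to avoid blow-up."""
    t3 = np.maximum(t, t_min)[..., None]
    return (I - A * (1.0 - t3)) / t3

# Toy scene whose dark channel is exactly zero (blue channel set to 0).
rng = np.random.default_rng(0)
J = rng.uniform(0.2, 1.0, size=(8, 8, 3))
J[..., 2] = 0.0
A = np.array([0.8, 0.8, 0.8])
t_true = 0.6
I = J * t_true + A * (1.0 - t_true)

t_est = dcp_transmission(I, A, k=3)
J_est = restore(I, A, t_est)
```

In this toy case the prior holds exactly, so the estimate recovers the true transmission; on real frames the prior is only approximate, which is the motivation for combining it with a learned predictor.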

2. Mixture-of-Experts: Statistical Bridge Between Physics and Deep Learning

Rather than relying exclusively on the DCP or on a deep neural network (DNN) prediction, SurgiATM treats both as "experts" in a mixture-of-experts (MoE) framework. Each predictor's error is modeled as a zero-mean Laplacian random variable, and an optimal pixelwise gate is derived by minimizing the variance and squared bias of the mixture error. The resulting gate has a closed form in terms of the mean and scale parameters of the DCP and DNN error models.

Empirically, a strong anti-correlation observed across architectures and datasets motivates a practical approximation of the optimal gate. This removes the need for explicit veil estimation and simplifies computation.
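To illustrate the idea of a variance-minimizing gate, the sketch below uses the textbook case of two independent, zero-mean Laplacian errors, where the optimal weight reduces to inverse-variance weighting. This is not the closed form from the source (which also involves mean parameters); the function names and scale values are hypothetical.

```python
def laplace_variance(b):
    """Variance of a Laplace distribution with scale b."""
    return 2.0 * b * b

def min_variance_gate(b_dcp, b_dnn):
    """Weight g on the DCP expert that minimizes Var[g*e_dcp + (1-g)*e_dnn],
    assuming independent zero-mean errors (an assumption of this sketch)."""
    v_dcp = laplace_variance(b_dcp)
    v_dnn = laplace_variance(b_dnn)
    return v_dnn / (v_dcp + v_dnn)

def mixture_variance(g, b_dcp, b_dnn):
    """Variance of the gated mixture error under the same independence assumption."""
    return g * g * laplace_variance(b_dcp) + (1.0 - g) ** 2 * laplace_variance(b_dnn)

# Hypothetical scales: the DCP expert is twice as noisy as the DNN expert here.
g_star = min_variance_gate(b_dcp=0.2, b_dnn=0.1)
v_star = mixture_variance(g_star, 0.2, 0.1)
```

The noisier expert receives the smaller weight, and the mixture variance at the optimal gate is lower than that of either expert alone.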

3. Restoration Formula and Plug-in Integration

Substituting the approximated gate yields the SurgiATM restoration formula. Defining the "denormalized dark channel" $D(\mathbf{x}) = \min_{c} \min_{\mathbf{y} \in \Omega(\mathbf{x})} I(\mathbf{y},c)$ over a local window $\Omega(\mathbf{x})$, computable by a single min-filter operation, the update becomes:

$$\hat{J}(\mathbf{x},c) = I(\mathbf{x},c) - D(\mathbf{x}) \cdot (1 - \hat{\rho}(\mathbf{x},c))$$

where $\hat{\rho}(\mathbf{x},c)$ is the network's estimate of the normalized radiance.
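One way a multiply-subtract update can arise from the ASM is sketched below. The derivation assumes the normalized-radiance relation $J = A \cdot \rho$ and a channel-independent veil $A$, so that the veil cancels against the dark channel; it is a plausibility argument under those assumptions, not a reproduction of the source's derivation via the gate.

```latex
\begin{aligned}
I - J &= J\,t + A\,(1 - t) - J = (A - J)\,(1 - t) \\
      &= A\,(1 - \rho)\,(1 - t) && \text{using } J = A\,\rho \\
A\,\bigl(1 - t(\mathbf{x})\bigr) &\approx \min_{c}\min_{\mathbf{y}\in\Omega(\mathbf{x})} I(\mathbf{y},c) =: D(\mathbf{x}) && \text{DCP, channel-independent } A \\
\Rightarrow\; J(\mathbf{x},c) &\approx I(\mathbf{x},c) - D(\mathbf{x})\,\bigl(1 - \rho(\mathbf{x},c)\bigr)
\end{aligned}
```

Under these assumptions the veil intensity no longer appears in the final expression, which is consistent with the claim that explicit veil estimation is not needed.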

In practice, given a pretrained or trainable desmoking network that estimates the normalized radiance $\rho(\mathbf{x},c)$ (typically via a sigmoid activation), SurgiATM is appended as a post-processing arithmetic layer, and the training loss is evaluated on the output of that layer.

No extra trainable weights or modifications to network topology are introduced, aside from the dark-channel computation and a pointwise multiply-subtract operation. Latency overhead therefore remains negligible, even in real-time clinical settings.
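The plug-in idea can be sketched as a parameter-free wrapper around any network output. The update used here, `J_hat = I - D * (1 - rho_hat)`, is an assumed form consistent with a min-filter plus a pointwise multiply-subtract; the function names and the stand-in `dummy_network` are hypothetical and do not come from the source.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def denormalized_dark_channel(I, k=15):
    """Min over color channels, then a k x k min filter (edge-padded, k odd)."""
    pad = k // 2
    padded = np.pad(I.min(axis=2), pad, mode="edge")
    return sliding_window_view(padded, (k, k)).min(axis=(-2, -1))

def surgiatm_wrap(I, rho_hat, k=15):
    """Assumed update: J_hat = I - D * (1 - rho_hat); no trainable weights."""
    D = denormalized_dark_channel(I, k)[..., None]   # (H, W, 1)
    return I - D * (1.0 - rho_hat)

def dummy_network(I):
    """Stand-in for a desmoking network with a sigmoid output (hypothetical)."""
    return 1.0 / (1.0 + np.exp(-(I - 0.5)))

rng = np.random.default_rng(0)
I = rng.uniform(0.1, 1.0, size=(6, 6, 3))     # toy smoky frame
rho_hat = dummy_network(I)                    # values in (0, 1)
J_hat = surgiatm_wrap(I, rho_hat, k=3)
```

Because the wrapper is pure arithmetic on the input frame and the network output, it can sit after any backbone whose output lies in [0, 1], and gradients flow through it to the backbone during training.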

4. Hyperparameters: Window Size and Smoothing

SurgiATM introduces two hyperparameters:

  • Window size: Governs the local patch size used when computing the dark channel. The default follows He et al.'s DCP (a $15 \times 15$ patch). Plausibly, larger windows may better smooth over noise or small variations for some architectures, while smaller windows can sharpen the localization of smoke for others.
  • Smoothing factor: Prevents zero gradients by redefining the dark-channel term.

In the limiting case of the smoothing factor, the restoration formula reduces to a residual network. The smoothing factor was selected by an empirical ablation on VASST-desmoke, choosing the value with the lowest RMSE.
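The role of the smoothing factor can be illustrated with a hypothetical redefinition that blends the dark channel toward 1. The specific form `(1 - eps) * D + eps` is an assumption chosen because it keeps the term strictly positive and reproduces the residual-network limit; the source's exact redefinition is not reproduced here.

```python
import numpy as np

def smooth_dark_channel(D, eps):
    """Hypothetical smoothing: blend D toward 1 so it never reaches 0 for eps > 0."""
    return (1.0 - eps) * D + eps

def restore(I, D, rho_hat, eps):
    """Assumed update J_hat = I - D_eps * (1 - rho_hat).

    The derivative of J_hat with respect to rho_hat is D_eps, so a zero dark
    channel would block gradients unless eps > 0.
    """
    D_eps = smooth_dark_channel(D, eps)[..., None]
    return I - D_eps * (1.0 - rho_hat)

# Smoke-free toy patch: the dark channel is zero everywhere.
D = np.zeros((2, 2))
I = np.full((2, 2, 3), 0.5)
rho_hat = np.full((2, 2, 3), 0.75)

no_smoothing = restore(I, D, rho_hat, eps=0.0)    # rho_hat has no effect here
full_smoothing = restore(I, D, rho_hat, eps=1.0)  # J_hat = I - (1 - rho_hat)
```

With `eps = 0` the network output is ignored wherever the dark channel vanishes, whereas with `eps = 1` the wrapper behaves like a plain residual connection around the network.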

5. Experimental Validation and Impact

SurgiATM's efficacy is demonstrated across three public surgical datasets:

  • Cholec80: Synthetic smoke blended with clean frames from laparoscopic cholecystectomy videos.
  • VASST-desmoke: Real smoke, covering cholecystectomy, partial nephrectomy, and diaphragm dissection.
  • Hamlyn: External, unpaired test set with diverse lighting and procedure conditions.

Ten baseline desmoking methods, encompassing CNNs, U-Nets, cGANs, CycleGANs, GANs with perceptual/structural losses (e.g., SSIM-PAN), and Transformer-based models, were evaluated. Training used five-fold cross-validation with the Adam optimizer for 50 epochs per split.
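A five-fold protocol of this kind can be sketched as follows. Splitting at the level of whole videos (so that frames from one video never appear in both training and validation) is an assumption of this example, as are the function name and the video identifiers.

```python
def k_fold_splits(items, k=5):
    """Yield (train, val) lists; fold i holds every k-th item starting at index i."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, val

# Hypothetical video identifiers; a real dataset would supply its own list.
videos = [f"video_{n:02d}" for n in range(10)]
splits = list(k_fold_splits(videos, k=5))
```

Each video serves as validation data exactly once, and each split would be trained for the stated 50 epochs.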

Quantitative results on VASST-desmoke show systematic improvements when applying SurgiATM as a postprocessing wrapper (each entry reads baseline → with SurgiATM):

| Method | CIEDE2000↓ | PSNR↑ | RMSE↓ | SSIM↑ |
|---|---|---|---|---|
| AODNet | 10.35→6.33 | 17.38→21.99 | 0.147→0.090 | 0.693→0.789 |
| GCANet | 7.89→5.37 | 21.09→23.91 | 0.095→0.073 | 0.729→0.806 |
| De-smokeGCN | 9.58→7.35 | 19.78→22.13 | 0.115→0.094 | 0.550→0.717 |
| RSTN | 5.39→4.98 | 23.26→24.31 | 0.074→0.071 | 0.750→0.824 |
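The per-method gains implied by the table can be computed directly from the reported numbers. The snippet below only re-expresses the values listed above as differences and percentages; the dictionary layout and function name are choices made for this example.

```python
# (baseline, with SurgiATM) pairs copied from the VASST-desmoke results above.
rows = {
    "AODNet":      {"CIEDE2000": (10.35, 6.33), "PSNR": (17.38, 21.99),
                    "RMSE": (0.147, 0.090), "SSIM": (0.693, 0.789)},
    "GCANet":      {"CIEDE2000": (7.89, 5.37), "PSNR": (21.09, 23.91),
                    "RMSE": (0.095, 0.073), "SSIM": (0.729, 0.806)},
    "De-smokeGCN": {"CIEDE2000": (9.58, 7.35), "PSNR": (19.78, 22.13),
                    "RMSE": (0.115, 0.094), "SSIM": (0.550, 0.717)},
    "RSTN":        {"CIEDE2000": (5.39, 4.98), "PSNR": (23.26, 24.31),
                    "RMSE": (0.074, 0.071), "SSIM": (0.750, 0.824)},
}

def summarize(rows):
    """Return per-method improvements: positive numbers mean SurgiATM helped."""
    summary = {}
    for name, m in rows.items():
        summary[name] = {
            "ciede_drop": m["CIEDE2000"][0] - m["CIEDE2000"][1],
            "psnr_gain_db": m["PSNR"][1] - m["PSNR"][0],
            "rmse_drop_pct": 100.0 * (m["RMSE"][0] - m["RMSE"][1]) / m["RMSE"][0],
            "ssim_gain": m["SSIM"][1] - m["SSIM"][0],
        }
    return summary

summary = summarize(rows)
for name, s in summary.items():
    print(f"{name}: PSNR +{s['psnr_gain_db']:.2f} dB, RMSE -{s['rmse_drop_pct']:.1f}%")
```

Every metric moves in the favorable direction for all four listed methods, with the size of the gain varying by baseline.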

Performance improvements hold across synthetic-to-real (Cholec80→VASST) transfer, with CIEDE2000 reduced and PSNR increased for all architectures, indicating enhanced generalizability.

No-reference dehazing metrics (FADE) improve on Hamlyn, while BRISQUE/NIQE show mixed trends, plausibly because of the domain gap between surgical and natural images.

6. Qualitative Observations and Operational Considerations

SurgiATM yields tangible improvements in stability and perceptual quality: grid artifacts in U-Net and Transformer models are suppressed, color casts in CNN architectures are corrected, and GAN-based outputs show less residual smoke. The generalizability of the performance gains across synthetic and real smoke domains, procedures, and institutions is empirically supported.

The min-filter and elementwise operations required by SurgiATM introduce negligible GPU latency, thus supporting deployment in time-sensitive, intraoperative applications. No architectural modifications to existing backbone networks are necessary.

7. Limitations and Future Directions

SurgiATM's gating mechanism constitutes a global approximation. Method-specific or learned gating functions, provided they avoid significant increases in parameter count or computational demand, may further improve restoration accuracy. The min-filter employed to estimate the dark channel is inherently coarse; incorporating finer spatial priors (e.g., via guided filtering or learned spatial attention) could yield more accurate smoke localization and tighter coupling with DNN reflectance predictions.

Clinical evaluation in live surgical workflows and integration of temporal consistency for video remain open research areas.

SurgiATM establishes a generalizable, computationally lightweight approach for surgical smoke removal, systematically reducing restoration error (e.g., PSNR ↑ 4–6 dB, RMSE ↓ 30–40%, SSIM ↑ 0.05–0.10) while being compatible with diverse deep learning desmoking paradigms and surgical imaging conditions.
