SurgeonAssist-Net: Towards Context-Aware Head-Mounted Display-Based Augmented Reality for Surgical Guidance (2107.06397v1)

Published 13 Jul 2021 in cs.CV

Abstract: We present SurgeonAssist-Net: a lightweight framework making action-and-workflow-driven virtual assistance, for a set of predefined surgical tasks, accessible to commercially available optical see-through head-mounted displays (OST-HMDs). On a widely used benchmark dataset for laparoscopic surgical workflow, our implementation competes with state-of-the-art approaches in prediction accuracy for automated task recognition, and yet requires 7.4x fewer parameters, 10.2x fewer floating point operations per second (FLOPS), is 7.0x faster for inference on a CPU, and is capable of near real-time performance on the Microsoft HoloLens 2 OST-HMD. To achieve this, we make use of an efficient convolutional neural network (CNN) backbone to extract discriminative features from image data, and a low-parameter recurrent neural network (RNN) architecture to learn long-term temporal dependencies. To demonstrate the feasibility of our approach for inference on the HoloLens 2 we created a sample dataset that included video of several surgical tasks recorded from a user-centric point-of-view. After training, we deployed our model and cataloged its performance in an online simulated surgical scenario for the prediction of the current surgical task. The utility of our approach is explored in the discussion of several relevant clinical use-cases. Our code is publicly available at https://github.com/doughtmw/surgeon-assist-net.

Citations (17)

Summary

  • The paper introduces SurgeonAssist-Net, a lightweight CNN-RNN framework that enables near real-time, context-aware surgical guidance on OST-HMDs.
  • It employs EfficientNet-Lite-B0 for spatial feature extraction and a GRU for temporal modeling, achieving 7.4× fewer parameters and 7.0× faster CPU inference than prior models.
  • The research underscores the framework's potential in enhancing surgical training and evaluations through portable, ONNX-based AR assistance in clinical settings.

Overview of "SurgeonAssist-Net: Towards Context-Aware Head-Mounted Display-Based Augmented Reality for Surgical Guidance"

The paper presents SurgeonAssist-Net, an efficient and lightweight framework designed to facilitate context-aware surgical guidance via augmented reality (AR) on optical see-through head-mounted displays (OST-HMDs), such as the Microsoft HoloLens 2. The system aims to provide virtual assistance for predefined surgical tasks by integrating computer vision and machine learning techniques. This approach promises to enhance the usability and effectiveness of AR-guided surgical interventions while minimizing distractions traditionally associated with OST-HMDs.

Methodology

SurgeonAssist-Net combines a convolutional neural network (CNN) with a recurrent neural network (RNN) to produce context-aware predictions of the current surgical task. The CNN component uses EfficientNet-Lite-B0, a variant of EfficientNet optimized for mobile CPUs, to extract spatial features from each video frame. These per-frame features are then fed to a gated recurrent unit (GRU), which models the temporal dependencies needed to recognize surgical phases, as sketched below.
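The following is a minimal sketch of this CNN-to-GRU pipeline, not the authors' implementation. It assumes the `timm` package for the EfficientNet-Lite-B0 backbone (`efficientnet_lite0`); the hidden size and clip length are illustrative choices.

```python
# Hypothetical sketch of the CNN -> GRU pipeline described above.
import torch
import torch.nn as nn
import timm  # assumed dependency; provides 'efficientnet_lite0'

class SurgicalTaskNet(nn.Module):
    def __init__(self, num_tasks: int, hidden_size: int = 128):
        super().__init__()
        # num_classes=0 strips the classifier, yielding pooled per-frame features.
        self.backbone = timm.create_model("efficientnet_lite0", num_classes=0)
        self.gru = nn.GRU(self.backbone.num_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_tasks)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, time, 3, H, W) -> per-frame features -> GRU -> logits
        b, t, c, h, w = clips.shape
        feats = self.backbone(clips.view(b * t, c, h, w)).view(b, t, -1)
        out, _ = self.gru(feats)
        return self.head(out[:, -1])  # predict the task from the last time step

model = SurgicalTaskNet(num_tasks=7)  # Cholec80 defines 7 surgical phases
logits = model(torch.randn(2, 8, 3, 224, 224))  # 2 clips of 8 frames each
```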

The paper details the system's implementation on the HoloLens 2, enabling near real-time inference and visualization of surgical task predictions directly on the display. The integration uses the Windows Machine Learning and Open Neural Network Exchange (ONNX) libraries to run the CNN and RNN models efficiently on-device.
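Continuing the sketch above, a model like this could be converted for Windows ML consumption via the standard `torch.onnx.export` path; the input shape, names, and opset below are assumptions, not the authors' settings.

```python
# Minimal sketch of exporting the model above to ONNX for Windows ML.
import torch

model.eval()
dummy = torch.randn(1, 8, 3, 224, 224)  # one clip: 8 frames at 224x224
torch.onnx.export(
    model,
    dummy,
    "surgeon_assist.onnx",
    input_names=["clip"],
    output_names=["task_logits"],
    opset_version=11,
    dynamic_axes={"clip": {0: "batch"}},  # allow variable batch size
)
```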

Results

The operational efficiency of SurgeonAssist-Net was benchmarked on the Cholec80 dataset, a widely used resource for evaluating surgical task recognition systems. The framework matched the prediction accuracy of state-of-the-art models while requiring 7.4× fewer parameters and 10.2× fewer floating point operations per second (FLOPS), and running 7.0× faster at CPU inference than SV-RCNet, highlighting its suitability for the computationally constrained OST-HMD environment.
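As a rough illustration of the kind of efficiency comparison reported here, the sketch below measures parameter count and average CPU latency for the model defined earlier; this is an assumed measurement method, not the authors' benchmark protocol.

```python
# Sketch: count parameters and time CPU inference for the model above.
import time
import torch

def benchmark(model: torch.nn.Module, sample: torch.Tensor, runs: int = 50):
    params = sum(p.numel() for p in model.parameters())
    model.eval()
    with torch.no_grad():
        model(sample)  # warm-up pass before timing
        start = time.perf_counter()
        for _ in range(runs):
            model(sample)
        latency = (time.perf_counter() - start) / runs
    return params, latency

params, latency = benchmark(model, torch.randn(1, 8, 3, 224, 224))
print(f"{params / 1e6:.2f} M params, {latency * 1e3:.1f} ms per clip on CPU")
```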

In a user-centric evaluation on the HoloLens 2, SurgeonAssist-Net maintained comparable performance after ONNX conversion, underscoring the model's portability for real-world deployment.
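A parity check of this kind can be sketched by comparing the PyTorch model's outputs against the exported ONNX graph under ONNX Runtime; the tolerances below are assumptions chosen only to illustrate the check.

```python
# Sketch: verify the ONNX export matches the PyTorch model's outputs.
import numpy as np
import onnxruntime as ort
import torch

clip = torch.randn(1, 8, 3, 224, 224)
with torch.no_grad():
    ref = model(clip).numpy()  # reference output from the PyTorch model

sess = ort.InferenceSession("surgeon_assist.onnx",
                            providers=["CPUExecutionProvider"])
out = sess.run(["task_logits"], {"clip": clip.numpy()})[0]
np.testing.assert_allclose(ref, out, rtol=1e-3, atol=1e-5)
print("PyTorch and ONNX outputs agree within tolerance")
```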

Implications and Future Directions

The SurgeonAssist-Net framework addresses critical challenges in AR-assisted surgical guidance by aligning the presentation of virtual data with the current surgical context. Its application could extend to various domains, including surgical training and procedure evaluation, thereby fostering more intuitive and adaptive surgical environments.

For future research, the authors highlight the potential advantages of augmenting the dataset with a broader range of surgical tasks and incorporating user studies to further quantify SurgeonAssist-Net's impact on surgical performance and user satisfaction. Expanding the system's capability to handle more diverse procedures and integrating feedback from clinical deployments will be crucial steps toward widespread adoption.

In summary, this work demonstrates a pragmatic approach to AR-guided surgery, using efficient neural architectures to enable lightweight, context-aware implementations on commercially available OST-HMDs and advancing the integration of AR technology into mainstream surgical practice.
