Sensorless Freehand 3D Ultrasound Reconstruction via Deep Contextual Learning (2006.07694v1)

Published 13 Jun 2020 in cs.CV and eess.IV

Abstract: Transrectal ultrasound (US) is the most commonly used imaging modality to guide prostate biopsy and its 3D volume provides even richer context information. Current methods for 3D volume reconstruction from freehand US scans require external tracking devices to provide spatial position for every frame. In this paper, we propose a deep contextual learning network (DCL-Net), which can efficiently exploit the image feature relationship between US frames and reconstruct 3D US volumes without any tracking device. The proposed DCL-Net utilizes 3D convolutions over a US video segment for feature extraction. An embedded self-attention module makes the network focus on the speckle-rich areas for better spatial movement prediction. We also propose a novel case-wise correlation loss to stabilize the training process for improved accuracy. Highly promising results have been obtained by using the developed method. The experiments with ablation studies demonstrate superior performance of the proposed method by comparing against other state-of-the-art methods. Source code of this work is publicly available at https://github.com/DIAL-RPI/FreehandUSRecon.

Authors (4)
  1. Hengtao Guo (10 papers)
  2. Sheng Xu (106 papers)
  3. Bradford Wood (7 papers)
  4. Pingkun Yan (55 papers)
Citations (35)

Summary

  • The paper introduces DCL-Net, a novel approach that reconstructs 3D ultrasound volumes from freehand scans without external tracking devices.
  • It employs a modified 3D ResNeXt architecture with self-attention and a case-wise correlation loss to capture spatial and temporal relationships.
  • Experiments on 640 TRUS videos report an average distance error of 10.33 mm, demonstrating enhanced accuracy and reduced hardware complexity.

Sensorless Freehand 3D Ultrasound Reconstruction via Deep Contextual Learning

The paper "Sensorless Freehand 3D Ultrasound Reconstruction via Deep Contextual Learning" presents an innovative approach to reconstruct 3D ultrasound (US) volumes from freehand ultrasound scans without the aid of tracking devices. This work is particularly significant given the widespread use of ultrasound imaging in interventional procedures, such as transrectal ultrasound (TRUS) for prostate cancer diagnosis, where accurate 3D visualization can enhance the correlation with magnetic resonance imaging (MRI) and improve diagnostic outcomes.

Methodology Overview

The authors propose the Deep Contextual Learning Network (DCL-Net), which leverages deep learning techniques to exploit the spatial and temporal relationships between ultrasound frames. Unlike traditional methods that rely on external tracking devices, DCL-Net achieves 3D reconstruction through the use of 3D convolutions and a self-attention mechanism that identifies speckle-rich areas within ultrasound images to predict spatial movements. The network is trained to optimize a novel case-wise correlation loss, which enhances the discriminative feature learning and mitigates overfitting to specific scanning styles.
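To make the loss concrete, here is a minimal PyTorch sketch of a case-wise correlation term, assuming it compares predicted and ground-truth 6-DOF motion parameters across video segments drawn from the same case via a Pearson correlation per degree of freedom; the exact formulation in the paper may differ in detail.

```python
import torch

def case_wise_correlation_loss(pred, gt, eps=1e-8):
    """Hypothetical sketch of a case-wise correlation loss.

    pred, gt: (N, 6) tensors of motion parameters (3 translations,
    3 rotations) for N video segments from the same case. The loss
    rewards predictions whose variation across the case correlates
    with the ground-truth variation, parameter by parameter.
    """
    pred_c = pred - pred.mean(dim=0, keepdim=True)  # center each DOF
    gt_c = gt - gt.mean(dim=0, keepdim=True)
    # Pearson correlation coefficient per degree of freedom
    corr = (pred_c * gt_c).sum(dim=0) / (
        pred_c.norm(dim=0) * gt_c.norm(dim=0) + eps)
    return 1.0 - corr.mean()  # 0 when perfectly correlated
```

Because the term is computed over segments of a single case rather than pooled across all cases, it penalizes a network that merely memorizes one operator's scanning style, which is consistent with the overfitting mitigation described above.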

The architecture of DCL-Net is built upon a modified 3D ResNeXt model, using 3D residual blocks and 3D convolutional kernels to capture temporal dependencies across a sequence of ultrasound frames. Rather than regressing each inter-frame motion independently, the network is trained against the mean of the motion vectors over consecutive frames in a segment, which smooths the motion estimates and reduces sensitivity to noise.
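The following sketch shows how such a segment-level training target could be derived from ground-truth tracking matrices. The 6-DOF parameterization (translations plus xyz Euler angles) and the function name are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def mean_motion_target(abs_transforms):
    """Hypothetical sketch of the segment-level regression target.

    abs_transforms: (N, 4, 4) ground-truth tracking matrices, one per
    frame in a video segment. Each adjacent pair yields a relative
    transform, decomposed into 3 translations (mm) and 3 Euler angles
    (degrees); the target is the mean of these 6-DOF vectors.
    """
    params = []
    for a, b in zip(abs_transforms[:-1], abs_transforms[1:]):
        rel = np.linalg.inv(a) @ b  # motion from frame i to frame i+1
        t = rel[:3, 3]
        angles = Rotation.from_matrix(rel[:3, :3]).as_euler(
            "xyz", degrees=True)
        params.append(np.concatenate([t, angles]))
    return np.mean(params, axis=0)  # one smoothed 6-DOF label
```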

Experimental Evaluation

The authors conducted extensive experiments on a large dataset of 640 TRUS videos, with ground-truth frame positions recorded by an electromagnetic tracking device. The dataset was divided into training, validation, and testing subsets. Evaluation used the average distance error between corresponding frame corner points, along with the final drift, which measures the positional error accumulated by the final frame of the sequence.
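A minimal sketch of the corner-point distance metric follows, assuming one homogeneous 4x4 transform per frame and image-plane corner coordinates given in millimeters; the names and conventions here are illustrative.

```python
import numpy as np

def average_corner_distance(gt_transforms, est_transforms, corners):
    """Hypothetical sketch of the average distance error metric.

    gt_transforms, est_transforms: (F, 4, 4) homogeneous transforms
    mapping each US frame into world coordinates.
    corners: (4, 3) corner points of the US image plane, in frame
    coordinates (mm).
    """
    pts = np.hstack([corners, np.ones((4, 1))])  # homogeneous coords
    errs = []
    for g, e in zip(gt_transforms, est_transforms):
        gt_pts = (g @ pts.T).T[:, :3]
        est_pts = (e @ pts.T).T[:, :3]
        # mean Euclidean distance over the four corners of this frame
        errs.append(np.linalg.norm(gt_pts - est_pts, axis=1).mean())
    return float(np.mean(errs))  # mm, averaged over all frames
```

Using the frame corners rather than only the frame center makes the metric sensitive to rotational as well as translational error, since a rotation moves the corners even when the center stays fixed.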

The experimental results demonstrate that DCL-Net significantly outperforms existing methods, including those based on decorrelation techniques and 2D CNN models. Notably, the DCL-Net achieved an average distance error of 10.33 mm, showcasing improved stability and accuracy in reconstructing ultrasound volumes without the constraints of tracking devices. The integration of the attention module and case-wise correlation loss is shown to enhance the network's ability to capture motion variations and improve reconstruction fidelity.

Implications and Future Directions

The implications of this research are profound, particularly in clinical settings where the reduction of hardware complexity and cost is advantageous. DCL-Net presents a viable solution for real-time, sensorless 3D ultrasound reconstruction, which can facilitate more flexible and cost-effective interventional procedures.

The methodology outlined in this paper opens several avenues for future exploration. Further validation on different ultrasound imaging protocols and anatomical sites could broaden the applicability of this approach. Additionally, refining the network to address challenges in varying image qualities and motion patterns remains an area for ongoing research.

In conclusion, the development of the DCL-Net represents an important step forward for the field of medical imaging, offering enhanced capabilities for 3D reconstruction in a sensorless, efficient manner. This advancement not only holds promise for improving diagnostic accuracy but also for expanding the reach of ultrasound imaging in clinical practice.