Insights into 3D Biomedical Image Segmentation with Convolutional and Recurrent Neural Networks
The authors present a deep learning (DL) framework that advances the segmentation of 3D biomedical images by combining fully convolutional networks (FCNs) and recurrent neural networks (RNNs). This paper introduces a method specifically designed to accommodate the anisotropy often present in 3D biomedical images, which traditional DL methodologies have struggled to address effectively.
Problem Context and Challenges
3D biomedical image segmentation enables the identification and analysis of complex structures, such as neurons and tissues. Existing DL-based methods either ignore the spatial correlation across slices or incur the computational cost of full 3D convolutional architectures. Both families also account poorly for the anisotropic nature of medical imaging data, where voxel spacing along the inter-slice axis is often much coarser than the in-plane resolution, complicating the segmentation process.
Proposed Methodology
The proposed framework effectively harnesses the strengths of both FCNs and RNNs:
- FCN Component (kU-Net): The authors extend the traditional U-Net architecture into a multi-scale variant termed kU-Net. This variant chains multiple U-Nets in sequence, each processing the image at a different scale, with coarser-scale results guiding the finer-scale networks. This multi-scale design handles the varying object sizes within slices, better capturing intra-slice features while maintaining computational feasibility.
- RNN Component (BDC-LSTM): To capture inter-slice dependencies, the authors propose the Bi-Directional Convolutional Long Short-Term Memory (BDC-LSTM) network, a convolutional LSTM applied along the slice sequence in both the forward and backward directions. By using 2D convolutions within slices rather than the isotropic kernels of earlier RNN-based methods such as Pyramid-LSTM, BDC-LSTM integrates context across slices while respecting the data's anisotropy.
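The core of the BDC-LSTM idea can be illustrated with a minimal numpy sketch, not the authors' implementation: an LSTM cell whose gates are computed with 2D convolutions (so hidden states remain spatial feature maps), run over the slice sequence forward and backward, with the two hidden sequences summed. Kernel sizes, channel counts, and the summation merge are illustrative assumptions.

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 'same' 2D convolution (cross-correlation).
    x: (C_in, H, W), w: (C_out, C_in, k, k) with odd k."""
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    H, W = x.shape[1:]
    out = np.zeros((c_out, H, W))
    for o in range(c_out):
        for i in range(c_in):
            for u in range(k):
                for v in range(k):
                    out[o] += w[o, i, u, v] * xp[i, u:u + H, v:v + W]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConvLSTMCell:
    """LSTM cell whose gate transforms are 2D convolutions over the
    concatenated [input; hidden] feature maps (hypothetical minimal version)."""
    def __init__(self, c_in, c_hid, k=3, seed=0):
        rng = np.random.default_rng(seed)
        self.w = {g: rng.standard_normal((c_hid, c_in + c_hid, k, k)) * 0.1
                  for g in "ifog"}
        self.c_hid = c_hid

    def step(self, x, h, c):
        z = np.concatenate([x, h], axis=0)          # stack input and hidden channels
        i = sigmoid(conv2d_same(z, self.w["i"]))    # input gate
        f = sigmoid(conv2d_same(z, self.w["f"]))    # forget gate
        o = sigmoid(conv2d_same(z, self.w["o"]))    # output gate
        g = np.tanh(conv2d_same(z, self.w["g"]))    # candidate memory
        c = f * c + i * g
        h = o * np.tanh(c)
        return h, c

def bdc_lstm(slices, c_hid=4):
    """Run a ConvLSTM over the slice sequence in both directions and
    sum the two hidden-state sequences slice by slice."""
    c_in, H, W = slices[0].shape
    fwd = ConvLSTMCell(c_in, c_hid, seed=1)
    bwd = ConvLSTMCell(c_in, c_hid, seed=2)

    def run(cell, seq):
        h = np.zeros((c_hid, H, W))
        c = np.zeros_like(h)
        outs = []
        for x in seq:
            h, c = cell.step(x, h, c)
            outs.append(h)
        return outs

    hf = run(fwd, slices)
    hb = run(bwd, slices[::-1])[::-1]  # backward pass, re-aligned to slice order
    return [a + b for a, b in zip(hf, hb)]
```

In the paper's pipeline these inputs would be FCN feature maps per slice rather than raw images; the sketch only shows why 2D (in-plane) kernels let the recurrence carry inter-slice context without assuming isotropic voxels.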
Evaluation Results
The methodology was evaluated in two contexts: 3D neuron structures from the ISBI challenge and an in-house 3D fungus structure dataset. The method achieved a Vrand score of 0.9753 and a Vinfo score of 0.9870 on the neuron dataset, improving on previous state-of-the-art methods. It also performed strongly on the fungus dataset with a pixel error of 0.0215, underscoring its versatility across datasets with different anisotropic characteristics.
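Of the reported metrics, Vrand and Vinfo are the Rand- and information-theoretic scores used by the ISBI challenge, while pixel error is simply the fraction of pixels whose binarized prediction disagrees with the ground truth. A small sketch of the latter (the arrays are made-up toy data):

```python
import numpy as np

def pixel_error(pred, gt):
    """Pixel error: fraction of pixels where the binarized prediction
    differs from the ground-truth label. Lower is better."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    return float(np.mean(pred != gt))

# Toy 4x4 label map: the prediction flips 2 of 16 pixels -> error 0.125.
gt = np.zeros((4, 4), dtype=int)
pred = gt.copy()
pred[0, 0] = 1
pred[1, 1] = 1
print(pixel_error(pred, gt))  # 0.125
```

So a pixel error of 0.0215 on the fungus dataset means roughly 2% of pixels are mislabeled.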
Implications and Future Work
This paper provides a significant contribution to biomedical image analysis by demonstrating a novel integration of FCNs and RNNs to improve 3D segmentation. Practically, the approach facilitates more accurate biological and medical image analyses, supporting advancements in medical diagnosis and research.
Theoretically, this framework proposes a new paradigm for leveraging 2D DL architectures within three-dimensional data contexts, potentially inspiring novel approaches to other forms of 3D data beyond biomedical applications. Future research directions include exploring deeper architectures of BDC-LSTM and applying this framework to broader datasets such as BraTS or MRBrainS, which would further validate its efficacy and extend its applicability in diverse medical imaging domains. The paper makes an understated yet compelling case for continued innovation in DL frameworks tailored to specific data characteristics, encouraging future explorations in handling complex anisotropy in imaging datasets.