Overview of AnatomyNet: Deep Learning for Fast and Fully Automated Whole-volume Segmentation of Head and Neck Anatomy
The paper under discussion presents AnatomyNet, a comprehensive and efficient deep learning framework for the automatic segmentation of head and neck (HaN) anatomy from CT images. The authors address a core challenge in radiation therapy (RT) planning for HaN cancer: the precise delineation of organs-at-risk (OARs). Manual segmentation is labor-intensive and time-consuming, which makes a compelling case for automated solutions.
Key Contributions of AnatomyNet
This work introduces several innovative enhancements to the traditional 3D U-Net architecture for semantic segmentation:
- Whole-volume Segmentation: Unlike traditional methods that analyze image patches or subvolumes, AnatomyNet processes an entire HaN CT volume in a single pass, yielding comprehensive and spatially coherent segmentations.
- Squeeze-and-Excitation Blocks: The model incorporates 3D squeeze-and-excitation (SE) residual blocks that recalibrate learned feature maps channel by channel, improving feature representation. This particularly helps with small anatomical structures, which are otherwise difficult to delineate (see the SE-block sketch after this list).
- Improved Loss Function: AnatomyNet deploys a hybrid loss that combines Dice loss with focal loss, addressing the class imbalance that is especially severe for small anatomies such as the optic chiasm and optic nerves. This improves the model's ability to segment small-volume structures accurately (see the loss sketch after this list).
- Handling Inconsistent Annotations: A masked and weighted loss function accounts for inconsistent annotations, a common occurrence in datasets aggregated from diverse sources. This lets the model train effectively even when some ground-truth labels are missing (the same loss sketch below covers the masking).
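To make the SE mechanism concrete, here is a minimal PyTorch sketch of a 3D squeeze-and-excitation residual block. The kernel sizes, channel counts, and reduction ratio are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SEResBlock3D(nn.Module):
    """3D residual block with squeeze-and-excitation channel recalibration.
    A minimal sketch; layer sizes and reduction ratio are illustrative."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        # Squeeze: global average pool; excitation: two-layer bottleneck MLP.
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        # Recalibrate each channel by a learned, input-dependent weight.
        b, c = out.shape[:2]
        w = self.fc(self.pool(out).view(b, c)).view(b, c, 1, 1, 1)
        out = out * w
        return self.relu(out + residual)
```

The key idea is the learned per-channel gate `w`, which lets the network amplify feature channels that carry signal for small structures and suppress the rest.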
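The hybrid loss and the annotation masking can be sketched together. The weighting between terms, the focal parameter, and the tensor layout below are assumptions for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def dice_focal_loss(logits, target, class_present, gamma=2.0, eps=1e-5):
    """Hybrid soft-Dice + focal loss with masking of missing annotations.

    logits:        (B, C, D, H, W) raw network outputs, one channel per OAR
    target:        (B, C, D, H, W) binary float ground-truth masks
    class_present: (B, C) float, 1 if the OAR is annotated in this scan, else 0
    Shapes and parameters here are illustrative assumptions.
    """
    probs = torch.sigmoid(logits)

    # Soft Dice per sample and class.
    dims = (2, 3, 4)
    inter = (probs * target).sum(dims)
    denom = probs.sum(dims) + target.sum(dims)
    dice_loss = 1.0 - (2 * inter + eps) / (denom + eps)   # (B, C)

    # Focal loss down-weights easy voxels, helping rare, small structures.
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    pt = torch.exp(-bce)                                  # prob. of true class
    focal = ((1 - pt) ** gamma * bce).mean(dims)          # (B, C)

    # Mask out classes with missing ground truth so they contribute no gradient.
    per_class = (dice_loss + focal) * class_present
    return per_class.sum() / class_present.sum().clamp(min=1)
```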
Data and Methodology
The authors conducted experiments with a dataset of 261 HaN CT images drawn from multiple public sources. They evaluated the model against the MICCAI Head and Neck Auto Segmentation Challenge 2015 dataset, whose images are annotated with nine anatomies, all pertinent to HaN cancer RT.
Experimental Results
AnatomyNet outperforms prior state-of-the-art approaches, improving the Dice similarity coefficient (DSC) by 3.3% on average across all nine anatomies. The method is also computationally efficient, taking only about 0.12 seconds to segment a whole-volume CT scan, a substantial speedup over traditional atlas-based methods.
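For reference, the DSC used in this comparison measures the voxel-wise overlap between a predicted and a ground-truth mask, 2|A∩B| / (|A| + |B|). A minimal sketch (names are illustrative):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom else 1.0
```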
The model's gains are particularly notable on small structures such as the optic nerves and optic chiasm, where segmentation accuracy is crucial for RT planning; larger anatomies such as the mandible and parotid glands were already less prone to segmentation errors.
Implications and Future Directions
This paper demonstrates the viability of deep learning methods such as U-Nets in tackling complex medical image segmentation tasks. AnatomyNet's fully integrated, end-to-end architecture simplifies the segmentation pipeline, potentially streamlining RT planning workflows.
Future work may focus on further enhancing AnatomyNet's segmentation by incorporating spatial priors or shape models, which could address the limitations of voxel-wise loss functions in capturing overall anatomical shape. Expanding the diversity and volume of training data, and improving annotation consistency, could further boost performance. Integrating more clinically relevant evaluation metrics could also help tailor segmentation performance to practical clinical needs.
In conclusion, AnatomyNet represents a promising step forward in automated medical image processing, with its contributions extending beyond mere segmentation accuracy to encompass speed and the practicality required in clinical settings.