
An Auto-Encoder Strategy for Adaptive Image Segmentation

Published 29 Apr 2020 in eess.IV and cs.CV | (2004.13903v1)

Abstract: Deep neural networks are powerful tools for biomedical image segmentation. These models are often trained with heavy supervision, relying on pairs of images and corresponding voxel-level labels. However, obtaining segmentations of anatomical regions on a large number of cases can be prohibitively expensive. Thus there is a strong need for deep learning-based segmentation tools that do not require heavy supervision and can continuously adapt. In this paper, we propose a novel perspective of segmentation as a discrete representation learning problem, and present a variational autoencoder segmentation strategy that is flexible and adaptive. Our method, called Segmentation Auto-Encoder (SAE), leverages all available unlabeled scans and merely requires a segmentation prior, which can be a single unpaired segmentation image. In experiments, we apply SAE to brain MRI scans. Our results show that SAE can produce good quality segmentations, particularly when the prior is good. We demonstrate that a Markov Random Field prior can yield significantly better results than a spatially independent prior. Our code is freely available at https://github.com/evanmy/sae.

Summary

  • The paper presents an innovative SAE method that leverages variational autoencoders to achieve high segmentation accuracy with minimal supervision.
  • It employs a probabilistic atlas-based prior with an optional MRF component to enforce voxel neighborhood consistency in segmentation.
  • Experimental results on 3D brain MRI data show that SAE significantly outperforms naive baselines and is over ten times faster than traditional methods.

The paper introduces an innovative approach to biomedical image segmentation using a variational autoencoder (VAE) strategy named Segmentation Auto-Encoder (SAE). The proposed SAE method addresses the challenges associated with traditional deep learning-based segmentation, particularly the extensive need for supervised data comprising paired images and corresponding voxel-level labels. Traditional methods are limited due to the high cost of obtaining manual segmentations and their sensitivity to variations in image characteristics.

Methodology and Technical Details

SAE formulates segmentation as a discrete representation learning problem built on VAEs. Unlike conventional methods, the framework requires minimal supervision: instead of paired image-label datasets, it needs only an unpaired segmentation prior. This prior is a probabilistic atlas, which can either treat voxels as spatially independent or include a Markov Random Field (MRF) component that improves segmentation accuracy by modeling dependencies between neighboring voxels. The framework's adaptability is underscored by the fact that even a single unpaired segmentation image can serve as the prior.
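To make the distinction between the two prior types concrete, here is a minimal, illustrative sketch in pure Python: a spatially independent prior scores each voxel's label against a probabilistic atlas alone, while an MRF-style prior adds a pairwise term that rewards agreeing neighbors. The 1D chain, the `beta` weight, and the Ising-style energy are simplifying assumptions for illustration, not the paper's exact formulation.

```python
import math

def log_prior_independent(seg, atlas):
    """Log-probability of a 1D label map under a spatially
    independent (atlas-only) prior.
    seg:   list of integer labels, one per voxel
    atlas: list of per-voxel probability vectors over labels
    """
    return sum(math.log(probs[label]) for label, probs in zip(seg, atlas))

def log_prior_mrf(seg, atlas, beta=1.0):
    """Illustrative MRF prior: the independent (unary) term plus an
    Ising-style pairwise bonus for each pair of agreeing neighbors
    along a 1D chain. `beta` controls the strength of spatial smoothing."""
    unary = log_prior_independent(seg, atlas)
    pairwise = beta * sum(1.0 for a, b in zip(seg, seg[1:]) if a == b)
    return unary + pairwise
```

Under this toy prior, a spatially smooth label map receives a higher score than a fragmented one with the same atlas probabilities, which is the qualitative effect the MRF component is meant to achieve.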

The architecture comprises an encoder and a decoder network. The encoder acts as the segmentation model, while the decoder (a reconstruction model) enforces consistency between the latent segmentation and the observed image. The encoder adopts a 3D U-Net configuration to approximate the posterior distribution over label maps, whereas the decoder reconstructs the image from the segmentation. A crucial component is the Gumbel-softmax relaxation, which lets gradients flow through the otherwise non-differentiable discrete sampling step, so that encoder and decoder parameters can be optimized jointly via backpropagation.
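The Gumbel-softmax trick mentioned above can be sketched in a few lines of pure Python. The idea: perturb class logits with Gumbel noise, divide by a temperature `tau`, and apply softmax; as `tau` approaches zero the output approaches a hard one-hot sample, while remaining differentiable for larger `tau`. The function below is a standalone illustration (the paper's actual implementation operates on voxel-wise logit tensors inside the network).

```python
import math
import random

def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one relaxed one-hot sample from a categorical distribution
    parameterized by `logits`, at temperature `tau`."""
    # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1)
    gumbels = [-math.log(-math.log(rng.random())) for _ in logits]
    scaled = [(l + g) / tau for l, g in zip(logits, gumbels)]
    # Numerically stable softmax over the perturbed, scaled logits
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Low temperature pushes the sample toward a hard one-hot label;
# high temperature yields a smoother, more uniform relaxation.
sample = gumbel_softmax([2.0, 0.5, -1.0], tau=0.5)
```

In a deep learning framework the same relaxation is typically provided as a built-in (e.g. a `gumbel_softmax` operation), applied to the encoder's per-voxel logits before they are passed to the decoder.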

Experimental Results

The SAE framework was evaluated on T1-weighted 3D brain MRI data, with SAE variants tested against both naive baselines and stronger ones, including an atlas-based maximum-likelihood (Expectation-Maximization) model and conventional neural networks trained on extensive labeled datasets.

Quantitatively, the SAE approach delivered significant performance improvements over naive baseline models, achieving superior Dice coefficients, indicating better volumetric overlap with gold-standard manual segmentations. Moreover, including an MRF prior was shown to enhance segmentation accuracy further. Despite being unsupervised, the SAE models approached the performance of fully supervised learning setups, falling only around four percentage points short on Dice scores while being over ten times faster than traditional Expectation-Maximization methods.
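The Dice coefficient used in this evaluation measures volumetric overlap between a predicted and a reference mask: twice the intersection divided by the sum of the mask sizes, ranging from 0 (disjoint) to 1 (identical). A minimal sketch for binary masks represented as flat 0/1 lists:

```python
def dice_score(pred, truth):
    """Dice coefficient for two binary masks given as flat 0/1 lists:
    2 * |intersection| / (|pred| + |truth|)."""
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    denom = sum(pred) + sum(truth)
    # Convention: two empty masks count as perfect agreement
    return 2.0 * intersection / denom if denom else 1.0

score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])  # -> 2*1 / (2+1) = 0.667
```

In practice the same computation is applied per anatomical structure and averaged, which is how multi-structure brain segmentations are typically scored.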

Practical and Theoretical Implications

From a practical standpoint, the SAE framework offers a valuable solution for domains where labeled datasets are sparse or expensive to obtain. Its adaptability means it can capitalize on available unlabeled scans to produce effective segmentation predictions, potentially extending its utility to various imaging modalities and protocols. Theoretical implications suggest that segmentation can be re-envisioned through the lens of unsupervised discrete representation learning, a perspective that could stimulate further research into related fields in medical image analysis and beyond.

Future Directions

Future research possibilities include extending SAE to scenarios with dynamically moving anatomy and integrating spatial deformation models, potentially via spatial transformer networks such as VoxelMorph, to allow non-rigid alignment with the prior. Enhanced prior models, such as those leveraging adversarial learning, could further improve performance and adaptability. Given the demonstrated results, SAE lays a foundation for sophisticated and efficient segmentation models that reduce reliance on heavily supervised data while improving robustness across a range of imaging scenarios.
