A Recycling Training Strategy for Medical Image Segmentation with Diffusion Denoising Models (2308.16355v3)
Abstract: Denoising diffusion models have found applications in image segmentation by generating segmentation masks conditioned on images. Existing studies predominantly focus on adjusting the model architecture or improving inference, such as test-time sampling strategies. In this work, we focus on improving the training strategy and propose a novel recycling method. During each training step, a segmentation mask is first predicted given an image and random noise. This predicted mask, which replaces the conventional ground truth mask, is then used for the denoising task during training. This approach can be interpreted as aligning the training strategy with inference by eliminating the dependence on ground truth masks for generating noisy samples. Our proposed method significantly outperforms standard diffusion training, self-conditioning, and existing recycling strategies across multiple medical imaging datasets: muscle ultrasound, abdominal CT, prostate MR, and brain MR. This holds for two widely adopted sampling strategies: the denoising diffusion probabilistic model (DDPM) and the denoising diffusion implicit model (DDIM). Importantly, existing diffusion models often display declining or unstable performance during inference, whereas our recycling method consistently enhances or maintains performance. We show that, under a fair comparison with the same network architectures and computing budget, the proposed recycling-based diffusion models achieve on-par performance with non-diffusion-based supervised training. By ensembling the proposed diffusion models with the non-diffusion models, we observe significant improvements over the non-diffusion models across all applications, demonstrating the value of this novel training method. This paper summarizes these quantitative results and discusses their value, with a fully reproducible JAX-based implementation released at https://github.com/mathpluscode/ImgX-DiffSeg.
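The recycling training step described in the abstract can be sketched in JAX as follows. This is a minimal illustration of one reading of the abstract, not the released ImgX-DiffSeg code: the per-pixel `toy_model`, the linear noise schedule, the mean-squared-error loss, the choice to predict the first mask from pure noise at the last time step, and all names and constants are assumptions introduced here. The point it tries to show is that the noisy sample `x_t` is built from the model's own prediction (with gradients stopped) rather than from the ground truth mask, while the ground truth is still used as the supervision target.

```python
import jax
import jax.numpy as jnp

NUM_STEPS = 1000  # hypothetical number of diffusion steps T
# Linear beta schedule in the style of DDPM; the values are illustrative assumptions.
betas = jnp.linspace(1e-4, 2e-2, NUM_STEPS)
alpha_bars = jnp.cumprod(1.0 - betas)


def q_sample(x0, t, eps):
    """Forward diffusion: build the noisy sample x_t from a clean mask x0 and noise eps."""
    a = alpha_bars[t]
    return jnp.sqrt(a) * x0 + jnp.sqrt(1.0 - a) * eps


def toy_model(params, image, x_t, t):
    """Hypothetical stand-in for the segmentation network (per-pixel logistic model)."""
    t_frac = t / NUM_STEPS
    logits = (params["w_img"] * image + params["w_x"] * x_t
              + params["w_t"] * t_frac + params["b"])
    return jax.nn.sigmoid(logits)  # predicted foreground probability per pixel


def recycling_loss(params, image, gt_mask, key):
    """One recycling training step: noise the predicted mask, not the ground truth."""
    key_t, key_noise, key_eps = jax.random.split(key, 3)
    # Step 1: predict a mask from the image and pure random noise (assumed at t = T - 1).
    x_noise = jax.random.normal(key_noise, gt_mask.shape)
    x0_pred = toy_model(params, image, x_noise, NUM_STEPS - 1)
    # The prediction is treated as a constant input: no gradient flows through step 1.
    x0_pred = jax.lax.stop_gradient(x0_pred)
    # Step 2: generate the noisy sample from the predicted mask at a random time step.
    t = jax.random.randint(key_t, (), 0, NUM_STEPS)
    eps = jax.random.normal(key_eps, gt_mask.shape)
    x_t = q_sample(x0_pred, t, eps)
    # Step 3: denoise x_t and supervise with the ground truth mask (MSE is an assumption).
    pred = toy_model(params, image, x_t, t)
    return jnp.mean((pred - gt_mask) ** 2)


# Usage on a synthetic 8x8 "image" with a thresholded mask as a stand-in ground truth.
key = jax.random.PRNGKey(0)
key_img, key_step = jax.random.split(key)
image = jax.random.uniform(key_img, (8, 8))
gt_mask = (image > 0.5).astype(jnp.float32)
params = {
    "w_img": jnp.asarray(1.0),
    "w_x": jnp.asarray(0.5),
    "w_t": jnp.asarray(0.0),
    "b": jnp.asarray(-0.5),
}
loss, grads = jax.value_and_grad(recycling_loss)(params, image, gt_mask, key_step)
```

For comparison, standard diffusion training would differ only in step 2, where `q_sample` would receive `gt_mask` instead of `x0_pred`; the extra forward pass in step 1 is the cost of removing that dependence on the ground truth.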