MGAug: Multimodal Geometric Augmentation in Latent Spaces of Image Deformations (2312.13440v3)
Abstract: Geometric transformations have been widely used to augment training image sets. Existing methods often assume a unimodal distribution of the underlying transformations between images, which limits their power when the data follow a multimodal distribution. In this paper, we propose a novel model, Multimodal Geometric Augmentation (MGAug), that for the first time generates augmenting transformations in a multimodal latent space of geometric deformations. To achieve this, we first develop a deep network that embeds the learning of latent geometric spaces of diffeomorphic transformations (a.k.a. diffeomorphisms) in a variational autoencoder (VAE). A mixture of multivariate Gaussians is formulated in the tangent space of diffeomorphisms and serves as a prior to approximate the hidden distribution of image transformations. We then augment the original training dataset by deforming images using transformations randomly sampled from the learned multimodal latent space of the VAE. To validate the efficiency of our model, we jointly learn the augmentation strategy with two distinct domain-specific tasks: multi-class classification on 2D synthetic datasets and segmentation on real 3D brain magnetic resonance images (MRIs). We also compare MGAug with state-of-the-art transformation-based image augmentation algorithms. Experimental results show that our proposed approach outperforms all baselines, with significantly improved prediction accuracy. Our code is publicly available at https://github.com/tonmoy-hossain/MGAug.
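The sampling-and-deforming step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: `sample_gmm`, `toy_decoder`, `warp_nearest`, and `augment` are hypothetical names, the decoder is a fixed smooth bump standing in for a trained VAE decoder, and the decoded field is applied directly as a small displacement, so nothing here guarantees a diffeomorphism. The only part taken from the abstract is the idea of drawing latent codes from a mixture of Gaussians and using the decoded transformations to deform training images.

```python
import numpy as np

def sample_gmm(rng, weights, means, stds, n):
    """Draw n latent vectors from a diagonal-covariance Gaussian mixture."""
    comps = rng.choice(len(weights), size=n, p=weights)  # pick a mode per sample
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comps] + stds[comps] * noise, comps

def toy_decoder(z, h, w):
    """Hypothetical stand-in for a trained decoder (assumption, not MGAug's network).

    Maps a 2-D latent code to a smooth displacement field of shape (2, H, W).
    """
    yy, xx = np.meshgrid(np.linspace(0, np.pi, h),
                         np.linspace(0, np.pi, w), indexing="ij")
    bump = np.sin(yy) * np.sin(xx)  # smooth, approximately zero at the border
    return np.stack([z[0] * bump, z[1] * bump])

def warp_nearest(image, disp):
    """Warp a 2-D image by a displacement field using nearest-neighbour lookup."""
    h, w = image.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(yy + disp[0]), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(xx + disp[1]), 0, w - 1).astype(int)
    return image[src_y, src_x]

def augment(image, rng, weights, means, stds, n):
    """Return n deformed copies of image and the mixture component used for each."""
    z, comps = sample_gmm(rng, weights, means, stds, n)
    warped = [warp_nearest(image, toy_decoder(zi, *image.shape)) for zi in z]
    return np.stack(warped), comps

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = np.arange(64.0).reshape(8, 8)
    weights = np.array([0.5, 0.5])                 # two modes, equally likely
    means = np.array([[-2.0, 0.0], [2.0, 0.0]])    # opposite vertical shifts
    stds = np.full((2, 2), 0.1)
    augmented, comps = augment(image, rng, weights, means, stds, 5)
    print(augmented.shape, comps)
```

With two well-separated modes, the augmented images fall into two visibly different families of deformations, which a single Gaussian prior centred at zero would not reproduce.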