Local Spatiotemporal Representation Learning for Longitudinally-consistent Neuroimage Analysis (2206.04281v4)
Abstract: Recent self-supervised advances in medical computer vision exploit global and local anatomical self-similarity for pretraining prior to downstream tasks such as segmentation. However, current methods assume i.i.d. image acquisition, which is invalid in clinical study designs where follow-up longitudinal scans track subject-specific temporal changes. Further, existing self-supervised methods for medically-relevant image-to-image architectures exploit only spatial or temporal self-similarity and only do so via a loss applied at a single image-scale, with naive multi-scale spatiotemporal extensions collapsing to degenerate solutions. To these ends, this paper makes two contributions: (1) It presents a local and multi-scale spatiotemporal representation learning method for image-to-image architectures trained on longitudinal images. It exploits the spatiotemporal self-similarity of learned multi-scale intra-subject features for pretraining and develops several feature-wise regularizations that avoid collapsed identity representations; (2) During finetuning, it proposes a surprisingly simple self-supervised segmentation consistency regularization to exploit intra-subject correlation. Benchmarked in the one-shot segmentation setting, the proposed framework outperforms both well-tuned randomly-initialized baselines and current self-supervised techniques designed for both i.i.d. and longitudinal datasets. These improvements are demonstrated across both longitudinal neurodegenerative adult MRI and developing infant brain MRI and yield both higher performance and longitudinal consistency.
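As an illustration of the second contribution, the following is a minimal sketch of what a segmentation consistency regularizer between two scans of the same subject could look like. The abstract does not specify the loss, so everything here is an assumption: the function names `softmax` and `soft_dice_consistency` are hypothetical, the two predictions are assumed to be already spatially aligned, and a soft Dice disagreement is used as a stand-in for whatever penalty the paper actually employs.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def soft_dice_consistency(probs_t1, probs_t2, eps=1e-6):
    """Hypothetical consistency penalty: 1 - mean per-class soft Dice.

    probs_t1, probs_t2: arrays of shape (C, D, H, W) holding class
    probabilities predicted for two time points of one subject.
    Assumes the two volumes are already spatially aligned.
    """
    spatial_axes = tuple(range(1, probs_t1.ndim))
    intersection = (probs_t1 * probs_t2).sum(axis=spatial_axes)
    denom = (probs_t1 ** 2).sum(axis=spatial_axes) + (probs_t2 ** 2).sum(axis=spatial_axes)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return float(1.0 - dice.mean())

# Usage with random stand-in predictions (3 classes, 4x4x4 volume).
rng = np.random.default_rng(0)
pred_visit_1 = softmax(rng.normal(size=(3, 4, 4, 4)))
pred_visit_2 = softmax(rng.normal(size=(3, 4, 4, 4)))
penalty = soft_dice_consistency(pred_visit_1, pred_visit_2)
```

The penalty needs no labels: it is zero when the two predictions agree exactly and grows as they diverge, so in this sketch it could be added to a supervised loss computed on the single annotated scan in the one-shot setting.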