Radio Galaxy Zoo: Towards building the first multi-purpose foundation model for radio astronomy with self-supervised learning (2305.16127v3)
Abstract: In this work, we apply self-supervised learning with instance differentiation to learn a robust, multi-purpose representation for image analysis of resolved extragalactic continuum images. We train a multi-use model which compresses our unlabelled data into a structured, low-dimensional representation that can be used for a variety of downstream tasks (e.g. classification, similarity search). We exceed baseline supervised Fanaroff-Riley classification performance by a statistically significant margin, with our model reducing the test set error by up to half. Our model also maintains high classification accuracy with very few labels, reaching 7.79% error using only 145 labels. We further demonstrate that by using our foundation model, users can efficiently trade off compute, human labelling cost and test set accuracy according to their respective budgets, allowing for efficient classification in a wide variety of scenarios. We highlight the generalisability of our model by showing that it enables accurate classification in a label-scarce regime with data from the new MIGHTEE survey without any hyper-parameter tuning, where it improves upon the baseline by ~8%. Visualisations of our labelled and unlabelled data show that our model's representation space is structured with respect to physical properties of the sources, such as angular source extent. We show that the learned representation is scientifically useful even when no labels are available: a similarity search over the RGZ DR1 dataset retrieves hybrid sources without any labelled examples. We show that good augmentation design and hyper-parameter choice can help achieve peak performance, while emphasising that optimal hyper-parameters are not required to obtain benefits from self-supervised pre-training.
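The pre-training objective ("instance differentiation") treats two augmented views of the same source as a positive pair and all other images in the batch as negatives. The sketch below is a minimal, self-contained illustration of this family of methods, not the paper's implementation: the toy CNN encoder, the SimCLR-style NT-Xent loss, the flip-and-noise augmentations, and all hyper-parameters here are placeholder assumptions (the paper's actual backbone, augmentation set, and loss may differ). The final lines show the label-free use case from the abstract: ranking sources by cosine similarity in the learned representation space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy CNN encoder (placeholder; deep-learning radio work typically uses a ResNet)."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Projection head: used only for the contrastive loss, discarded afterwards.
        self.proj = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, emb_dim))

    def forward(self, x):
        h = self.conv(x).flatten(1)  # representation kept for downstream tasks
        return h, self.proj(h)

def nt_xent(z1, z2, temperature=0.5):
    """SimCLR-style NT-Xent: each view's positive is the other view of the same image."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)    # (2N, D), unit norm
    sim = z @ z.t() / temperature                  # pairwise cosine similarities
    sim = sim.masked_fill(torch.eye(len(z), dtype=torch.bool), float('-inf'))  # no self-pairs
    n = len(z1)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])  # index of each positive
    return F.cross_entropy(sim, targets)

def augment(x):
    """Placeholder views (random flip + noise); the paper stresses that augmentation
    design matters, so a real pipeline would use a richer, survey-specific set."""
    if torch.rand(()) < 0.5:
        x = torch.flip(x, dims=[-1])
    return x + 0.05 * torch.randn_like(x)

encoder = Encoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
images = torch.randn(256, 1, 64, 64)  # stand-in for unlabelled radio cutouts

for _ in range(5):  # abbreviated pre-training loop
    batch = images[torch.randperm(len(images))[:64]]
    _, z1 = encoder(augment(batch))
    _, z2 = encoder(augment(batch))
    loss = nt_xent(z1, z2)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Label-free similarity search: embed everything, then rank by cosine similarity.
with torch.no_grad():
    emb, _ = encoder(images)
    emb = F.normalize(emb, dim=1)
scores = emb @ emb[0]              # similarity of every source to a query source
print(scores.topk(6).indices[1:])  # five nearest neighbours, excluding the query itself
```

In this setup the projection head is optimised for the contrastive task and thrown away, while the pooled convolutional features serve as the general-purpose representation for classification and search, matching the "compress once, reuse many times" pattern the abstract describes.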
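To mirror the few-label result (7.79% error from 145 labels), a common evaluation protocol is a linear probe: freeze the pre-trained encoder and fit a linear classifier on a small labelled subset. The snippet below sketches that protocol only; the random stand-in embeddings, the binary FRI/FRII labels, and the use of scikit-learn's LogisticRegression are all illustrative assumptions rather than the paper's setup, so the printed error will be at chance level here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
emb = rng.normal(size=(500, 64))            # stand-in for frozen encoder embeddings
labels = rng.integers(0, 2, size=len(emb))  # stand-in binary FRI/FRII labels

train = rng.choice(len(emb), size=145, replace=False)  # the abstract's label budget
test = np.setdiff1d(np.arange(len(emb)), train)

# Linear probe: only this classifier is trained; the encoder stays frozen.
clf = LogisticRegression(max_iter=1000).fit(emb[train], labels[train])
print("test error:", 1.0 - clf.score(emb[test], labels[test]))
```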