Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology (2404.10242v1)
Abstract: Featurizing microscopy images for use in biological research remains a significant challenge, especially for large-scale experiments spanning millions of images. This work explores the scaling properties of weakly supervised classifiers and self-supervised masked autoencoders (MAEs) when training with increasingly larger model backbones and microscopy datasets. Our results show that ViT-based MAEs outperform weakly supervised classifiers on a variety of tasks, achieving as much as an 11.5% relative improvement when recalling known biological relationships curated from public databases. Additionally, we develop a new channel-agnostic MAE architecture (CA-MAE) that accepts images with different numbers and orders of channels at inference time. We demonstrate that CA-MAEs generalize effectively by running inference and evaluation on a microscopy image dataset (JUMP-CP) generated under different experimental conditions and with a different channel structure than our pretraining data (RPI-93M). Our findings motivate continued research into scaling self-supervised learning on microscopy data in order to create powerful foundation models of cellular biology that have the potential to catalyze advancements in drug discovery and beyond.
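The abstract does not spell out how a channel-agnostic MAE handles a varying number of channels. A minimal sketch of one plausible reading follows: each channel is split into its own patch tokens, so the token count grows with the channel count while the token size stays fixed, and a random subset of tokens is then hidden as in standard masked autoencoding. The function names, the 16-pixel patch size, and the 0.75 mask ratio are illustrative assumptions, not details taken from the abstract.

```python
import random

def patchify_per_channel(image, patch):
    """Turn a C x H x W nested list into one flat token per (channel, patch) pair."""
    tokens = []
    for channel in image:
        height, width = len(channel), len(channel[0])
        if height % patch or width % patch:
            raise ValueError("height and width must be divisible by patch")
        for top in range(0, height, patch):
            for left in range(0, width, patch):
                # Flatten one patch of one channel into a single token.
                tokens.append([channel[r][c]
                               for r in range(top, top + patch)
                               for c in range(left, left + patch)])
    return tokens

def random_mask(num_tokens, mask_ratio, rng):
    """Return sorted indices of visible tokens and of masked tokens."""
    num_keep = round(num_tokens * (1.0 - mask_ratio))
    order = list(range(num_tokens))
    rng.shuffle(order)
    return sorted(order[:num_keep]), sorted(order[num_keep:])

def make_image(channels, size, rng):
    """Hypothetical stand-in for a microscopy image: random pixel values."""
    return [[[rng.random() for _ in range(size)] for _ in range(size)]
            for _ in range(channels)]

rng = random.Random(0)
tokens6 = patchify_per_channel(make_image(6, 32, rng), patch=16)  # e.g. a 6-channel image
tokens5 = patchify_per_channel(make_image(5, 32, rng), patch=16)  # e.g. a 5-channel image
visible, masked = random_mask(len(tokens6), mask_ratio=0.75, rng=rng)  # assumed ratio
print(len(tokens6), len(tokens5), len(tokens6[0]), len(visible), len(masked))
```

Because every token has the same length regardless of how many channels the image has, a single shared embedding layer could in principle process both the 6-channel and the 5-channel input. Only the sequence length differs.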