Masked Autoencoders are Scalable Learners of Cellular Morphology (2309.16064v2)
Abstract: Inferring biological relationships from cellular phenotypes in high-content microscopy screens presents both a significant opportunity and a significant challenge in biological research. Prior results have shown that deep vision models can capture biological signal better than hand-crafted features. This work explores how self-supervised deep learning approaches scale when training larger models on larger microscopy datasets. Our results show that both CNN- and ViT-based masked autoencoders significantly outperform weakly supervised baselines. At the high end of our scale, a ViT-L/8 trained on over 3.5 billion unique crops sampled from 93 million microscopy images achieves relative improvements as high as 28% over our best weakly supervised baseline at inferring known biological relationships curated from public databases. Relevant code and select models released with this work can be found at: https://github.com/recursionpharma/maes_microscopy.
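To make the masked-autoencoder idea in the abstract concrete, the following is a minimal sketch of the random patch-masking step that such models typically rely on: an image crop is divided into square patches, a random subset is hidden, and only the visible patches are passed to the encoder. The function name `random_patch_mask`, the 256-pixel crop size, and the 75% mask ratio are illustrative assumptions and are not stated in the abstract; only the patch size of 8 is taken from the "/8" in ViT-L/8. This is not the released implementation.

```python
import random


def random_patch_mask(image_size, patch_size, mask_ratio, seed=None):
    """Choose which patches of a square image to hide (hypothetical helper).

    Returns (visible, masked): two sorted lists of patch indices, where
    patches are numbered row by row starting at 0.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")

    patches_per_side = image_size // patch_size
    num_patches = patches_per_side * patches_per_side
    num_masked = int(num_patches * mask_ratio)

    # Shuffle all patch indices and hide the first num_masked of them.
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)

    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Assumed settings for illustration: 256x256 crop, 8x8 patches, 75% masked.
visible, masked = random_patch_mask(256, 8, 0.75, seed=0)
print(len(visible), len(masked))  # prints: 256 768
```

Under these assumed settings the encoder would see 256 of the 1,024 patches per crop, and a decoder would be trained to reconstruct the remaining 768.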