Interpretable deep learning in single-cell omics (2401.06823v1)
Abstract: Recent developments in single-cell omics technologies have enabled the quantification of molecular profiles in individual cells at an unparalleled resolution. Deep learning, a rapidly evolving sub-field of machine learning, has attracted significant interest in single-cell omics research owing to its remarkable success in analysing heterogeneous, high-dimensional single-cell omics data. Nevertheless, the inherent multi-layer nonlinear architecture of deep learning models often makes them "black boxes", as the reasoning behind their predictions is unknown or not transparent to the user. This has stimulated a growing body of research addressing the lack of interpretability in deep learning models, especially in single-cell omics data analyses, where identifying and understanding molecular regulators is crucial for interpreting model predictions and directing downstream experimental validation. In this work, we introduce the basics of single-cell omics technologies and the concept of interpretable deep learning. This is followed by a review of recent interpretable deep learning models applied to various areas of single-cell omics research. Lastly, we highlight current limitations and discuss potential future directions. We anticipate that this review will bring together the single-cell and machine learning research communities to foster the future development and application of interpretable deep learning in single-cell omics research.
Authors: Manoj M Wagle, Siqu Long, Carissa Chen, Chunlei Liu, Pengyi Yang