CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data (2210.17283v2)
Abstract: Causal inference is a vital aspect of multiple scientific disciplines and is routinely applied to high-impact applications such as medicine. However, evaluating the performance of causal inference methods in real-world environments is challenging due to the need for observations under both interventional and control conditions. Traditional evaluations conducted on synthetic datasets do not reflect the performance in real-world systems. To address this, we introduce CausalBench, a benchmark suite for evaluating network inference methods on real-world interventional data from large-scale single-cell perturbation experiments. CausalBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics. A systematic evaluation of state-of-the-art causal inference methods using our CausalBench suite highlights how poor scalability of current methods limits performance. Moreover, methods that use interventional information do not outperform those that only use observational data, contrary to what is observed on synthetic benchmarks. Thus, CausalBench opens new avenues in causal network inference research and provides a principled and reliable way to track progress in leveraging real-world interventional data.
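To make the idea of a distribution-based interventional metric concrete, here is a minimal sketch of one plausible form: for each predicted edge A → B, compare the expression distribution of B in control cells with its distribution in cells where A was perturbed, then average the distances over all predicted edges. The abstract does not specify the metric, so the choice of the Wasserstein distance, the function name `mean_wasserstein`, the data layout, and the synthetic data are all assumptions made for illustration, not the benchmark's actual implementation.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def mean_wasserstein(edges, expr, labels, control_label="control"):
    """Average a distributional distance over predicted edges (a, b).

    Hypothetical layout: expr maps gene name -> per-cell expression array,
    labels holds, per cell, the perturbed gene or the control label.
    """
    distances = []
    for a, b in edges:
        control = expr[b][labels == control_label]
        treated = expr[b][labels == a]
        if len(control) == 0 or len(treated) == 0:
            continue  # no cells available to evaluate this edge
        distances.append(wasserstein_distance(control, treated))
    return float(np.mean(distances)) if distances else float("nan")


# Synthetic data (assumption): perturbing A shifts B but leaves C unchanged.
rng = np.random.default_rng(0)
n = 500
labels = np.array(["control"] * n + ["A"] * n)
expr = {
    "B": np.concatenate([rng.normal(0, 1, n), rng.normal(2, 1, n)]),
    "C": rng.normal(0, 1, 2 * n),
}

score_true = mean_wasserstein([("A", "B")], expr, labels)
score_null = mean_wasserstein([("A", "C")], expr, labels)
print(score_true, score_null)
```

Under these assumptions, an edge that reflects a real interventional effect (A → B) should score noticeably higher than a spurious one (A → C), and this needs no ground-truth graph, only the interventional data itself.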