Identifiable Latent Neural Causal Models (2403.15711v1)
Abstract: Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data. It is particularly well suited to prediction under unseen distribution shifts, because such shifts can generally be interpreted as consequences of interventions. Leveraging seen distribution shifts is therefore a natural strategy for identifying causal representations, which in turn benefits prediction under previously unseen distributions. It is thus critical to determine which types (or conditions) of distribution shifts actually contribute to the identifiability of causal representations. This work establishes a sufficient and necessary condition characterizing the types of distribution shifts for identifiability in the context of latent additive noise models. Furthermore, we present partial identifiability results for the case where only a portion of the distribution shifts meets the condition. In addition, we extend our findings to latent post-nonlinear causal models. We translate our findings into a practical algorithm for learning reliable latent causal representations. Guided by the underlying theory, the algorithm performs strongly across a diverse range of synthetic and real-world datasets. The empirical observations align closely with the theoretical findings, affirming the robustness and effectiveness of our approach.
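To make the setting in the abstract concrete, the following is a minimal sketch of a toy latent additive noise model observed under two environments. It is not the authors' model or algorithm: the two-variable graph, the `tanh` mechanism, the mixing function, and the names `sample_environment`, `latent_covariance`, and `edge_weight` are all assumptions chosen for illustration. The only point it shows is that a change in a latent mechanism (here, the edge weight) appears as a distribution shift, while the mixing from latents to observations stays fixed.

```python
import math
import random


def sample_environment(edge_weight, n_samples, seed=0):
    """Draw (z, x) pairs from a toy latent additive noise model.

    Hypothetical setup, for illustration only:
      z1 = n1
      z2 = edge_weight * tanh(z1) + n2   (additive noise on the child)
      x  = mix(z1, z2)                   (fixed, invertible nonlinear mixing)
    `edge_weight` stands in for an environment-specific mechanism change.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        n1 = rng.gauss(0.0, 1.0)
        n2 = rng.gauss(0.0, 1.0)
        z1 = n1
        z2 = edge_weight * math.tanh(z1) + n2
        # The mixing is shared by all environments; only the latent
        # mechanism changes between them.
        x1 = z1 + 0.5 * math.tanh(z2)
        x2 = z2
        pairs.append(((z1, z2), (x1, x2)))
    return pairs


def latent_covariance(pairs):
    """Sample covariance between z1 and z2 over the drawn pairs."""
    n = len(pairs)
    mean1 = sum(z[0] for z, _ in pairs) / n
    mean2 = sum(z[1] for z, _ in pairs) / n
    return sum((z[0] - mean1) * (z[1] - mean2) for z, _ in pairs) / n


if __name__ == "__main__":
    # Two environments that differ only in the strength of the z1 -> z2 edge.
    for weight in (0.0, 2.0):
        env = sample_environment(weight, 5000, seed=1)
        print(weight, round(latent_covariance(env), 3))
```

With `edge_weight` set to zero the two latents are independent, and with a nonzero weight they become dependent, so the two environments induce different observed distributions through the same mixing. Which kinds of such changes suffice for identifiability is the question the paper addresses; this sketch does not attempt to answer it.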