Learning Causally Disentangled Representations via the Principle of Independent Causal Mechanisms (2306.01213v4)
Abstract: Learning disentangled causal representations is a challenging problem that has recently gained significant attention due to its implications for extracting meaningful information for downstream tasks. In this work, we define a new notion of causal disentanglement from the perspective of independent causal mechanisms. We propose ICM-VAE, a framework for learning causally disentangled representations supervised by causally related observed labels. We model causal mechanisms using nonlinear, learnable, flow-based diffeomorphic functions that map noise variables to latent causal variables. Further, to promote the disentanglement of causal factors, we propose a causal disentanglement prior learned from auxiliary labels and the latent causal structure. We theoretically show the identifiability of causal factors and mechanisms up to permutation and elementwise reparameterization. We empirically demonstrate that our framework induces highly disentangled causal factors, improves interventional robustness, and supports counterfactual generation.
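Since the abstract only describes the architecture at a high level, the following is a minimal illustrative sketch (in PyTorch) of how flow-based causal mechanisms might map exogenous noise variables to latent causal variables conditioned on their causal parents. The class name `CausalMechanismFlow`, the per-variable affine conditioner, and the adjacency-mask convention are assumptions made for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class CausalMechanismFlow(nn.Module):
    """Illustrative sketch only: one invertible affine map per latent causal
    variable z_i, whose scale and shift are conditioned on the values of the
    causal parents of z_i (given by a binary DAG adjacency matrix)."""

    def __init__(self, num_vars: int, hidden: int = 64):
        super().__init__()
        self.num_vars = num_vars
        # One small conditioner per variable: parent values -> (log_scale, shift)
        self.conditioners = nn.ModuleList(
            nn.Sequential(nn.Linear(num_vars, hidden), nn.ReLU(), nn.Linear(hidden, 2))
            for _ in range(num_vars)
        )

    def forward(self, eps: torch.Tensor, adjacency: torch.Tensor) -> torch.Tensor:
        # eps: (batch, num_vars) exogenous noise
        # adjacency: (num_vars, num_vars) binary DAG, adjacency[j, i] = 1 iff z_j -> z_i
        # Assumes the latent variables are indexed in a topological order of the DAG.
        batch = eps.size(0)
        cols = []
        for i in range(self.num_vars):
            if cols:
                filled = torch.cat(cols, dim=-1)
                pad = eps.new_zeros(batch, self.num_vars - filled.size(-1))
                z_partial = torch.cat([filled, pad], dim=-1)
            else:
                z_partial = eps.new_zeros(batch, self.num_vars)
            parents_i = z_partial * adjacency[:, i]  # keep only the parents of z_i
            log_scale, shift = self.conditioners[i](parents_i).chunk(2, dim=-1)
            # Diffeomorphic (invertible, differentiable) map from eps_i to z_i
            cols.append(eps[:, i : i + 1] * log_scale.exp() + shift)
        return torch.cat(cols, dim=-1)


# Usage sketch: four latent causal variables arranged in a chain z1 -> z2 -> z3 -> z4
if __name__ == "__main__":
    A = torch.zeros(4, 4)
    A[0, 1] = A[1, 2] = A[2, 3] = 1.0
    flow = CausalMechanismFlow(num_vars=4)
    z = flow(torch.randn(8, 4), A)
    print(z.shape)  # torch.Size([8, 4])
```

The sequential loop reflects the standard ancestral-generation view of a structural causal model: each latent variable is produced from its own noise term and its already-generated parents, keeping each mechanism an invertible function of its noise.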