Causal Representation Learning from Multiple Distributions: A General Setting (2402.05052v3)

Published 7 Feb 2024 in cs.LG and stat.ML

Abstract: In many problems, the measured variables (e.g., image pixels) are mathematical functions of latent causal variables (e.g., the underlying concepts or objects). For making predictions in changing environments, or for making proper changes to the system, it is helpful to recover the latent causal variables $Z_i$ and their causal relations, represented by the graph $\mathcal{G}_Z$. This problem has recently become known as causal representation learning. This paper is concerned with a general, completely nonparametric setting of causal representation learning from multiple distributions (arising from heterogeneous data or nonstationary time series), without assuming hard interventions behind the distribution changes. We aim to develop general solutions in this fundamental case; as a by-product, this helps clarify the unique benefits offered by other assumptions such as parametric causal models or hard interventions. We show that, under a sparsity constraint on the recovered graph over the latent variables and suitable sufficient-change conditions on the causal influences, one can recover the moralized graph of the underlying directed acyclic graph, and the recovered latent variables and their relations are related to the underlying causal model in a specific, nontrivial way. In some cases, most latent variables can even be recovered up to component-wise transformations. Experimental results verify our theoretical claims.
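To make the setting concrete, here is a minimal, hypothetical Python sketch (not the authors' implementation). Latent variables follow a toy linear-Gaussian SCM whose mechanism parameters change across distributions, a fixed nonlinear mixing produces the observations, and networkx computes the moralized graph of the latent DAG, which is the object the paper shows can be recovered. All names, graph choices, and parameter values below are illustrative assumptions.

import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

# Latent DAG G_Z over (Z1, Z2, Z3): Z1 -> Z3 <- Z2 (a collider).
G_Z = nx.DiGraph([("Z1", "Z3"), ("Z2", "Z3")])

def sample_latents(n, a, b):
    # One distribution: the edge strengths (a, b) play the role of the
    # changing causal modules across environments.
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    z3 = a * z1 + b * z2 + 0.5 * rng.standard_normal(n)
    return np.stack([z1, z2, z3], axis=1)

def mix(Z):
    # Fixed invertible nonlinear mixing g: Z -> X, shared by all
    # distributions; a stand-in for the unknown measurement process.
    W = np.array([[1.0, 0.3, 0.1],
                  [0.2, 1.0, 0.4],
                  [0.1, 0.2, 1.0]])
    Y = Z @ W.T
    return np.tanh(Y) + 0.1 * Y  # componentwise monotone, so g stays invertible

# Multiple distributions: the causal influences on Z3 change across them.
X_envs = [mix(sample_latents(1000, a, b))
          for a, b in [(1.0, 0.5), (0.2, 1.5), (1.5, -1.0)]]

# Identifiability target: the moralized graph of G_Z. Moralization links
# co-parents (here Z1 -- Z2, both parents of Z3) and drops edge directions.
edges = {tuple(sorted(e)) for e in nx.moral_graph(G_Z).edges()}
print(sorted(edges))
# -> [('Z1', 'Z2'), ('Z1', 'Z3'), ('Z2', 'Z3')]

The changing coefficients (a, b) are meant to stand in for the paper's sufficient-change conditions on the causal influences; the snippet only illustrates the data-generating setup and the moralization target, not a recovery algorithm.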
