
Causal Component Analysis (2305.17225v3)

Published 26 May 2023 in stat.ML, cs.AI, and cs.LG

Abstract: Independent Component Analysis (ICA) aims to recover independent latent variables from observed mixtures thereof. Causal Representation Learning (CRL) aims instead to infer causally related (thus often statistically dependent) latent variables, together with the unknown graph encoding their causal relationships. We introduce an intermediate problem termed Causal Component Analysis (CauCA). CauCA can be viewed as a generalization of ICA, modelling the causal dependence among the latent components, and as a special case of CRL. In contrast to CRL, it presupposes knowledge of the causal graph, focusing solely on learning the unmixing function and the causal mechanisms. Any impossibility results regarding the recovery of the ground truth in CauCA also apply for CRL, while possibility results may serve as a stepping stone for extensions to CRL. We characterize CauCA identifiability from multiple datasets generated through different types of interventions on the latent causal variables. As a corollary, this interventional perspective also leads to new identifiability results for nonlinear ICA -- a special case of CauCA with an empty graph -- requiring strictly fewer datasets than previous results. We introduce a likelihood-based approach using normalizing flows to estimate both the unmixing function and the causal mechanisms, and demonstrate its effectiveness through extensive synthetic experiments in the CauCA and ICA setting.
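The abstract describes a likelihood-based approach in which the unmixing function and the causal mechanisms are fit jointly via the change-of-variables formula. As a minimal illustration of that idea (not the paper's method — the paper uses normalizing flows and nonlinear mixing; here the graph, mechanisms, and mixing matrix are toy choices of our own), consider a known two-node graph z1 → z2 with linear-Gaussian mechanisms and a linear invertible mixing x = A z. The observed log-density then decomposes into the latent log-density under the graph plus the log-determinant of the unmixing Jacobian:

```python
import numpy as np

# Toy CauCA-style setup (illustrative; all names and values are our own
# assumptions, not from the paper): known causal graph z1 -> z2 with
# linear-Gaussian mechanisms, and a linear invertible mixing x = A @ z.
# The unmixing g(x) = A^{-1} x is evaluated via change of variables:
#   log p_x(x) = log p_z(g(x)) + log |det J_g(x)|

a = 0.8                     # causal coefficient for the mechanism z1 -> z2
A = np.array([[2.0, 0.5],
              [0.3, 1.5]])  # invertible mixing matrix
A_inv = np.linalg.inv(A)

def log_pz(z):
    """Joint latent log-density factorized along the known graph:
    p(z) = p(z1) * p(z2 | z1), with z1 ~ N(0,1) and z2 | z1 ~ N(a*z1, 1)."""
    z1, z2 = z
    log_p_z1 = -0.5 * (z1**2 + np.log(2 * np.pi))
    log_p_z2 = -0.5 * ((z2 - a * z1)**2 + np.log(2 * np.pi))
    return log_p_z1 + log_p_z2

def log_px(x):
    """Observed log-density via change of variables with unmixing A^{-1}."""
    z = A_inv @ x
    log_det_J = np.log(abs(np.linalg.det(A_inv)))  # Jacobian of g is constant
    return log_pz(z) + log_det_J

def gauss_logpdf(x, cov):
    """Closed-form zero-mean Gaussian log-density, used as a sanity check."""
    k = len(x)
    return -0.5 * (x @ np.linalg.inv(cov) @ x
                   + np.log(np.linalg.det(cov)) + k * np.log(2 * np.pi))

# Since x = A z with z ~ N(0, Sigma_z), we have x ~ N(0, A Sigma_z A^T),
# so the change-of-variables density must match the closed form exactly.
Sigma_z = np.array([[1.0, a], [a, a**2 + 1.0]])
Sigma_x = A @ Sigma_z @ A.T
x = np.array([0.7, -1.2])
print(log_px(x), gauss_logpdf(x, Sigma_x))
```

In the paper's setting the mixing is nonlinear and the unmixing is parameterized by a normalizing flow, but the training objective has the same structure: maximize the latent log-likelihood under the known graph plus the log-Jacobian term.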
