On the Identifiability of Nonlinear ICA: Sparsity and Beyond (2206.07751v5)
Abstract: Nonlinear independent component analysis (ICA) aims to recover the underlying independent latent sources from their observable nonlinear mixtures. How to make the nonlinear ICA model identifiable up to certain trivial indeterminacies is a long-standing problem in unsupervised learning. Recent breakthroughs reformulate the standard independence assumption of sources as conditional independence given some auxiliary variables (e.g., class labels and/or domain/time indexes) as weak supervision or inductive bias. However, nonlinear ICA with unconditional priors cannot benefit from such developments. We explore an alternative path and consider only assumptions on the mixing process, such as Structural Sparsity. We show that under specific instantiations of such constraints, the independent latent sources can be identified from their nonlinear mixtures up to a permutation and a component-wise transformation, thus achieving nontrivial identifiability of nonlinear ICA without auxiliary variables. We provide estimation methods and validate the theoretical results experimentally. The results on image data suggest that our conditions may hold in a number of practical data generating processes.
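
As a reading aid, the following is a minimal sketch of the setup the abstract refers to, in illustrative notation of our own (the symbols $f$, $\pi$, $h_i$ and the phrasing of the sparsity constraint are paraphrases under standard nonlinear ICA assumptions, not quoted from the paper):

```latex
% Nonlinear ICA data-generating process: mutually independent sources s,
% observed only through a smooth, invertible nonlinear mixing f.
%   x = f(s),   p(s) = \prod_i p_i(s_i).
% "Identifiability up to a permutation and a component-wise transformation"
% means any alternative solution (\hat{f}, \hat{s}) that reproduces the
% observations with independent components can differ from the truth only
% trivially: each recovered component is an invertible scalar function of
% exactly one true source.
\begin{align*}
  \mathbf{x} &= f(\mathbf{s}), \qquad
  p(\mathbf{s}) = \prod_{i=1}^{n} p_i(s_i), \qquad
  f:\mathbb{R}^n \to \mathbb{R}^n \ \text{invertible},\\
  \hat{s}_i &= h_i\!\bigl(s_{\pi(i)}\bigr), \qquad i = 1, \dots, n,
\end{align*}
% where \pi is a permutation of {1, ..., n} and each h_i is an invertible
% scalar function.
```

In this reading, Structural Sparsity is a constraint on the mixing process alone, roughly a structured sparsity pattern on the support of the Jacobian $\mathbf{J}_f$ (each observed variable depends on only a subset of the sources), which takes the place of the auxiliary variables used as weak supervision in the conditional-independence line of work.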