General Identifiability Theorem

Updated 1 April 2026

The General Identifiability Theorem is a framework that establishes necessary and sufficient conditions for uniquely determining parameters or latent structures from observed data, expressed as an injectivity condition.
It unifies classical, semiparametric, and nonparametric settings, extending to graphical causal models, latent variable analysis, and nonlinear dynamic systems.
The theorem's algebraic and algorithmic methodologies enable practical identification in complex models, influencing fields such as causal inference, policy learning, and control.

A general identifiability theorem provides precise, necessary and sufficient conditions under which a parameter, latent structure, causal effect, or system feature can be uniquely determined from observed data—possibly under minimal assumptions on the generative process, environment, or intervention structure. Modern general identifiability results encompass classical parametric identifiability, semiparametric/nonparametric latent variable models, non-Gaussian and non-linear settings, structural equation models, ODEs with time-varying parameters, and causal representation frameworks. The “general” aspect signifies both algebraic sharpness of identification barriers and broad applicability across model classes.

1. Formal Definition and Universal Characterization

A unifying foundation is provided by a relation-theoretic approach. Let $\mathcal{S}$ be a “statistical universe”, $\lambda\colon \mathcal{S}\to\Lambda$ an observation mapping (which may represent the observed distribution, a marginal law, a sample, etc.), and $\theta\colon \mathcal{S}\to\Theta$ an estimand (parameter, functional, or other target). The identification region at $\ell_0 \in \Lambda$ is $H\{\theta;\ell_0\} := \{\theta(S) : S\in \mathcal{S},\ \lambda(S)=\ell_0\}$.

The general identifiability theorem, as formulated in (Basse et al., 2020), states that $\theta$ is (everywhere) identifiable from $\lambda$ if and only if for all $S, S' \in \mathcal{S}$,

$\lambda(S) = \lambda(S') \implies \theta(S) = \theta(S').$

Equivalently, there exists a well-defined function $f: \Lambda \to \Theta$ such that $\theta(S) = f(\lambda(S))$ for all $S \in \mathcal{S}$; that is, every nonempty identification region $H\{\theta;\ell_0\}$ is a singleton. Thus, identifiability reduces to injectivity of the induced binary relation $R_{\theta,\lambda} = \{(\theta(S), \lambda(S)) : S \in \mathcal{S}\}$ (Basse et al., 2020). This abstraction subsumes parametric, semiparametric, and nonparametric settings and enables a uniform treatment of partial and structural identifiability.
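
As a minimal sketch of this characterization, assume a finite statistical universe so that the implication above can be checked by enumeration. The helper names (`identification_regions`, `is_identifiable`) and the toy universe of integer pairs are illustrative assumptions, not part of Basse et al. (2020).

```python
from collections import defaultdict


def identification_regions(universe, observe, estimand):
    """Map each observable value l0 to the set {estimand(s) : observe(s) == l0}."""
    regions = defaultdict(set)
    for s in universe:
        regions[observe(s)].add(estimand(s))
    return dict(regions)


def is_identifiable(universe, observe, estimand):
    """True iff observe(s) == observe(s') implies estimand(s) == estimand(s')."""
    regions = identification_regions(universe, observe, estimand)
    return all(len(values) == 1 for values in regions.values())


# Toy universe (an assumption for illustration): pairs (a, b) where only
# the sum a + b is observed.
universe = [(a, b) for a in range(3) for b in range(3)]
observe = lambda s: s[0] + s[1]

total_doubled = lambda s: 2 * (s[0] + s[1])   # a function of the observation
first_coordinate = lambda s: s[0]             # not determined by the sum

print(is_identifiable(universe, observe, total_doubled))     # True
print(is_identifiable(universe, observe, first_coordinate))  # False
print(identification_regions(universe, observe, first_coordinate)[2])  # {0, 1, 2}
```

The last line shows a non-singleton identification region: observing a sum of 2 is compatible with any first coordinate in {0, 1, 2}, so that estimand is only partially identified.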

2. Graphical and Causal Identifiability Theorems

Modern graphical causal inference distinguishes between classical identifiability from observational data (e.g., via the do-calculus of Pearl, 1995) and general identifiability given arbitrary families of observational and interventional distributions (Kivva et al., 2022). A causal effect $P_x(y)$ is generally identifiable (g-identifiable, gID) from a family $\{Q[A] : A \in \mathbb{A}\}$ of available interventional marginals, indexed by subsets $A$ of the observed variables, if it takes the same value in all models that agree on those marginals.

The general identifiability criterion (Kivva et al., 2022) is: Let $S$ be a single c-component in the observed variable set $V$ of a DAG $G$. Then $Q[S]$ (the interventional distribution of $S$) is g-identifiable from $(\mathbb{A}, G)$ if and only if there exists $A \in \mathbb{A}$ with $S \subseteq A$ such that $Q[S]$ is classically identifiable from $G[A]$ (i.e., from the induced subgraph). This reduces the gID problem to a finite family of classical subproblems and yields an algorithm that is both sound and complete under positivity assumptions on the observed distributions. The result also corrects earlier definitions that ignored support constraints and could therefore lead to unsound identification claims (Kivva et al., 2022).
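
The reduction to classical subproblems can be sketched as a search over the available family. This is a minimal sketch under stated assumptions: `classical_id` is a hypothetical caller-supplied oracle for classical identifiability in an induced subgraph (for instance, an implementation of a complete ID algorithm), and the set-based encoding of the c-component and the family is illustrative rather than taken from Kivva et al. (2022).

```python
from typing import Callable, FrozenSet, Iterable, Optional

VarSet = FrozenSet[str]


def g_identify(
    component: VarSet,
    family: Iterable[VarSet],
    classical_id: Callable[[VarSet, VarSet], bool],
) -> Optional[VarSet]:
    """Return a member of `family` that witnesses g-identifiability, or None.

    `component` is assumed to be a single c-component. `classical_id(c, a)` is
    a hypothetical oracle that should return True when the distribution of `c`
    is classically identifiable in the subgraph induced by `a`.
    """
    for available in family:
        if component <= available and classical_id(component, available):
            return available
    return None


# Stub oracle for illustration only: pretend identification succeeds exactly
# when the induced subgraph excludes the variable "U".
def stub_oracle(component: VarSet, available: VarSet) -> bool:
    return "U" not in available


family = [frozenset({"X", "Y", "U"}), frozenset({"X", "Y", "Z"}), frozenset({"Z"})]
witness = g_identify(frozenset({"X", "Y"}), family, stub_oracle)
print(sorted(witness))  # ['X', 'Y', 'Z']
```

All of the graph-theoretic work sits inside the oracle; the outer loop only encodes the "there exists a member of the family containing the component" part of the criterion.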

3. General Identifiability in Latent Variable and Latent Structure Models

For latent variable models, general identifiability conditions have been established that dramatically weaken traditional “pure child” assumptions. In binary latent causal graphical models with arbitrary observed variable types, identifiability of the entire latent structure is guaranteed under the double-triangular condition on the measurement graph and a non-subset condition on the supports (Lee et al., 23 May 2025). Specifically:

If the observed-to-latent bipartite graph (represented by its adjacency matrix, with one column per latent) contains two disjoint full $K \times K$ triangular blocks, where $K$ is the number of latents, and no two columns are nested, then the number of latents, the measurement structure, the full latent DAG, and the conditional laws are all identifiable from the joint distribution of the observed variables (up to label-switching, sign-flip, and Markov equivalence on the latent DAG) (Lee et al., 23 May 2025).
The necessity part proves that at least three observed children per latent are required and that the non-subset condition is sharp.
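
Two of the conditions named above are simple to check on a candidate measurement graph: the non-subset condition on the columns and the requirement of at least three observed children per latent. The sketch below assumes the graph is encoded as a binary matrix with one row per observed variable and one column per latent; this encoding, the function names, and the example matrix are illustrative assumptions, and the double-triangular condition itself is not checked.

```python
def column_supports(matrix):
    """Return, for each latent (column), the set of observed rows it points to."""
    n_cols = len(matrix[0])
    return [
        {row for row, entries in enumerate(matrix) if entries[col]}
        for col in range(n_cols)
    ]


def no_nested_columns(matrix):
    """True iff no column support is contained in another column's support."""
    supports = column_supports(matrix)
    return not any(
        i != j and supports[i] <= supports[j]
        for i in range(len(supports))
        for j in range(len(supports))
    )


def enough_children(matrix, minimum=3):
    """True iff every latent has at least `minimum` observed children."""
    return all(len(support) >= minimum for support in column_supports(matrix))


# Rows: observed variables; columns: latents (hypothetical example).
measurement = [
    [1, 0],
    [1, 1],
    [1, 0],
    [0, 1],
    [1, 1],
    [0, 1],
]
print(no_nested_columns(measurement))  # True
print(enough_children(measurement))    # True
```

Passing both checks does not establish identifiability on its own; it only rules out two of the failure modes that the necessity result describes.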

For nonparametric high-dimensional mixture models with $K$ mixture components, the general identifiability theorem states that the model is generically identifiable (up to permutation of components) provided the sum of blockwise “excess Kruskal ranks” over a triple partition exceeds a threshold that depends on $K$ (Lyu et al., 10 Jun 2025). This is formulated algebraically in the tensor product space, generalizing Allman–Matias–Rhodes and Kruskal uniqueness theory.
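
For reference, the classical Kruskal condition that this line of work generalizes can be checked directly. The sketch below computes the Kruskal rank of a matrix (the largest k such that every set of k columns is linearly independent) and tests the classical three-way condition k1 + k2 + k3 ≥ 2K + 2 for K components. It illustrates the classical criterion only, not the blockwise condition of Lyu et al.; the function names are assumptions, and exact rational arithmetic is used to avoid numerical tolerance choices.

```python
from fractions import Fraction
from itertools import combinations


def rank(columns):
    """Rank of the matrix with the given columns, by exact Gaussian elimination."""
    rows = [list(map(Fraction, r)) for r in zip(*columns)]  # transpose to rows
    found = 0
    for col in range(len(columns)):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


def kruskal_rank(matrix):
    """Largest k such that every k columns of `matrix` (a list of rows) are independent."""
    columns = list(zip(*matrix))
    k = 0
    for size in range(1, len(columns) + 1):
        if all(rank(subset) == size for subset in combinations(columns, size)):
            k = size
        else:
            break
    return k


def kruskal_condition(factors, n_components):
    """Classical Kruskal sufficient condition for a three-way decomposition."""
    return sum(kruskal_rank(m) for m in factors) >= 2 * n_components + 2


identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
repeated = [[1, 1, 0], [0, 0, 1]]          # two identical columns

print(kruskal_rank(identity3))                                      # 3
print(kruskal_rank(repeated))                                       # 1
print(kruskal_condition([identity3, identity3, identity3], 3))      # True
print(kruskal_condition([identity3, repeated, repeated], 3))        # False
```

The enumeration over column subsets is exponential in the number of columns, so this is suitable only for small factor matrices.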

4. General Identifiability in Dynamic and Nonlinear Systems

For ODE models with possibly time-varying unknown inputs/parameters, the general identifiability theorem provides an explicit, fully algebraic, necessary and sufficient test (Martinelli, 2023, Martinelli, 2022). The system is analyzed by constructing its observability codistribution, computing the degree of reconstructability with respect to unknown parameters, and identifying the symmetry directions in parameter space.

Key elements:

Every (possibly time-varying) parameter is locally identifiable if and only if there are no nontrivial infinitesimal output-preserving symmetries along its direction (determined by the orthogonal complement of the observability codistribution and the reconstructability matrix).
When unidentifiable, an explicit parametric family of equivalent trajectories/parameters can be constructed.
The methodology is fully algorithmic: derivative, Lie bracket, and rank computations determine identifiability without ad-hoc elimination or power series.

This general analytic solution subsumes earlier approaches, applies to arbitrary nonlinearities, and treats time-varying parameters as unknown inputs.
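
A toy example, not drawn from Martinelli's papers, illustrates what an output-preserving symmetry looks like. Assume the scalar system $\dot{x} = -abx$ with output $y = x$ and unknown constants $a$ and $b$. Only the product $ab$ affects the output, so the rescaling $(a, b) \mapsto (ca, b/c)$ generates a one-parameter family of indistinguishable parameter values. The function names are illustrative assumptions.

```python
import math


def output(a, b, times, x0=1.0):
    """Closed-form output y(t) = x0 * exp(-a * b * t) of dx/dt = -a*b*x, y = x."""
    return [x0 * math.exp(-a * b * t) for t in times]


def rescale(a, b, c):
    """Move along the symmetry orbit: the product a * b is unchanged."""
    return a * c, b / c


times = [0.0, 0.5, 1.0, 2.0]
reference = output(0.5, 3.0, times)
equivalent = output(*rescale(0.5, 3.0, 2.0), times)   # (a, b) = (1.0, 1.5)
different = output(0.5, 4.0, times)                   # product changed

same = all(math.isclose(u, v) for u, v in zip(reference, equivalent))
print(same)                    # True: a and b are not individually identifiable
print(reference == different)  # False: the product a * b is identifiable
```

Here the symmetry direction in parameter space is $(a, -b)$, and the family `rescale(a, b, c)` for $c > 0$ plays the role of the explicit set of equivalent parameters mentioned above.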

5. Identifiability under General Interventions and Environments

Causal representation learning under multiple “environments”, not restricted to single-node interventions, exhibits a universal barrier. In linear models with independent non-Gaussian noise, given sufficiently diverse environments and node-level nondegeneracy, the latent causal graph is fully recoverable, but the variables are only identifiable up to Surrounded-Node Ambiguity (SNA) (Jin et al., 2023). SNA is formally characterized: each variable can only be recovered up to an invertible transformation mixing it with its surrounded-node set (the parents $j$ of node $i$ such that the children of $i$ are contained in the children of $j$).

This ambiguity cannot be broken except via hard (surgical) interventions. A parallel result holds for nonparametric SEMs under groups of soft single-node interventions. Thus, under general environment shifts, identifiability holds only up to SNA, a limit that is both sharp and unavoidable in the absence of hard interventions (Jin et al., 2023).

Algorithmically, methods such as LiNGCReL provably recover the correct structure (up to SNA) using algebraic decomposition, ICA alignment, and inductive support identification.
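
The surrounded-node set that defines SNA can be computed directly from a DAG. The sketch below assumes the reading in which node $j$ surrounds node $i$ when $j$ is a parent of $i$ and every child of $i$ is also a child of $j$. The graph encoding (a dictionary from each node to its set of parents, with every node present as a key) and the function names are illustrative assumptions, not part of LiNGCReL.

```python
def children_of(parents):
    """Invert a {node: set_of_parents} map into {node: set_of_children}."""
    children = {node: set() for node in parents}
    for node, node_parents in parents.items():
        for p in node_parents:
            children[p].add(node)
    return children


def surrounded_set(parents, node):
    """Parents j of `node` whose children include every child of `node`."""
    children = children_of(parents)
    return {j for j in parents[node] if children[node] <= children[j]}


# DAG with edges 1 -> 2, 1 -> 3, 2 -> 3.
triangle = {1: set(), 2: {1}, 3: {1, 2}}
print(surrounded_set(triangle, 2))  # {1}
print(surrounded_set(triangle, 3))  # {1, 2}

# Chain 1 -> 2 -> 3: node 1 does not surround node 2.
chain = {1: set(), 2: {1}, 3: {2}}
print(surrounded_set(chain, 2))  # set()
```

In the triangle, node 2 can only be recovered up to mixing with node 1, whereas in the chain node 2 has an empty surrounded-node set and is not subject to this ambiguity.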

6. Extensions: Optimization, Policy Learning, and Control

General identifiability theorems underlie extensions in optimization and reinforcement learning:

In spectral optimization, identifiability of subdifferential manifolds for orthogonally invariant functions is characterized by the local symmetry of the corresponding functions on the diagonal, leading to equivalence of partial smoothness and identifiability lifts between diagonal and spectral representations (Daniilidis et al., 2013).
For nonparametric Markov Decision Processes, the identifiability of transition kernels is ensured by compactness in the bounded-Lipschitz topology, full prior support, and the use of exhaustive exploration policies. Both Bayesian (posterior concentration in the topology) and empirical (occupancy-based) estimators converge to the true kernel, guaranteeing near-optimal learning and control (Mrani-Zentar et al., 14 Mar 2025).
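
The role of exhaustive exploration can be illustrated in a much simpler setting than the nonparametric one treated by Mrani-Zentar et al. The sketch below assumes a finite MDP, a uniformly random exploration policy, and a count-based empirical estimator of the transition kernel; the kernel values, the function name, and the step count are hypothetical choices made for illustration.

```python
import random


def estimate_kernel(true_kernel, n_steps, seed=0):
    """Estimate P(s' | s, a) from one trajectory under uniformly random actions."""
    rng = random.Random(seed)
    n_states = len(true_kernel)
    n_actions = len(true_kernel[0])
    counts = [[[0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    state = 0
    for _ in range(n_steps):
        action = rng.randrange(n_actions)  # exploration: every action is tried
        next_state = rng.choices(range(n_states), weights=true_kernel[state][action])[0]
        counts[state][action][next_state] += 1
        state = next_state
    estimate = []
    for s in range(n_states):
        estimate.append([])
        for a in range(n_actions):
            total = sum(counts[s][a])
            # Unvisited (state, action) pairs are left as None rather than guessed.
            estimate[s].append([c / total for c in counts[s][a]] if total else None)
    return estimate


# Hypothetical two-state, two-action kernel, indexed as true_kernel[s][a][s'].
true_kernel = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.3, 0.7]],
]
estimate = estimate_kernel(true_kernel, n_steps=20000)
max_error = max(
    abs(estimate[s][a][t] - true_kernel[s][a][t])
    for s in range(2) for a in range(2) for t in range(2)
)
print(max_error)
```

If the exploration policy never tried some action in some state, the corresponding row would stay `None`: the data would carry no information about that part of the kernel, which is the finite-state analogue of the identifiability failure that exhaustive exploration is meant to exclude.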

7. Representative Table: Principal General Identifiability Results

Setting	Key Identifiability Barrier/Theorem	arXiv Reference
General functional/statistical mapping	Injectivity of the induced relation between $\theta$ and $\lambda$	(Basse et al., 2020)
Graphical causal inference (gID)	Reduces to classical subproblems; positivity required	(Kivva et al., 2022)
BLCM, arbitrary observed types	Double-triangular + no subset inclusion	(Lee et al., 23 May 2025)
High-dim. nonparametric latent mixture	Sum of blockwise Kruskal ranks ≥ threshold depending on the number of components	(Lyu et al., 10 Jun 2025)
Nonlinear ODEs with time-varying param.	No “symmetry” in codistribution; reconstructability test	(Martinelli, 2023, Martinelli, 2022)
Causal repr. learning, multiple env.	Identifiability up to SNA; hard interventions needed to break	(Jin et al., 2023)

Each of these results precisely characterizes the minimal structural, algebraic, or intervention conditions required and provides either explicit construction of identification mappings or sharp algebraic barriers to uniqueness.

This synthesis reflects the current mathematical and algorithmic state-of-the-art in general identifiability theory as grounded in contemporary arXiv literature.