Expressivity Gap Analysis

Updated 31 May 2026

Expressivity gap analysis is a framework for identifying mismatches between a model's representational capacity and task requirements, focusing on both structural and quantitative metrics.
It employs rigorous methodologies such as entropic measures, margin-based approaches, and geometric analysis to diagnose and quantify model limitations.
The analysis informs architectural innovations and performance predictions in fields like machine learning, quantum algorithms, and reinforcement learning.

Expressivity gap analysis refers to the rigorous identification, quantification, and diagnosis of the mismatch between the representational or functional capacity of a model family and the requirements imposed by tasks, data, or target solution manifolds. This analytic framework is central in contemporary machine learning, signal processing, quantum algorithms, reinforcement learning, and formal systems, as it connects model design to performance limitations and catalyzes methodology for architectural innovation and evaluation.

1. Fundamental Notions and Definitions

Expressivity is the capacity of a model, system, or formalism to represent, distinguish, or approximate functions, mappings, or distributions relevant for a given domain. The expressivity gap arises when this capacity is insufficient—either globally (entire domains cannot be realized) or locally (critical distinctions or transformations are unreachable for given parameter, depth, or data constraints).

There are two principal framings:

Structural expressivity gap: The set of functions, queries, or distributions that the system cannot represent or differentiate, often formalized through entailment, separation, or universality tests. For example, GNNs bounded by the Weisfeiler–Leman (1-WL) test cannot distinguish certain non-isomorphic graphs, inducing a gap relative to required topological discrimination (Ballester et al., 2023, Kemper et al., 1 Sep 2025).
Quantitative expressivity gap: A numeric metric characterizing the "distance" between attainable outputs and desired targets, such as the entropy, margin, covering number, or effective rank of a parameterized family (Lauw et al., 2019, Lezeau et al., 2024, Yao, 18 Jun 2025). Here, expressivity gaps arise due to underparameterization, architectural bottlenecks, noise, or formal restrictions and are often accompanied by explicit bounds.

2. Frameworks and Metrics for Expressivity Gap Analysis

Multiple rigorous frameworks have been developed:

a. Entropic and Information-Theoretic Measures

Expressivity is often quantified by entropy or covering number. In "The Bias-Expressivity Trade-off" (Lauw et al., 2019), flexibility is measured via the Shannon entropy of the algorithm's outcome distribution, while bias quantifies deviation from random sampling, yielding the explicit bound $H(\overline P_\mathcal{D}) \leq \log_2|\Omega| - 2 \operatorname{bias}(\mathcal{D}, t)^2$. The entropy ceiling therefore drops quadratically with bias, which gives a formal expression of the expressivity gap.
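As a worked illustration of the bound, rearranging the inequality gives $|\operatorname{bias}| \leq \sqrt{(\log_2|\Omega| - H)/2}$. The sketch below computes that ceiling for a given outcome distribution. The function names are hypothetical, and the distribution is assumed to be supplied as a plain list of probabilities over $\Omega$; this is not code from the cited paper.

```python
import math

def shannon_entropy_bits(p):
    """Shannon entropy, in bits, of a discrete distribution given as probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def max_abs_bias(p):
    """Largest |bias| compatible with H <= log2|Omega| - 2 * bias^2 (rearranged bound)."""
    deficit = math.log2(len(p)) - shannon_entropy_bits(p)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(deficit, 0.0) / 2)

uniform = [0.25] * 4                 # maximal entropy: no room for bias
peaked = [0.85, 0.05, 0.05, 0.05]    # low entropy: a larger bias is admissible

print(max_abs_bias(uniform))
print(max_abs_bias(peaked))
```

A uniform outcome distribution leaves no entropy deficit, so the admissible bias is zero; the more concentrated the distribution, the larger the bias the bound permits.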

b. Functional and Margin-Based Approaches

In GNN and kernel learning, expressivity gaps are related to margin-based VC-dimension bounds. Enhanced expressivity (e.g., via subgraph-augmented WL kernels) only yields a practical benefit if it increases the margin separating classes, as described by the ratio $VC(H_1)/VC(H_2) \approx (r_1^2/\lambda_1^2) / (r_2^2/\lambda_2^2)$ (Franks et al., 2024).
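The ratio can be evaluated directly once a radius and a margin are available for each hypothesis class. The following is a minimal sketch, assuming $r_i$ denotes the radius of the embedded data and $\lambda_i$ the separating margin (an interpretation of the symbols, not stated explicitly above); the function names and numbers are illustrative only.

```python
def margin_vc_proxy(radius, margin):
    """Margin-based capacity proxy r^2 / lambda^2 for one hypothesis class."""
    return (radius / margin) ** 2

def vc_ratio(r1, lam1, r2, lam2):
    """Approximate VC(H_1) / VC(H_2) from radii and margins."""
    return margin_vc_proxy(r1, lam1) / margin_vc_proxy(r2, lam2)

# Baseline kernel versus a hypothetical, more expressive augmentation.
helpful = vc_ratio(r1=1.0, lam1=0.1, r2=1.5, lam2=0.3)  # margin grows faster than radius
harmful = vc_ratio(r1=1.0, lam1=0.1, r2=2.0, lam2=0.1)  # radius grows, margin does not

print(helpful, harmful)
```

A ratio above 1 means the second class has the smaller capacity proxy, which is the case where added expressivity pays off; a ratio below 1 indicates that the augmentation enlarged the embedding without widening the margin.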

c. Geometric and Dimensional Analysis

The expressivity of parametric quantum circuits is captured via the rank of the Jacobian, $\dim T_{\psi(\theta)} = \operatorname{rank} J(\theta)$ (Funcke et al., 2020), or via the effective rank $\kappa = \operatorname{rank} F$ of the Fisher information matrix (Yao, 18 Jun 2025). The expressivity gap appears where the actual expressivity dimension is lower than the theoretical maximum or the intended manifold dimension. This is further refined by tropical geometry tools for ReLU nets, where the number of linear regions underpins expressivity, and analytic methods (region counting, Hoffman constants) provide certified gap quantification (Lezeau et al., 2024).
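The Jacobian-rank criterion can be illustrated numerically on single-qubit toy circuits. The sketch below is an assumption-laden example, not the procedure of the cited works: it uses central finite differences, embeds the complex state as a real vector, does not quotient out the global phase, and relies on a hand-picked rank tolerance. All function names are hypothetical.

```python
import numpy as np

def real_jacobian(state_fn, theta, eps=1e-6):
    """Finite-difference Jacobian of a complex state, stacked as real and imaginary parts."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps
        diff = (state_fn(theta + step) - state_fn(theta - step)) / (2 * eps)
        cols.append(np.concatenate([diff.real, diff.imag]))
    return np.stack(cols, axis=1)

def expressivity_dimension(state_fn, theta, tol=1e-6):
    """Numerical rank of the Jacobian at theta (tol absorbs finite-difference noise)."""
    return int(np.linalg.matrix_rank(real_jacobian(state_fn, theta), tol=tol))

def redundant_circuit(theta):
    # RZ(theta[0]) RZ(theta[1]) applied to |+>: only the sum of the angles matters.
    phi = theta[0] + theta[1]
    return np.array([np.exp(-0.5j * phi), np.exp(0.5j * phi)]) / np.sqrt(2)

def generic_circuit(theta):
    # RZ(theta[1]) RY(theta[0]) applied to |0>: both angles move the state.
    a, b = theta
    return np.array([np.cos(a / 2) * np.exp(-0.5j * b),
                     np.sin(a / 2) * np.exp(0.5j * b)])

theta0 = [0.3, 1.1]
print(expressivity_dimension(redundant_circuit, theta0))  # 1: one of two parameters is wasted
print(expressivity_dimension(generic_circuit, theta0))    # 2: both parameters contribute
```

Both circuits have two parameters, so the first one exhibits a gap of one dimension between its parameter count and its measured expressivity dimension.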

d. Universal Approximation and Universality Theorems

For architectures such as attention-based networks, contextual or in-context expressivity is formalized via sequence disentangling and universal approximation results (i.e., every continuous map on $\Omega \times \mathcal{P}(\Omega)$ can be approximated to arbitrary precision) (Boufadène et al., 12 Dec 2025). An expressivity gap is certified by failure of these universality properties—a scenario not found between sliced ReLU and softmax attention, but often present in restricted or quantized models.

3. Exemplary Analyses and Domains

Speech-to-Speech Models

DeEAR measures expressivity across emotion, prosody, and spontaneity, aligning scoring with human judgment (SRCC = 0.86). Expressivity gaps between S2S systems are observed when models trained on neutral corpora lack emotional expression or spontaneity, as revealed by the differential scores on each sub-dimension. Targeted data curation using DeEAR improves model expressivity, quantifying the gap closure achieved (Lin et al., 23 Oct 2025).

Low-Resource SLMs: Stability–Expressivity Trade-off

When scaling spoken language models (SLMs) for low-resource settings, synthetic data improves stability (phonetic accuracy) at the cost of expressivity, inducing synthetic erosion beyond a critical mixing ratio $\alpha^*$. The gap is diagnosed via entropy proxy metrics, and gap-closing strategies employ preference-alignment frameworks (DGSA, TDSC) that restore expressivity without sacrificing stability (Geng et al., 10 Apr 2026).

Probabilistic Circuits and LLMs

Probabilistic circuits (PCs) suffer a provable expressivity gap versus Transformers in language modeling. This gap is due to:

Output bottleneck: Convex combinations in probability space limit the sharpness of next-token distributions; a logit-space parameterization narrows but does not eliminate this.
Context-encoding bottleneck: Structured-decomposable PCs match Transformer separation rank only on vtree-aligned partitions. Arbitrary data dependencies break this equivalence, yielding severe degradation in mixed-topology tasks, whereas Transformers' dynamic attention maintains high separation rank universally (Zhao et al., 13 May 2026).
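The output bottleneck rests on an elementary property of convex combinations: a mixture can never be more peaked than its most peaked component. The sketch below checks this on a toy next-token distribution over three tokens. It illustrates only the probability-space side of the argument, not the logit-space parameterization or the cited paper's construction; the names and numbers are made up for illustration.

```python
import numpy as np

def mixture(weights, components):
    """Convex combination of component distributions (rows of `components`)."""
    weights = np.asarray(weights, dtype=float)
    components = np.asarray(components, dtype=float)
    return weights @ components

def peak_probability(p):
    """Largest single-outcome probability, used here as a sharpness proxy."""
    return float(np.max(p))

components = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.8, 0.1]])
mixed = mixture([0.5, 0.5], components)

print(mixed)
print(peak_probability(mixed), max(peak_probability(c) for c in components))
```

Whatever mixing weights are chosen, the peak of `mixed` stays at or below the largest component peak, so sharp predictions require at least one component that is already sharp.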

Recurrent Neural Networks and Depth-Induced Gap

Increasing depth in (linear) RNNs strictly increases memory capacity for a given parameter budget, as formalized by $\mathcal{M}(L,n) = L(n-1)$. For 2RNNs, depth further exponentiates the maximal degree of representable polynomials, and multiplicative interactions cannot be replaced by simple depth-wise nonlinearities, which constitutes an expressivity gap for shallow or purely additive-depth variants (Lizaire et al., 2 Apr 2026).
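A short arithmetic check makes the depth advantage concrete. The capacity formula is taken from the text; the parameter count is a loudly hypothetical simplification that counts only one $n \times n$ recurrent matrix per layer and ignores input, inter-layer, and output weights.

```python
def memory_capacity(num_layers, width):
    """M(L, n) = L * (n - 1), as stated for linear RNNs."""
    return num_layers * (width - 1)

def recurrent_parameter_count(num_layers, width):
    """Hypothetical budget: one width-by-width recurrent matrix per layer."""
    return num_layers * width * width

shallow = (1, 8)  # one layer of width 8
deep = (4, 4)     # four layers of width 4

for layers, width in (shallow, deep):
    print(layers, width,
          recurrent_parameter_count(layers, width),
          memory_capacity(layers, width))
# prints: 1 8 64 7
# prints: 4 4 64 12
```

Under this simplified budget both configurations use 64 recurrent parameters, yet the deeper one reaches a capacity of 12 against 7 for the shallow one.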

Graph Machine Learning

The classical 1-WL expressivity gap prevents message-passing GNNs from distinguishing many non-isomorphic graphs. Persistent homology features can provably separate such structures (including cases indistinguishable by 2-WL), establishing a strict gap and identifying the advantage of topological features in molecular and social graph classification (Ballester et al., 2023). Recent work further demonstrates that task-related expressivity gaps (as opposed to worst-case graph pairs) are strongly correlated with Message Passing Complexity (MPC), a continuous measure that also tracks empirical sample complexity and over-squashing phenomena (Kemper et al., 1 Sep 2025).
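The standard textbook instance of the 1-WL gap is a 6-cycle versus two disjoint triangles: every node has degree 2 in both graphs, so color refinement never separates them. The sketch below implements plain 1-WL refinement and contrasts it with a connected-component count (the zeroth Betti number). The component count is a deliberately simplified stand-in for a topological feature, not the persistent homology pipeline of the cited work, and the helper names are hypothetical.

```python
from collections import Counter

def wl_histogram(adj, rounds=3):
    """1-WL color refinement; colors are kept as nested tuples so that
    histograms from different graphs are directly comparable."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        colors = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                  for v in adj}
    return Counter(colors.values())

def connected_components(adj):
    """Number of connected components, found by depth-first search."""
    seen, count = set(), 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v])
    return count

def cycle(nodes):
    """Adjacency lists of a cycle through the given nodes."""
    n = len(nodes)
    return {nodes[i]: [nodes[(i - 1) % n], nodes[(i + 1) % n]] for i in range(n)}

hexagon = cycle([0, 1, 2, 3, 4, 5])
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}

print(wl_histogram(hexagon) == wl_histogram(two_triangles))               # True
print(connected_components(hexagon), connected_components(two_triangles))  # 1 2
```

The identical histograms show the structural gap of 1-WL on this pair, while even the coarsest topological invariant tells the two graphs apart.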

RL Objective Specification

A comprehensive hierarchy of objective-specification formalisms in RL reveals a strict expressivity gap between, for example, Markov-reward, LTL, reward machines, and generalized outer-multi-objective RL. Separation tasks are constructed to show that each formalism uniquely admits tasks outside the reach of others; for instance, RRL expresses most-deterministic-policy preferences, which ONMR cannot, due to linearity constraints (Subramani et al., 2023).

4. Diagnosis and Quantification Workflows

The methodological pipeline for expressivity gap analysis typically involves:

Rigorous formalization of target capacity: Identify the set of tasks, distributions, or functions mandated by application, including required symmetries, constraints, or other domain-specific characteristics.
Capacity measurement under resource constraints: Compute or estimate dimensional, entropic, or separability metrics (e.g., rank of Jacobian or Fisher matrix, covering number, margin, separation rank).
Comparative evaluation: Quantitatively compare the measured model capacity to the minimum required for the application. For instance, the expressivity gap in variational quantum algorithms (VQAs) is the deficit in covering number relative to the target (Du et al., 2021).
Interpretation and closure: If an expressivity gap is diagnosed, models are redesigned or fine-tuned until the gap is minimized or eliminated, subject to trade-offs with trainability and generalization.
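The measurement and comparison steps above can be sketched on a toy model. The example below is a minimal sketch under stated assumptions: it uses a logistic-regression model whose third input feature is a linear combination of the first two, takes the effective rank of its Fisher information matrix as the capacity measure, and treats the nominal parameter count as the required dimension. These choices, the tolerance, and the function names are illustrative, not a method taken from the cited works.

```python
import numpy as np

def logistic_fisher(features, weights):
    """Fisher information of logistic regression: mean of p(1-p) * x x^T."""
    logits = features @ weights
    p = 1.0 / (1.0 + np.exp(-logits))
    scale = p * (1.0 - p)
    return (features * scale[:, None]).T @ features / len(features)

def expressivity_gap(fisher, required_dim, tol=1e-8):
    """Return (gap, effective_rank), with gap = required_dim - rank, floored at 0."""
    effective_rank = int(np.linalg.matrix_rank(fisher, tol=tol, hermitian=True))
    return max(required_dim - effective_rank, 0), effective_rank

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
# Third feature is redundant: it is the sum of the first two.
features = np.column_stack([base, base[:, 0] + base[:, 1]])
weights = np.array([0.5, -0.3, 0.2])

gap, rank = expressivity_gap(logistic_fisher(features, weights), required_dim=3)
print(rank, gap)  # 2 1
```

Three parameters yield only two independent directions, so the diagnosed gap is one dimension; the closure step would then remove the redundant feature or replace it with an independent one and re-measure.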

5. Impact and Applications Across Domains

Expressivity gap analysis is foundational for:

Model selection and architecture design: E.g., in QNNs, maximizing effective rank $\kappa$ by circuit layout optimization ensures full functional capacity with minimal redundancy (Yao, 18 Jun 2025); in structured data querying, type-discipline and staged MQuery pipelines guarantee equivalence with the desired algebraic expressivity (Botoeva et al., 2016).
Data curation: Methods like DeEAR enable the construction of high-expressivity corpora tailored for S2S fine-tuning (Lin et al., 23 Oct 2025).
Task feasibility diagnosis and performance prediction: In GNNs, MPC predicts observed failures in long-range tasks, subsuming the binary limitations of expressivity theory (Kemper et al., 1 Sep 2025).
Understanding limits of formal systems: RL objective formalism hierarchy clarifies which classes of objectives are unattainable and thereby guides algorithmic focus or reward learning (Subramani et al., 2023).

6. Limitations and Future Directions

Not all expressivity gaps are harmful, and closing one is not always beneficial: excess expressivity can degrade generalization through a larger covering number, impair trainability (e.g., barren plateaus in QNNs (Du et al., 2021)), or exceed memory constraints. The challenge of balancing expressivity against tractability, robustness, and regularization is central.

Active research addresses:

Learning formalism mixing: E.g., hybrid PC-logit architectures, or mixture-of-vtree models in PCs, to approach Transformer flexibility within tractable inference regimes (Zhao et al., 13 May 2026).
Generalization-aware design: Margin-theoretic analysis identifies when additional expressivity benefits generalization margin and VC-dimension, avoiding redundant augmentation (Franks et al., 2024).
Efficient software and symbolic tools: Tropical geometry and OSCAR enable scalable, exact analysis of neural region counts, crucial for principled architecture evaluation (Lezeau et al., 2024).

Continued methodological developments at the intersection of statistical learning theory, algebraic geometry, information theory, and combinatorics remain vital for rigorous characterization and minimization of expressivity gaps in next-generation AI systems.