Expressivity of GDN without negative eigenvalues versus transformers

Determine whether Gated DeltaNet (GDN) without the negative-eigenvalue extension is more expressive than fixed-depth transformers in the paper’s formal setting. One way to show this would be to prove that it can solve at least one NC1-complete problem, which would place it beyond the TC0 upper bound on transformer expressivity under the log-precision arithmetic model (assuming TC0 ≠ NC1).

Background

Negative eigenvalues in Gated DeltaNet are argued to be important for state-tracking expressivity; nevertheless, the authors observe similar empirical scaling trends between GDN with and without negative eigenvalues.

This observation raises the theoretical question of whether GDN without negative eigenvalues might still exceed transformer expressivity (e.g., by solving some NC1-complete tasks), or whether its advantages are due to factors other than expressivity.
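To make the term "negative eigenvalues" concrete, here is a minimal sketch that assumes a DeltaNet-style transition of the form I - beta * k k^T with a unit-norm key k. It assumes beta lies in [0, 1] for the standard variant and in [0, 2] for the negative-eigenvalue extension, and it omits the GDN decay gate. These forms and ranges are assumptions drawn from the general literature on this model family, not from the text above, and the helper names are hypothetical. The example uses parity only to show the sign flip that a negative eigenvalue allows. Parity itself lies in TC0, so it does not bear on the open question about NC1-complete problems.

```python
# Illustrative sketch only; not the paper's construction.
# Assumption: the state transition is (I - beta * k k^T) with ||k|| = 1,
# so its eigenvalue along k is (1 - beta) and 1 on directions orthogonal to k.
# Assumption: beta in [0, 1] without the extension, beta in [0, 2] with it.
# The GDN decay gate is omitted here for simplicity.

def transition_eigenvalue(beta):
    """Eigenvalue of I - beta * k k^T along a unit-norm key k."""
    return 1.0 - beta

def apply_transition(beta, k, x):
    """Apply (I - beta * k k^T) to the vector x without forming the matrix."""
    dot = sum(ki * xi for ki, xi in zip(k, x))
    return [xi - beta * dot * ki for ki, xi in zip(k, x)]

def run_scalar_state(bits, beta):
    """Scalar recurrence: multiply the state by (1 - beta) on each 1-bit.

    A 0-bit corresponds to an input-dependent beta of 0, i.e. the identity.
    """
    state = 1.0
    for bit in bits:
        if bit == 1:
            state = transition_eigenvalue(beta) * state
    return state

if __name__ == "__main__":
    bits = [1, 0, 1, 1]
    # beta = 2 gives eigenvalue -1: the sign of the state tracks parity.
    print(run_scalar_state(bits, 2.0))
    # beta = 0.5 gives eigenvalue 0.5: the state shrinks but never changes sign.
    print(run_scalar_state(bits, 0.5))
    # With beta = 2 the transition is a reflection across the plane orthogonal to k.
    print(apply_transition(2.0, [1.0, 0.0], [3.0, 4.0]))
```

With beta restricted to [0, 1], the eigenvalue along k stays in [0, 1], so a single transition of this form can shrink or erase a component but cannot reflect it. Whether that restriction still leaves room for an expressivity advantage over transformers is exactly the question posed above.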

References

This could suggest that some other benefit of GDN beyond expressivity explains its improved scaling relative to transformers, though it is also an open question whether GDN, even without negative eigenvalues, could have expressivity advantages over transformers such as the ability to solve some NC1-complete problems.

Olmo Hybrid: From Theory to Practice and Back (2604.03444 - Merrill et al., 3 Apr 2026) in Section 4.3 (Discussion: From Expressivity to Scaling)