Expressivity of transformers and linear SSMs on state-tracking tasks below permutation composition

Determine whether fixed-depth transformers and generalized linear state-space models, specifically S4 and the S6/Mamba layer, can express solutions to state-tracking problems that are less complex than the S5 permutation composition word problem.
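To make the task concrete, the following Python snippet is a minimal sketch (illustrative only, not the paper's code) of the S5 word problem as a sequence task: the input is a word over the 120 permutations of five elements, and the target at each position is the composition of all permutations seen so far, i.e., the state a correct model must track.

```python
# Minimal sketch of the S5 word problem as a sequence task (illustrative).
import itertools
import random

S5 = list(itertools.permutations(range(5)))  # all 120 elements of S5

def compose(p, q):
    """Composition q ∘ p: apply permutation p first, then q."""
    return tuple(q[p[i]] for i in range(5))

def make_example(length, rng=random):
    """One example: a word over S5 and its prefix compositions (the states)."""
    word = [rng.choice(S5) for _ in range(length)]
    state = tuple(range(5))  # identity permutation
    targets = []
    for g in word:
        state = compose(state, g)  # fold the next generator into the state
        targets.append(state)
    return word, targets

word, targets = make_example(8)
print(targets[-1])  # the group element a model must predict at the end
```

Because S5 is non-solvable, computing these prefix compositions is NC1-complete by Barrington's theorem, which is why the task lies out of reach for TC0-bounded architectures.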

Background

The authors establish that linear SSMs (S4, S6/Mamba), like transformers, can only express functions in L-uniform TC0, which implies they cannot express NC1-hard state-tracking tasks such as the S5 word problem. They complement this with empirical evidence that these models struggle on S5 relative to RNNs.

They further report that these models also struggle on seemingly easier state-tracking problems, raising the theoretical question of whether such tasks are expressible at all by fixed-depth transformers and linear SSMs. Resolving this question would delineate the precise expressivity frontier of these architectures below NC1-hardness.
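For contrast, here is a hedged sketch of what a task "less complex than permutation composition" can look like, taking the cyclic group Z60 as an illustrative stand-in (the paper's exact easier benchmarks may differ). The word problem of an abelian group reduces to a running sum, so the state is a single residue modulo 60 rather than a full permutation.

```python
# Illustrative sketch of an "easier" state-tracking task: the Z60 word problem.
# Z60 is abelian, so its word problem is not NC1-hard, unlike S5's.
import random

def make_z60_example(length, rng=random):
    """A word over Z60 with running-sum targets (state = sum mod 60)."""
    word = [rng.randrange(60) for _ in range(length)]
    state, targets = 0, []
    for g in word:
        state = (state + g) % 60  # abelian: state tracking collapses to counting
        targets.append(state)
    return word, targets
```

Tasks in this regime sit below NC1-hardness, so the TC0 upper bound does not rule them out; whether fixed-depth transformers and linear SSMs can actually express solutions to them is exactly the open question above.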

References

"We also find that both transformers and SSMs struggle compared to RNNs on state-tracking problems less complex than permutation composition where it is not known whether they can express a solution."

The Illusion of State in State-Space Models (arXiv:2404.08819, Merrill et al., 12 Apr 2024), Introduction, empirical findings paragraph.