Expressivity of transformers and linear SSMs on state-tracking tasks below permutation composition

Determine whether fixed-depth transformers and generalized linear state-space models, specifically S4 and the S6/Mamba layer, can express solutions to state-tracking problems that are less complex than the S5 permutation composition word problem.
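To make the task concrete, the following Python snippet is a minimal sketch (illustrative only, not the paper's code) of the S5 word problem as a sequence task: the input is a word over the 120 permutations of five elements, and the target at each position is the composition of all permutations seen so far, i.e., the state a correct model must track.

```python
# Minimal sketch of the S5 word problem as a sequence task (illustrative).
import itertools
import random

S5 = list(itertools.permutations(range(5)))  # all 120 elements of S5

def compose(p, q):
    """Composition q ∘ p: apply permutation p first, then q."""
    return tuple(q[p[i]] for i in range(5))

def make_example(length, rng=random):
    """One example: a word over S5 and its prefix compositions (the states)."""
    word = [rng.choice(S5) for _ in range(length)]
    state = tuple(range(5))  # identity permutation
    targets = []
    for g in word:
        state = compose(state, g)  # fold the next generator into the state
        targets.append(state)
    return word, targets

word, targets = make_example(8)
print(targets[-1])  # the group element a model must predict at the end
```

Because S5 is non-solvable, computing these prefix compositions is NC1-complete by Barrington's theorem, which is why the task lies out of reach for TC0-bounded architectures.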

Background

The authors establish that linear SSMs (S4, S6/Mamba), like transformers, can only express functions in L-uniform TC0, which implies they cannot express NC1-hard state-tracking tasks such as the S5 word problem. They complement this with empirical evidence that these models struggle on S5 relative to RNNs.

They further report that these models also struggle on seemingly easier state-tracking problems, raising the theoretical question of whether such tasks are expressible at all by fixed-depth transformers and linear SSMs. Resolving this question would delineate the precise expressivity frontier of these architectures below NC1-hardness.
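For contrast, here is a hedged sketch of what a task "less complex than permutation composition" can look like, taking the cyclic group Z60 as an illustrative stand-in (the paper's exact easier benchmarks may differ). The word problem of an abelian group reduces to a running sum, so the state is a single residue modulo 60 rather than a full permutation.

```python
# Illustrative sketch of an "easier" state-tracking task: the Z60 word problem.
# Z60 is abelian, so its word problem is not NC1-hard, unlike S5's.
import random

def make_z60_example(length, rng=random):
    """A word over Z60 with running-sum targets (state = sum mod 60)."""
    word = [rng.randrange(60) for _ in range(length)]
    state, targets = 0, []
    for g in word:
        state = (state + g) % 60  # abelian: state tracking collapses to counting
        targets.append(state)
    return word, targets
```

Tasks in this regime sit below NC1-hardness, so the TC0 upper bound does not rule them out; whether fixed-depth transformers and linear SSMs can actually express solutions to them is exactly the open question above.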

References

"We also find that both transformers and SSMs struggle compared to RNNs on state-tracking problems less complex than permutation composition where it is not known whether they can express a solution."

The Illusion of State in State-Space Models (arXiv:2404.08819, Merrill et al., 12 Apr 2024), Introduction, empirical findings paragraph.