Expressivity of transformers and linear SSMs on state-tracking tasks below permutation composition
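Determine whether fixed-depth transformers and generalized linear state-space models (SSMs), specifically S4 and the S6 layer used in Mamba, can express solutions to state-tracking problems that are less complex than permutation composition, i.e. the word problem for S5, the symmetric group on five elements (not to be confused with the SSM architectures named S4 and S6).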
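To make the notion of a state-tracking problem concrete, here is a minimal Python sketch of two word problems: composing permutations of five elements (the S5 case) and adding residues in a cyclic group. The cyclic group is only an assumed illustration of a simpler task, not necessarily one of the problems the source has in mind, and the function names `compose`, `track_permutations`, and `track_cyclic` are hypothetical.

```python
def compose(p, q):
    """Return the permutation 'apply p, then q'.

    Permutations are tuples of images: p[i] is where i is sent.
    """
    return tuple(q[i] for i in p)


def track_permutations(word, n=5):
    """Running state for the word problem over permutations of n elements.

    `word` is a sequence of permutations; the state after each token is
    the composition of all tokens read so far.
    """
    state = tuple(range(n))  # start from the identity permutation
    states = []
    for p in word:
        state = compose(state, p)
        states.append(state)
    return states


def track_cyclic(word, m):
    """Running state for the word problem over the cyclic group Z_m.

    Assumed here as an example of a simpler state-tracking task:
    the state is just the running sum modulo m.
    """
    state = 0
    states = []
    for x in word:
        state = (state + x) % m
        states.append(state)
    return states


if __name__ == "__main__":
    swap = (1, 0, 2, 3, 4)   # transposition of elements 0 and 1
    cycle = (1, 2, 0, 3, 4)  # 3-cycle on elements 0, 1, 2
    print(track_permutations([swap, cycle, swap]))
    print(track_cyclic([1, 1, 1], m=2))  # parity of the number of ones
```

Both loops are sequential by construction, which is how a recurrent model would solve them. The open question is whether fixed-depth transformers and linear SSMs can express the easier variants without such a recurrence.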
References
We also find that both transformers and SSMs struggle compared to RNNs on state-tracking problems less complex than permutation composition where it is not known whether they can express a solution.
— The Illusion of State in State-Space Models
(2404.08819 - Merrill et al., 12 Apr 2024) in Introduction, empirical findings paragraph