Emergent In-Context Learning in State-Space Models
Determine whether in-context learning (ICL) is an emergent capability in state-space models (SSMs) such as Mamba, comparable to the emergent ICL observed in Transformer language models.
References
While Mamba may learn to copy and perform simple ICL when explicitly trained to do so, it is not clear if ICL is an emergent capability in SSMs, as is typical of Transformer models.
— Jamba: A Hybrid Transformer-Mamba Language Model
(arXiv:2403.19887, Lieber et al., 28 Mar 2024), Section 5.2 ("Why does the Combination Work?")
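One way to make the question operational is to measure few-shot accuracy on a simple symbolic task that the model must solve purely from context, and to track that accuracy across model scale or training compute. The sketch below is illustrative only: the `complete` callable is a hypothetical stand-in for a real checkpoint's text completion (e.g. a Mamba or Transformer model), and the mock model shown simply solves the copy task to demonstrate the harness.

```python
def build_prompt(demos, query):
    """Format input/output demonstrations followed by an unanswered query."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nOutput:")
    return "\n".join(lines)

def icl_accuracy(complete, demos, tests):
    """Fraction of held-out queries answered correctly from context alone."""
    correct = 0
    for query, target in tests:
        answer = complete(build_prompt(demos, query)).strip()
        correct += (answer == target)
    return correct / len(tests)

def mock_copy_model(prompt):
    # Hypothetical model that has "learned" the copy task in context:
    # it echoes back the final query string. A real probe would replace
    # this with generation from an actual checkpoint.
    last_query = prompt.rsplit("Input: ", 1)[1]
    return " " + last_query.split("\nOutput:")[0]

demos = [("abc", "abc"), ("q7r", "q7r")]
tests = [("xyz", "xyz"), ("m2n", "m2n")]
print(icl_accuracy(mock_copy_model, demos, tests))  # → 1.0
```

Emergence, in the sense discussed for Transformers, would show up as this accuracy staying near chance for small SSMs and rising sharply at some scale, without the task ever appearing in training.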