Analogy between AMICL/residual attention streams and hippocampal skip connections
Establish whether the Associative Memory for In-Context Learning (AMICL) model and the residual attention value stream architecture in Transformers are analogous to skip connections observed in the hippocampus, such as direct CA3-to-CA1 pathways that bypass CA2.
References
Whether our AMICL model or residual attention stream modification can be considered analogous to skip connections witnessed in the hippocampus remains to be studied.
Source: "Associative memory inspires improvements for in-context learning using a novel attention residual stream architecture" (2412.15113, Burns et al., 19 Dec 2024), Discussion section