Transformer Mechanisms Mimic Frontostriatal Gating Operations When Trained on Human Working Memory Tasks (2402.08211v1)

Published 13 Feb 2024 in cs.AI

Abstract: Models based on the Transformer neural network architecture have seen success on a wide variety of tasks that appear to require complex "cognitive branching" -- or the ability to maintain pursuit of one goal while accomplishing others. In cognitive neuroscience, success on such tasks is thought to rely on sophisticated frontostriatal mechanisms for selective gating, which enable role-addressable updating -- and later readout -- of information to and from distinct "addresses" of memory, in the form of clusters of neurons. However, Transformer models have no such mechanisms intentionally built-in. It is thus an open question how Transformers solve such tasks, and whether the mechanisms that emerge to help them to do so bear any resemblance to the gating mechanisms in the human brain. In this work, we analyze the mechanisms that emerge within a vanilla attention-only Transformer trained on a simple sequence modeling task inspired by a task explicitly designed to study working memory gating in computational cognitive neuroscience. We find that, as a result of training, the self-attention mechanism within the Transformer specializes in a way that mirrors the input and output gating mechanisms which were explicitly incorporated into earlier, more biologically-inspired architectures. These results suggest opportunities for future research on computational similarities between modern AI architectures and models of the human brain.
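The setup described in the abstract (a vanilla, attention-only Transformer trained on a sequence task that requires selectively storing and later recalling information) can be illustrated with a minimal sketch. The store/ignore/recall toy task, the vocabulary, the class and function names, and the model hyperparameters below are illustrative assumptions loosely inspired by working-memory gating paradigms such as the reference-back task; they are not the task, architecture, or training configuration actually used in the paper.

```python
# Minimal sketch (assumptions, not the paper's setup): a toy store/ignore/recall
# sequence task loosely inspired by working-memory gating paradigms, and a small
# attention-only Transformer (causal self-attention blocks with residuals, no MLPs).
import torch
import torch.nn as nn
import torch.nn.functional as F

# Vocabulary: op tokens followed by symbol tokens.
OPS = ["STORE_A", "STORE_B", "IGNORE", "RECALL_A", "RECALL_B"]
N_SYMBOLS = 10
VOCAB = len(OPS) + N_SYMBOLS      # symbol i has token id len(OPS) + i
SEQ_PAIRS = 6                     # (op, symbol) pairs before the recall query

def make_batch(batch_size, device="cpu"):
    """Each sequence is SEQ_PAIRS (op, symbol) pairs followed by a RECALL token.
    The target is the most recent symbol stored to the queried register."""
    seqs, targets = [], []
    for _ in range(batch_size):
        while True:
            regs = {"A": None, "B": None}
            toks = []
            for _ in range(SEQ_PAIRS):
                op = torch.randint(0, 3, (1,)).item()         # STORE_A / STORE_B / IGNORE
                sym = torch.randint(0, N_SYMBOLS, (1,)).item()
                toks += [op, len(OPS) + sym]
                if op == 0: regs["A"] = sym
                if op == 1: regs["B"] = sym
            reg = "A" if torch.rand(1).item() < 0.5 else "B"
            if regs[reg] is not None:                          # resample if register empty
                break
        toks.append(3 if reg == "A" else 4)                    # RECALL_A or RECALL_B
        seqs.append(toks)
        targets.append(regs[reg])
    return torch.tensor(seqs, device=device), torch.tensor(targets, device=device)

class AttnOnlyTransformer(nn.Module):
    """Token + position embeddings, causal self-attention blocks (no MLPs), linear head."""
    def __init__(self, vocab, d_model=64, n_heads=4, n_layers=2, max_len=32):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.blocks = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True)
             for _ in range(n_layers)])
        self.head = nn.Linear(d_model, N_SYMBOLS)

    def forward(self, x):
        positions = torch.arange(x.size(1), device=x.device)
        h = self.tok(x) + self.pos(positions)
        S = x.size(1)
        mask = torch.triu(torch.full((S, S), float("-inf"), device=x.device), diagonal=1)
        for attn in self.blocks:
            out, _ = attn(h, h, h, attn_mask=mask)             # causal self-attention
            h = h + out                                        # residual stream, no MLP
        return self.head(h[:, -1])                             # predict at the query position

model = AttnOnlyTransformer(VOCAB)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    x, y = make_batch(64)
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 200 == 0:
        acc = (model(x).argmax(-1) == y).float().mean().item()
        print(f"step {step}: loss {loss.item():.3f}  acc {acc:.2f}")
```

After training a model of this kind, inspecting per-head attention patterns at store and recall positions is the style of analysis the paper uses to relate emergent attention specialization to the input and output gating operations of frontostriatal models.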

Authors (4)
  1. Aaron Traylor (3 papers)
  2. Jack Merullo (15 papers)
  3. Michael J. Frank (6 papers)
  4. Ellie Pavlick (66 papers)