Function Vectors in Large Language Models (2310.15213v2)

Published 23 Oct 2023 in cs.CL and cs.LG

Abstract: We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer LMs. Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are robust to changes in context, i.e., they trigger execution of the task on inputs such as zero-shot and natural text settings that do not resemble the ICL contexts from which they are collected. We test FVs across a range of tasks, models, and layers and find strong causal effects across settings in middle layers. We investigate the internal structure of FVs and find that while they often contain information that encodes the output space of the function, this information alone is not sufficient to reconstruct an FV. Finally, we test semantic vector composition in FVs, and find that to some extent they can be summed to create vectors that trigger new complex tasks. Our findings show that compact, causal internal vector representations of function abstractions can be explicitly extracted from LLMs. Our code and data are available at https://functions.baulab.info.

Summary

  • Using causal mediation analysis, the paper shows that a small subset of transformer attention heads transport function vectors (FVs) that encode abstract task representations.
  • The study localizes these vectors predominantly in middle layers and demonstrates their causal role in zero-shot task execution as well as their potential for function composition.
  • The findings go beyond traditional token-based vector arithmetic, suggesting new directions for LLM interpretability and architectural design.

Function Vectors in LLMs: A Critical Examination

The paper "Function Vectors in LLMs" presents an intriguing analysis of the behavior of autoregressive transformer LMs by identifying and exploring the concept of function vectors (FVs). These vectors provide a compact representation that encapsulates the input-output task demonstrated within a model's in-context learning (ICL). By leveraging causal mediation analysis, the authors meticulously decipher how specific attention heads within transformers are responsible for transporting these function vectors, ultimately enabling the model to trigger appropriate task execution.

Core Findings

Central to the paper is the discovery that a small subset of attention heads transports compact task representations, which the authors term function vectors. The authors quantify this phenomenon using causal mediation analysis across a variety of tasks and model layers, finding the strongest causal effects predominantly in the middle layers of the architecture. These vectors encode the demonstrated task abstractly enough to trigger its execution in zero-shot and other contextually diverse settings.

The authors further examine the internal structure of these FVs. Decoding the vectors through the vocabulary suggests that, although they often surface output-space words, the information they carry goes beyond a simple word distribution: an FV cannot be reconstructed from that vocabulary information alone. This distinguishes FVs from the contextual embeddings and token-based vector offsets familiar from earlier work, indicating a deeper, nontrivial representation of task abstraction.

Methodological Approach

The paper employs causal mediation analysis to isolate the role of function vectors. The authors compute task-conditioned mean activations for attention heads and measure the causal effect of patching those activations into contexts that lack clean task demonstrations. This approach highlights attention heads in intermediate layers as the carriers of task representations, essentially acting as conduits for the function vectors.
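
As a rough illustration of this pipeline, the sketch below (not the authors' released code) extracts a crude function vector from GPT-2 using HuggingFace Transformers. For brevity it averages the output of whole attention blocks at a few arbitrarily chosen middle layers, whereas the paper isolates individual attention heads selected by their average indirect effect; the choice of GPT-2, the layer indices, and the prompts are all illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in; the paper studies larger models such as GPT-J
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Hypothetical choice of middle layers; the paper instead selects individual
# attention heads ranked by their average indirect effect under causal mediation.
layers = [5, 6, 7]

# A few antonym-style ICL prompts (illustrative only).
icl_prompts = [
    "hot : cold\nbig : small\nfast :",
    "up : down\nhappy : sad\nlight :",
]

captured = {l: [] for l in layers}

def make_hook(l):
    def hook(module, inputs, output):
        out = output[0] if isinstance(output, tuple) else output
        # Keep the attention block's contribution at the final token position.
        captured[l].append(out[:, -1, :].detach())
    return hook

handles = [model.transformer.h[l].attn.register_forward_hook(make_hook(l))
           for l in layers]

with torch.no_grad():
    for p in icl_prompts:
        model(**tok(p, return_tensors="pt"))

for h in handles:
    h.remove()

# The function vector: sum over components of the task-conditioned mean activation.
fv = sum(torch.cat(captured[l], dim=0).mean(dim=0) for l in layers)
print(fv.shape)  # torch.Size([768]) for GPT-2
```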

Numerous tasks, including verb-tense transformation, translation, and question answering, were used to test the portability and efficacy of function vectors. The hypothesis that FVs can be summed to induce new, composite task behavior was tested explicitly, offering evidence of their potential for function composition.
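
Continuing the assumptions and variables from the previous snippet, the sketch below injects the extracted `fv` into the residual stream at a single hypothetical middle layer while the model processes a zero-shot prompt, and shows where composition (injecting a sum of two FVs) and output-space decoding would slot in. It illustrates the mechanics of the intervention rather than reproducing the paper's exact procedure.

```python
def run_with_fv(prompt, fv=None, layer=6):
    """Greedy next-token prediction, optionally adding a function vector
    to the residual stream at `layer` (an illustrative choice)."""
    handle = None
    if fv is not None:
        def add_fv(module, inputs, output):
            hidden = output[0]                          # (batch, seq, hidden)
            hidden[:, -1, :] = hidden[:, -1, :] + fv    # inject FV at the last token
            return (hidden,) + output[1:]
        handle = model.transformer.h[layer].register_forward_hook(add_fv)
    with torch.no_grad():
        logits = model(**tok(prompt, return_tensors="pt")).logits[0, -1]
    if handle is not None:
        handle.remove()
    return tok.decode(logits.argmax().item())

print(run_with_fv("slow :"))          # zero-shot baseline, no task demonstrations
print(run_with_fv("slow :", fv=fv))   # with the (crude) antonym FV injected

# Composition in this framing is vector addition: given FVs for two tasks,
# injecting their sum is the kind of summed vector the paper tests for
# triggering new composite tasks.

# Inspecting the FV's output space (logit-lens style): project it through the
# final layer norm and unembedding to see which tokens it promotes.
with torch.no_grad():
    vocab_logits = model.lm_head(model.transformer.ln_f(fv))
print([tok.decode(i) for i in vocab_logits.topk(5).indices.tolist()])
```

With a small model and whole-layer averaging, the injected vector may not reliably trigger the task; the point of the sketch is the shape of the causal intervention, not a reproduction of the paper's results.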

Implications and Future Directions

The concept of function vectors marks a notable step in understanding the mechanics of LLMs. It suggests that these models not only capture semantically rich embeddings but also encode task-specific functional mappings. Practically, this points to promising pathways for model interpretability and controlled manipulation by targeting these vectors.

Additionally, the insights gained could influence future architectural designs or training paradigms for transformer models, for example by facilitating such vector-based function representations more explicitly. This opens the door to models capable of efficient multitasking and to more refined compositional AI systems.

In summary, the paper offers a detailed exposition of a novel aspect of transformer behavior. It extends beyond mere word association, exploring the critical role of function vectors in realizing complex task behaviors. While advancing the scientific discussion of transformer models, it also hints at broader applications in AI research and beyond, warranting further exploration and empirical validation.
