Function Vectors in Large Language Models (2310.15213v2)
Abstract: We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer LMs. Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are robust to changes in context, i.e., they trigger execution of the task on inputs such as zero-shot and natural text settings that do not resemble the ICL contexts from which they are collected. We test FVs across a range of tasks, models, and layers and find strong causal effects across settings in middle layers. We investigate the internal structure of FVs and find that while they often contain information that encodes the output space of the function, this information alone is not sufficient to reconstruct an FV. Finally, we test semantic vector composition in FVs, and find that to some extent they can be summed to create vectors that trigger new complex tasks. Our findings show that compact, causal internal vector representations of function abstractions can be explicitly extracted from LLMs. Our code and data are available at https://functions.baulab.info.
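The mechanism described in the abstract lends itself to a short sketch: collect the outputs of a handful of causally important attention heads at the final token of ICL prompts, average them per head, sum the averages into a single function vector, and add that vector to a middle-layer hidden state of an unrelated prompt. The PyTorch snippet below is a minimal illustration under stated assumptions, not the paper's released code; the layer/head indices, tensor shapes, and the `head_outputs` placeholder are hypothetical stand-ins for activations one would collect with forward hooks.

```python
import torch

# Illustrative dimensions (not tied to any specific model).
d_model, n_prompts = 4096, 100

# Assume head_outputs[(layer, head)] holds each selected attention head's output
# at the final token of n_prompts ICL prompts for one task, already projected
# into the residual stream (shape [n_prompts, d_model]). Random tensors stand in
# for real activations here.
head_outputs = {
    (9, 3): torch.randn(n_prompts, d_model),
    (12, 7): torch.randn(n_prompts, d_model),
    (14, 1): torch.randn(n_prompts, d_model),
}

# A function vector is, roughly, the sum over a small set of causally important
# heads of their mean task-conditioned outputs.
function_vector = sum(acts.mean(dim=0) for acts in head_outputs.values())

# To trigger the task zero-shot, the FV is added to the residual-stream hidden
# state at a middle layer for the last token of an unrelated prompt; only the
# arithmetic is shown here.
hidden_state = torch.randn(d_model)           # hidden state at some middle layer
patched_state = hidden_state + function_vector
```

The composition result in the abstract corresponds, in this picture, to summing two function vectors (e.g., one for each subtask) before patching them into the hidden state.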