
AttentionViz: A Global View of Transformer Attention (2305.03210v2)

Published 4 May 2023 in cs.HC, cs.CL, cs.CV, and cs.LG

Abstract: Transformer models are revolutionizing machine learning, but their inner workings remain mysterious. In this work, we present a new visualization technique designed to help researchers understand the self-attention mechanism in transformers that allows these models to learn rich, contextual relationships between elements of a sequence. The main idea behind our method is to visualize a joint embedding of the query and key vectors used by transformer models to compute attention. Unlike previous attention visualization techniques, our approach enables the analysis of global patterns across multiple input sequences. We create an interactive visualization tool, AttentionViz (demo: http://attentionviz.com), based on these joint query-key embeddings, and use it to study attention mechanisms in both language and vision transformers. We demonstrate the utility of our approach in improving model understanding and offering new insights about query-key interactions through several application scenarios and expert feedback.

Authors (6)
  1. Catherine Yeh (6 papers)
  2. Yida Chen (8 papers)
  3. Aoyu Wu (21 papers)
  4. Cynthia Chen (8 papers)
  5. Fernanda Viégas (23 papers)
  6. Martin Wattenberg (39 papers)
Citations (38)

Summary

AttentionViz: A Global View of Transformer Attention

The paper presents "AttentionViz," an advanced interactive tool designed to enhance the understanding of self-attention mechanisms within transformer models by offering a novel visualization technique. Transformers have become pivotal in both NLP and computer vision, yet their internal operations remain opaque. This work seeks to demystify these processes, providing researchers with insights into the query-key interactions that underpin transformer architecture.

A core innovation lies in representing query and key vectors in a shared, lower-dimensional joint embedding space, allowing comparative visual analysis across many input sequences at once. This contrasts with conventional methods, which typically focus on a single input instance and use bipartite graphs or heatmaps to illustrate attention scores. Applying dimensionality reduction techniques such as t-SNE, UMAP, or PCA, AttentionViz visually encodes query-key relationships in this joint embedding space, under the hypothesis that closer proximity in the projection corresponds to higher attention values.
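A minimal sketch of this joint-embedding idea, assuming randomly generated query and key vectors for a single attention head (the paper uses t-SNE, UMAP, or PCA; PCA via SVD is used here to keep the example self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
d_head = 64      # illustrative per-head dimension
n_tokens = 50

# Hypothetical query and key vectors for one attention head.
Q = rng.normal(size=(n_tokens, d_head))
K = rng.normal(size=(n_tokens, d_head))

# Joint embedding: stack queries and keys into one matrix,
# then project all of them into the same 2-D space.
X = np.vstack([Q, K])
X_centered = X - X.mean(axis=0)

# Top-2 principal directions via SVD (a stand-in for t-SNE/UMAP).
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
coords = X_centered @ Vt[:2].T        # shape (2 * n_tokens, 2)

# Queries and keys now share one coordinate system and can be
# plotted together; nearby query-key pairs are candidates for
# high attention under the paper's proximity hypothesis.
q_coords, k_coords = coords[:n_tokens], coords[n_tokens:]
print(q_coords.shape, k_coords.shape)
```

Because queries and keys are projected jointly rather than separately, distances between a query point and a key point are meaningful in the shared space, which is what enables the cross-sequence pattern analysis described above.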

The paper details the mathematical basis for this approach, emphasizing the translation invariance of the softmax function, which is exploited to align the centroids of the query and key distributions without changing attention. Another noteworthy technique is scaling the norms of the query and key vectors to better reflect their influence on attention, addressing norm disparities that could otherwise obscure the visualization's interpretability.
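The translation-invariance argument can be checked numerically: shifting every key by the same vector adds the same constant to every attention logit for a given query, so the softmax weights are unchanged. A small sketch with synthetic vectors:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())   # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(1)
d = 16
q = rng.normal(size=d)          # one query vector
K = rng.normal(size=(10, d))    # keys it attends over

# Shift every key by the same vector t (e.g. to move the key
# centroid onto the query centroid, as in the alignment step).
t = rng.normal(size=d)
scores = K @ q
scores_shifted = (K + t) @ q    # each score gains the same constant q @ t

# Attention weights are identical: softmax ignores a constant shift.
assert np.allclose(softmax(scores), softmax(scores_shifted))
print("attention unchanged under key translation")
```

This is why the centroids of the query and key clouds can be moved on top of each other for visualization purposes without altering the attention pattern the model actually computes.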

AttentionViz enables exploration at multiple scales and levels of detail. The Matrix View permits global pattern recognition across transformer layers and heads, Single View facilitates detailed investigation of individual attention heads, and Sentence/Image View offers fine-grained analysis of attention patterns within specific data. These diverse views ensure comprehensive analysis and support both global assessment and specific inquiry into model behavior.

Several findings emerge from applying AttentionViz to models like BERT, GPT-2, and ViT. In BERT, identifiable visual patterns, such as spirals, are associated with positional attention, supporting the hypothesis that visual form can reveal function. In vision transformers like ViT, specific heads exhibit clustering based on color brightness or spatial frequency, suggesting specialization in detecting low-level visual features.

The research also reveals notable anomalies and potentially suboptimal mechanisms in models such as GPT-2, where disparities between query and key norms and widespread attention to null tokens challenge existing assumptions about transformer optimization and indicate areas for further investigation.

AttentionViz is a significant step toward improving transformer interpretability, providing researchers with a robust framework to visualize and analyze the complexities of self-attention. The tool's capability to surface novel insights and patterns positions it as a valuable resource for further research into transformer models' inner workings. Future improvements could involve real-time data interactions and expanding the application scope to other transformer variants and attention mechanisms, enhancing both its usability and analytical depth.