
What do Vision Transformers Learn? A Visual Exploration (2212.06727v1)

Published 13 Dec 2022 in cs.CV

Abstract: Vision transformers (ViTs) are quickly becoming the de-facto architecture for computer vision, yet we understand very little about why they work and what they learn. While existing studies visually analyze the mechanisms of convolutional neural networks, an analogous exploration of ViTs remains challenging. In this paper, we first address the obstacles to performing visualizations on ViTs. Assisted by these solutions, we observe that neurons in ViTs trained with language model supervision (e.g., CLIP) are activated by semantic concepts rather than visual features. We also explore the underlying differences between ViTs and CNNs, and we find that transformers detect image background features, just like their convolutional counterparts, but their predictions depend far less on high-frequency information. On the other hand, both architecture types behave similarly in the way features progress from abstract patterns in early layers to concrete objects in late layers. In addition, we show that ViTs maintain spatial information in all layers except the final layer. In contrast to previous works, we show that the last layer most likely discards the spatial information and behaves as a learned global pooling operation. Finally, we conduct large-scale visualizations on a wide range of ViT variants, including DeiT, CoaT, ConViT, PiT, Swin, and Twin, to validate the effectiveness of our method.

Authors (8)
  1. Amin Ghiasi (11 papers)
  2. Hamid Kazemi (9 papers)
  3. Eitan Borgnia (9 papers)
  4. Steven Reich (9 papers)
  5. Manli Shu (23 papers)
  6. Micah Goldblum (96 papers)
  7. Andrew Gordon Wilson (133 papers)
  8. Tom Goldstein (226 papers)
Citations (51)
