SANVis: Visual Analytics for Understanding Self-Attention Networks (1909.09595v1)

Published 13 Sep 2019 in cs.CL, cs.LG, and cs.NE

Abstract: Attention networks, a deep neural network architecture inspired by humans' attention mechanism, have seen significant success in image captioning, machine translation, and many other applications. Recently, they have been further evolved into an advanced approach called multi-head self-attention networks, which can encode a set of input vectors, e.g., word vectors in a sentence, into another set of vectors. Such encoding aims at simultaneously capturing diverse syntactic and semantic features within a set, each of which corresponds to a particular attention head, forming altogether multi-head attention. Meanwhile, the increased model complexity prevents users from easily understanding and manipulating the inner workings of models. To tackle the challenges, we present a visual analytics system called SANVis, which helps users understand the behaviors and the characteristics of multi-head self-attention networks. Using a state-of-the-art self-attention model called Transformer, we demonstrate usage scenarios of SANVis in machine translation tasks. Our system is available at http://short.sanvis.org
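The multi-head self-attention described in the abstract (encoding a set of input vectors into another set, with each head capturing its own pattern) can be sketched roughly as follows. This is an illustrative NumPy sketch, not the paper's code; the function name, weight shapes, and random initialization are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    # X: (seq_len, d_model), e.g., word vectors of a sentence.
    # Wq/Wk/Wv/Wo: (d_model, d_model) learned projections (hypothetical here).
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    # Project to queries, keys, values, then split into heads:
    # shape (n_heads, seq_len, d_head).
    Q = (X @ Wq).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    K = (X @ Wk).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    V = (X @ Wv).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    # Scaled dot-product attention per head: (n_heads, seq_len, seq_len).
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)  # per-head weights a tool like SANVis would visualize
    heads = attn @ V                 # (n_heads, seq_len, d_head)
    # Concatenate heads and mix with the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo, attn

rng = np.random.default_rng(0)
d_model, n_heads, seq_len = 8, 2, 5
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv, Wo = (0.1 * rng.normal(size=(d_model, d_model)) for _ in range(4))
out, attn = multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads)
print(out.shape, attn.shape)  # (5, 8) (2, 5, 5)
```

Note that the output has the same shape as the input, matching the abstract's description of encoding one set of vectors into another; the per-head attention matrices `attn` are the quantities a visual analytics tool inspects.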

Authors (10)
  1. Cheonbok Park (20 papers)
  2. Inyoup Na (2 papers)
  3. Yongjang Jo (1 paper)
  4. Sungbok Shin (12 papers)
  5. Jaehyo Yoo (5 papers)
  6. Bum Chul Kwon (24 papers)
  7. Jian Zhao (218 papers)
  8. Hyungjong Noh (5 papers)
  9. Yeonsoo Lee (9 papers)
  10. Jaegul Choo (161 papers)
Citations (34)
