
Acceptability Judgements via Examining the Topology of Attention Maps (2205.09630v2)

Published 19 May 2022 in cs.CL, cs.AI, cs.LG, and math.AT

Abstract: The role of the attention mechanism in encoding linguistic knowledge has received special interest in NLP. However, the ability of attention heads to judge the grammatical acceptability of a sentence has been underexplored. This paper approaches the paradigm of acceptability judgments with topological data analysis (TDA), showing that the geometric properties of the attention graph can be efficiently exploited for two standard practices in linguistics: binary judgments and linguistic minimal pairs. Topological features enhance the BERT-based acceptability classifier scores by 8%-24% on CoLA in three languages (English, Italian, and Swedish). By revealing the topological discrepancy between attention maps of minimal pairs, we achieve human-level performance on the BLiMP benchmark, outperforming nine statistical and Transformer LM baselines. At the same time, TDA provides the foundation for analyzing the linguistic functions of attention heads and interpreting the correspondence between the graph features and grammatical phenomena.
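
To make the notion of "topological features of the attention graph" concrete, the following is a minimal sketch, not the authors' pipeline. The abstract does not specify the feature set, so the choices here are assumptions made for illustration: the function name `attention_graph_features`, the toy attention matrix, the symmetrization by taking the larger of the two directed weights, and the thresholds. The sketch thresholds one attention matrix into an undirected graph and reports its two Betti numbers: the number of connected components (b0) and the number of independent cycles (b1 = E - V + b0).

```python
from typing import Dict, List


def attention_graph_features(attn: List[List[float]], threshold: float) -> Dict[str, int]:
    """Betti numbers of the undirected graph obtained by thresholding an attention matrix.

    Illustrative assumption: tokens i and j are linked when the attention weight
    in either direction is at least `threshold`. Self-attention is ignored.
    """
    n = len(attn)
    parent = list(range(n))  # union-find forest over token indices

    def find(i: int) -> int:
        # Find the component representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(attn[i][j], attn[j][i]) >= threshold:
                edges += 1
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j  # merge the two components

    b0 = len({find(i) for i in range(n)})  # connected components
    b1 = edges - n + b0                    # independent cycles of a graph
    return {"vertices": n, "edges": edges, "b0": b0, "b1": b1}


# Hypothetical 4-token attention matrix (each row sums to 1).
toy_attention = [
    [0.70, 0.20, 0.05, 0.05],
    [0.30, 0.40, 0.25, 0.05],
    [0.10, 0.30, 0.50, 0.10],
    [0.25, 0.05, 0.10, 0.60],
]

for t in (0.5, 0.2, 0.1):
    print(t, attention_graph_features(toy_attention, t))
```

Lowering the threshold adds edges, so components merge and cycles appear. Tracking how such counts change across thresholds is the general idea behind using graph topology as a feature vector for a classifier.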

Authors (10)
  1. Daniil Cherniavskii (7 papers)
  2. Eduard Tulchinskii (10 papers)
  3. Vladislav Mikhailov (31 papers)
  4. Irina Proskurina (5 papers)
  5. Laida Kushnareva (12 papers)
  6. Ekaterina Artemova (53 papers)
  7. Serguei Barannikov (23 papers)
  8. Irina Piontkovskaya (24 papers)
  9. Dmitri Piontkovski (23 papers)
  10. Evgeny Burnaev (189 papers)
Citations (16)