Integrating Dependency Tree Into Self-attention for Sentence Representation (2203.05918v3)

Published 11 Mar 2022 in cs.CL and cs.LG

Abstract: Recent progress on parse tree encoders for sentence representation learning is notable. However, these works mainly encode tree structures recursively, which is not conducive to parallelization. Moreover, they rarely take into account the labels of arcs in dependency trees. To address both issues, we propose Dependency-Transformer, which applies a relation-attention mechanism that works in concert with the self-attention mechanism. This mechanism aims to encode the dependency and spatial positional relations between nodes in the dependency tree of a sentence. Through a score-based method, we inject the syntactic information without affecting the Transformer's parallelizability. Our model outperforms or is comparable to state-of-the-art methods on four sentence-representation tasks and has clear advantages in computational efficiency.
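
To make the abstract's idea concrete, the sketch below illustrates one generic way a score-based syntactic bias can be folded into self-attention: a scalar score looked up per dependency-relation label is added to the pairwise attention logits before the softmax, so the whole bias is a precomputed matrix and parallel computation is preserved. This is only a minimal illustration under assumed parameterization (one learned scalar per relation label), not the paper's exact Dependency-Transformer formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Toy dimensions (hypothetical): sentence length, hidden size, relation labels
n, d, n_rel = 5, 8, 4
rng = np.random.default_rng(0)

Q = rng.normal(size=(n, d))   # query projections (assumed precomputed)
K = rng.normal(size=(n, d))   # key projections
V = rng.normal(size=(n, d))   # value projections

# rel[i, j] = id of the dependency relation between tokens i and j,
# produced by a parser; 0 could stand for "no direct arc".
rel = rng.integers(0, n_rel, size=(n, n))

# Assumed parameterization: one learned scalar score per relation label.
rel_score = rng.normal(size=(n_rel,))

# Standard scaled dot-product attention logits.
logits = Q @ K.T / np.sqrt(d)

# Score-based injection: add the relation score for every token pair
# to the logits before the softmax. The bias is an n x n lookup,
# so the layer remains fully parallelizable.
logits = logits + rel_score[rel]

attn = softmax(logits, axis=-1)
output = attn @ V
print(output.shape)  # (5, 8)
```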

Authors (5)
  1. Junhua Ma (3 papers)
  2. Jiajun Li (66 papers)
  3. Yuxuan Liu (97 papers)
  4. Shangbo Zhou (2 papers)
  5. Xue Li (124 papers)
Citations (2)