Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models (2407.17406v1)

Published 24 Jul 2024 in cs.CL and cs.AI

Abstract: Syntactic Transformer LLMs aim to achieve better generalization by simultaneously modeling syntax trees and sentences. While prior work has focused on adding constituency-based structures to Transformers, we introduce Dependency Transformer Grammars (DTGs), a new class of Transformer LLM with an explicit dependency-based inductive bias. DTGs simulate dependency transition systems with constrained attention patterns by modifying attention masks, incorporate stack information through relative positional encoding, and augment dependency arc representations with a combination of token embeddings and operation embeddings. When trained on a dataset of sentences annotated with dependency trees, DTGs achieve better generalization while maintaining perplexity comparable to Transformer LLM baselines. DTGs also outperform recent constituency-based models, showing that dependency can better guide Transformer LLMs. Our code is released at https://github.com/zhaoyd1/Dep_Transformer_Grammars.
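
To make the "constrained attention simulates a transition system" idea concrete, here is a minimal sketch of how an attention mask could be derived from an arc-standard-style transition sequence, so that each step attends only to tokens currently on the parser stack. This is an illustrative assumption, not the authors' implementation; the transition names, mask semantics, and helper function are hypothetical.

```python
# Illustrative sketch (not the DTG implementation): build a boolean attention
# mask from a dependency transition sequence so that each step may attend only
# to tokens still on the simulated stack. Transition inventory and mask
# semantics are assumptions for exposition.

import torch

SHIFT, LEFT_ARC, RIGHT_ARC = "SHIFT", "LEFT-ARC", "RIGHT-ARC"

def stack_attention_mask(transitions):
    """Return a (T, T) boolean mask where entry (i, j) is True if step i may
    attend to step j, i.e. the token shifted at step j is still on the stack
    when step i is processed."""
    T = len(transitions)
    mask = torch.zeros(T, T, dtype=torch.bool)
    stack = []  # indices of SHIFT steps whose tokens remain on the stack
    for i, op in enumerate(transitions):
        # Each step sees itself plus whatever is currently on the stack.
        mask[i, i] = True
        for j in stack:
            mask[i, j] = True
        # Update the simulated stack after processing this step.
        if op == SHIFT:
            stack.append(i)
        elif op in (LEFT_ARC, RIGHT_ARC) and len(stack) >= 2:
            # An arc operation pops the dependent: the second-from-top for
            # LEFT-ARC, the top for RIGHT-ARC (arc-standard convention).
            stack.pop(-2 if op == LEFT_ARC else -1)
    return mask

# Example: "the cat sleeps" with an arc-standard-style derivation.
ops = [SHIFT, SHIFT, LEFT_ARC, SHIFT, LEFT_ARC]
print(stack_attention_mask(ops).int())
```

Under this kind of scheme, the mask alone enforces the structural constraint; the paper's relative positional encoding of stack depth and the combined token/operation embeddings for arcs would be layered on top of such masked attention.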

Authors (3)
  1. Yida Zhao (12 papers)
  2. Chao Lou (8 papers)
  3. Kewei Tu (74 papers)