- The paper introduces AGGCNs, a novel approach that uses self-attention to softly prune dependency graphs, retaining the full syntactic structure while learning to focus on its most relevant substructures.
- It demonstrates significant improvements on cross-sentence n-ary relation extraction, with accuracy gains of up to 8% on ternary relations, and achieves higher F1 scores than prior models on the sentence-level TACRED benchmark.
- The integration of dense connectivity and a linear combination layer enhances multi-layer information flow, enabling robust analysis of complex linguistic relations.
An Overview of Attention Guided Graph Convolutional Networks for Relation Extraction
The paper "Attention Guided Graph Convolutional Networks for Relation Extraction" presents a sophisticated method for improving relation extraction tasks through the introduction of Attention Guided Graph Convolutional Networks (AGGCNs). This approach addresses the limitations of traditional hard-pruning strategies used in dependency-based models by adopting a soft-pruning method, which dynamically learns to attend to relevant substructures within dependency trees, thereby enhancing the extraction of relational data.
Background and Motivation
Relation extraction, a core task in natural language processing, involves identifying relationships among entities mentioned in text. Dependency trees, which encode syntactic relations between words, have traditionally been used to provide structural cues that purely sequential models miss. However, earlier dependency-based approaches rely on rule-based hard-pruning of these trees and therefore risk discarding information that is crucial for effective relation extraction.
To counter this challenge, the paper introduces AGGCNs—an innovative graph neural network architecture that can exploit the complete structural information of dependency trees. This model eschews the need for predefined pruning strategies by employing a self-attention mechanism to weight the relevance of each node and edge in a graph, thereby retaining complete syntactic structures while selectively focusing on the most relevant components.
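Concretely, each attention head derives its own soft adjacency matrix from the node representations using scaled dot-product attention. The expression below is the standard Transformer-style formulation that the paper adapts for this purpose; here Q and K are the node representation matrices used as queries and keys, W_t^Q and W_t^K are the projections of head t, and d is the head dimension.

```latex
\tilde{A}^{(t)} = \operatorname{softmax}\!\left(
    \frac{\bigl(Q\,W_t^{Q}\bigr)\bigl(K\,W_t^{K}\bigr)^{\top}}{\sqrt{d}}
\right)
```

Each entry of the resulting matrix acts as a weighted edge between two tokens, so the hard 0/1 adjacency of a pruned dependency tree is replaced by a fully connected graph with learned edge weights.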
Methodology
AGGCNs distinguish themselves from traditional Graph Convolutional Networks (GCNs) in multiple ways:
- Self-Attention Guided Learning: The AGGCN model replaces the static graph connections with a dynamic, attention-based reconfiguration. Through multi-head attention, the model derives edge weights that reflect the importance of relationships between nodes within the graph, allowing for adaptive 'soft-pruning' of the graph structure.
- Dense Connections: Inspired by DenseNet architectures, the model incorporates dense connections to facilitate information flow across multiple layers. This design aims to capture both local and non-local interactions more effectively, thereby enhancing the depth of learning without the drawback of information loss typically seen in deep networks.
- Linear Combination Layer: To aggregate the outputs produced under the different attention heads, the model uses a linear combination layer that concatenates the densely connected representations from each head and maps them back to the hidden dimension, yielding a single representation suitable for relational analysis. A simplified sketch of all three components follows this list.
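To make these components concrete, the following is a minimal PyTorch sketch of one attention-guided, densely connected block. It is an illustrative simplification rather than the authors' released implementation: the class name `AttentionGuidedGCNBlock` and the hyperparameters `num_heads` and `num_sublayers` are chosen here for readability, and details such as the initial block that operates on the original dependency adjacency matrix are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionGuidedGCNBlock(nn.Module):
    """One AGGCN-style block: multi-head attention produces soft adjacency
    matrices, each followed by a densely connected stack of GCN sublayers,
    and a linear layer combines the per-head outputs."""

    def __init__(self, dim, num_heads=2, num_sublayers=4):
        super().__init__()
        assert dim % num_heads == 0 and dim % num_sublayers == 0
        self.num_heads = num_heads
        self.num_sublayers = num_sublayers
        sub_dim = dim // num_sublayers

        # Projections used to derive the per-head soft adjacency matrices.
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)

        # Densely connected GCN sublayers: sublayer j sees the block input
        # concatenated with the outputs of sublayers 1..j-1.
        self.gcn_layers = nn.ModuleList([
            nn.ModuleList([
                nn.Linear(dim + j * sub_dim, sub_dim) for j in range(num_sublayers)
            ])
            for _ in range(num_heads)
        ])

        # Linear combination layer merging the per-head outputs back to `dim`.
        self.combine = nn.Linear(num_heads * dim, dim)

    def soft_adjacency(self, x):
        # Scaled dot-product attention yielding one (n x n) matrix per head;
        # no value projection is needed because the attention scores
        # themselves act as edge weights ("soft pruning").
        b, n, d = x.size()
        head_dim = d // self.num_heads
        q = self.query(x).view(b, n, self.num_heads, head_dim).transpose(1, 2)
        k = self.key(x).view(b, n, self.num_heads, head_dim).transpose(1, 2)
        scores = torch.matmul(q, k.transpose(-2, -1)) / head_dim ** 0.5
        return F.softmax(scores, dim=-1)          # (b, heads, n, n)

    def forward(self, x):
        adj = self.soft_adjacency(x)               # soft-pruned graphs
        head_outputs = []
        for h in range(self.num_heads):
            a = adj[:, h]                          # (b, n, n)
            dense_inputs = [x]
            for layer in self.gcn_layers[h]:
                inp = torch.cat(dense_inputs, dim=-1)
                # Graph convolution: aggregate neighbours with soft edge weights.
                out = F.relu(layer(torch.matmul(a, inp)))
                dense_inputs.append(out)
            # Concatenate the sublayer outputs (dense connectivity) -> (b, n, dim).
            head_outputs.append(torch.cat(dense_inputs[1:], dim=-1))
        # Linear combination of the densely connected outputs of all heads.
        return self.combine(torch.cat(head_outputs, dim=-1))
```

A quick shape check under these assumptions:

```python
block = AttentionGuidedGCNBlock(dim=300, num_heads=2, num_sublayers=4)
tokens = torch.randn(8, 40, 300)   # a batch of 8 sentences, 40 token vectors each
print(block(tokens).shape)         # torch.Size([8, 40, 300])
```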
Results and Contributions
The paper provides extensive evaluations on two key tasks—cross-sentence n-ary relation extraction and large-scale sentence-level relation extraction—demonstrating the efficacy of AGGCNs:
- For the cross-sentence n-ary relation extraction, AGGCNs outperform previous state-of-the-art models by significant margins, achieving improvements of up to 8% in accuracy for ternary relations.
- The model also shows competitive performance on the TACRED dataset, a benchmark for sentence-level relation extraction, achieving higher F1 scores than existing models, evidencing its robustness across different data conditions and pruning settings.
These outcomes highlight AGGCNs' ability to learn from complete syntactic structures and draw meaningful inferences for relation extraction while remaining computationally efficient.
Implications and Future Directions
The introduction of AGGCNs signifies a notable advancement in leveraging dependency structures for NLP applications. By allowing the model to self-determine relevant data segments through attention-guided soft-pruning and dense connectivity, AGGCNs extend the applicability of dependency-based approaches to broader and more complex linguistic relations, including cross-sentence scenarios.
Future research could explore extending AGGCNs beyond relation extraction to other graph-structured problems in NLP, such as coreference resolution or discourse parsing. Additionally, combining AGGCNs with self-supervised pretrained language models could further enhance contextual understanding and representation learning in low-resource settings.