A General Framework for Information Extraction using Dynamic Span Graphs (1904.03296v1)

Published 5 Apr 2019 in cs.CL

Abstract: We introduce a general framework for several information extraction tasks that share span representations using dynamically constructed span graphs. The graphs are constructed by selecting the most confident entity spans and linking these nodes with confidence-weighted relation types and coreferences. The dynamic span graph allows coreference and relation type confidences to propagate through the graph to iteratively refine the span representations. This is unlike previous multi-task frameworks for information extraction in which the only interaction between tasks is in the shared first-layer LSTM. Our framework significantly outperforms the state-of-the-art on multiple information extraction tasks across multiple datasets reflecting different domains. We further observe that the span enumeration approach is good at detecting nested span entities, with significant F1 score improvement on the ACE dataset.

Summary

The paper introduces a dynamic span graph framework that iteratively refines span representations for enhanced information extraction.
The methodology combines contextual token embeddings with gated graph propagation to effectively capture coreference and relational links.
Empirical results demonstrate significant F1 score improvements on datasets like ACE and SciERC, establishing state-of-the-art performance across IE tasks.

Overview of the Dynamic Span Graph Framework for Information Extraction

The paper introduces a general information extraction (IE) framework that improves several IE tasks through dynamic span graphs. By propagating contextual information to refine span representations, it improves on traditional pipelines and joint models and outperforms prior state-of-the-art systems across several domains. The framework is evaluated on entity recognition, relation extraction, and overlapping entity detection.

Contribution and Methodology

The core innovation of the framework is the dynamic span graph. Its nodes are the most confident candidate spans, and its edges are coreference and relation links weighted by confidence scores. Information propagates along these edges iteratively, so each span's representation is refined with context from linked spans before the task-specific predictions are made.
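The iterative, confidence-weighted propagation described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `propagate`, `edge_scores`, and `gate_weights` are hypothetical, a single edge type stands in for both coreference and relation links, and the edge scores are held fixed across iterations, whereas a dynamic graph would re-score them from the updated span representations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def propagate(spans, edge_scores, gate_weights, num_iters=2):
    """Refine span vectors by gated, confidence-weighted message passing.

    spans:        (n, d) initial span representations (graph nodes)
    edge_scores:  (n, n) raw confidence that span i links to span j
                  (assumption: fixed here, not re-scored per iteration)
    gate_weights: (2d, d) hypothetical gate parameters
    """
    g = spans
    weights = softmax(edge_scores, axis=1)  # normalize confidences per node
    for _ in range(num_iters):
        # Each node receives a confidence-weighted average of its neighbours.
        update = weights @ g
        # The gate decides, per dimension, how much of the old vector to keep.
        gate = sigmoid(np.concatenate([g, update], axis=1) @ gate_weights)
        g = gate * g + (1.0 - gate) * update
    return g
```

For example, `propagate(np.random.randn(5, 8), np.random.randn(5, 5), np.zeros((16, 8)))` returns a `(5, 8)` array. With all-zero gate weights the gate is 0.5 everywhere, so each iteration averages a span's vector with its neighbourhood summary.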

Key components of the framework include:

Token Representation Layer: Applies a BiLSTM over the input tokens, where each token's input is the concatenation of character, GloVe, and ELMo embeddings, to produce contextual token representations.
Span Representation Layer: Forms initial span representations by concatenating boundary token states, a soft attention-based headword representation, and span width features.
Coreference and Relation Propagation Layers: Engage in dynamic graph-based updates where span representations are iteratively refined using context from antecedents or relationally connected spans.
Dynamic Graph Construction: Constructs graphs with spans as nodes, adjusting edge weights according to coreference and relation confidences, facilitating information flow through a gating mechanism.
Multi-task Layer: Final predictions for entities, relations, and coreferences are made using the refined span embeddings.
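To make the span enumeration and the Span Representation Layer more concrete, here is a small NumPy sketch. It is an illustration under assumptions rather than the paper's code: `enumerate_spans`, `span_representation`, `attn_vector`, and `width_embeddings` are hypothetical names, the attention scorer is reduced to a single dot product, and the token states stand in for BiLSTM outputs.

```python
import numpy as np

def enumerate_spans(num_tokens, max_width):
    """Return every (start, end) pair, end inclusive, up to max_width tokens."""
    return [(start, end)
            for start in range(num_tokens)
            for end in range(start, min(start + max_width, num_tokens))]

def span_representation(token_states, start, end, attn_vector, width_embeddings):
    """Concatenate boundary states, an attention-weighted head, and a width feature.

    token_states:     (T, d) contextual token vectors (stand-in for BiLSTM outputs)
    attn_vector:      (d,)   hypothetical scorer for the soft head attention
    width_embeddings: (max_width, w) hypothetical learned width features
    """
    inside = token_states[start:end + 1]       # tokens covered by the span
    scores = inside @ attn_vector              # one attention score per token
    scores = scores - scores.max()             # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    head = alpha @ inside                      # soft headword representation
    width = end - start + 1
    return np.concatenate([token_states[start], token_states[end],
                           head, width_embeddings[width - 1]])
```

Because all spans up to a maximum width are enumerated, overlapping candidates such as `(0, 1)` and `(1, 2)` are both scored, which is what lets a span-based model represent nested or overlapping entities that a one-tag-per-token scheme cannot.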

Empirical Results

The framework has been rigorously tested on datasets like ACE2004/2005, SciERC, and GENIA, covering domains such as news, AI, and biomedical literature. The results indicate substantial performance gains:

Achieved relative improvements in F1 scores, such as a 25.8% gain in relation extraction on the ACE04 dataset.
Demonstrated capabilities in overlapping entity detection, with improvements up to 11.6% F1 on ACE04-O.
Established state-of-the-art performance on multiple IE tasks without relying on domain-specific syntactic tools.
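Since the gains above are reported as relative improvements, it may help to spell out how a relative gain differs from an absolute gain in F1 points. The snippet below is a small sketch; the function name and the example scores are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline_f1, new_f1):
    """Percentage gain of new_f1 over baseline_f1, relative to the baseline."""
    return 100.0 * (new_f1 - baseline_f1) / baseline_f1

# Hypothetical scores for illustration only (not taken from the paper).
baseline, new = 50.0, 60.0
absolute_gain = new - baseline                          # gain in F1 points
relative_gain = relative_improvement(baseline, new)     # gain as % of baseline
```

With these made-up numbers the absolute gain is 10 F1 points, while the relative gain is 20%, so a relative figure always looks larger than the same change expressed in points when the baseline is below 100.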

Implications and Future Directions

Practically, the incorporation of dynamic span graphs can lead to more robust and flexible IE systems usable across diverse datasets without customization of external tools for domain adaptation. Theoretically, this approach underscores the potential benefits of utilizing global context and refined task interaction in multi-task settings.

Beyond its immediate applications, the dynamic span graph methodology could be extended to more complex IE tasks, such as event extraction. Future research might also explore adapting the approach to low-resource settings or integrating it with language modeling tasks, potentially broadening the scope and impact of dynamic span graphs in natural language processing.

In conclusion, the paper advances information extraction by enabling richer interaction between related IE tasks through shared, iteratively refined span representations, setting a new benchmark for performance in this area.