- The paper introduces entity triggers as explanatory cues that cut annotation costs: models trained on only 20% of trigger-annotated data match the performance of models trained on 70% of traditionally labeled data.
- It presents a novel Trigger Matching Network that encodes trigger phrases and uses self-attention for semantic alignment between labeled triggers and new sentences.
- Experimental evaluations on the CoNLL2003 and BC5CDR datasets show improved data efficiency and model interpretability, making the approach promising for low-resource NER applications.
Analyzing TriggerNER: Entity Triggers for Named Entity Recognition
The paper "TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition" addresses the challenge of reducing the resource-intensive process of annotating datasets for Named Entity Recognition (NER) tasks. It presents an innovative approach by introducing "entity triggers," which are groups of words that serve as explanatory annotations enhancing the learning efficiency of NER models.
Key Contributions and Methodology
The paper distinguishes its approach by crowd-sourcing annotations of entity triggers alongside traditional NER annotations. This dual annotation methodology addresses a central question: how can effective NER models be trained from fewer human-labeled instances? The key intuition is that triggers act as cues much like the words or phrases by which humans naturally recognize entities in context. For instance, in the sentence "He enjoyed his dinner at Olive Garden," the phrase "dinner at" cues the identification of "Olive Garden" as a restaurant.
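As a concrete illustration, here is one way such a trigger-annotated instance could be represented in Python. The field names (`tokens`, `tags`, `triggers`) and the `REST` tag are illustrative assumptions, not the paper's released data format:

```python
# One trigger-annotated training instance (field names are illustrative
# assumptions, not the paper's actual data format).
example = {
    "tokens": ["He", "enjoyed", "his", "dinner", "at", "Olive", "Garden"],
    # BIO tags marking "Olive Garden" as a restaurant entity
    "tags": ["O", "O", "O", "O", "O", "B-REST", "I-REST"],
    # Entity trigger: token indices of the cue phrase "dinner at",
    # linked to the entity span it explains
    "triggers": [
        {"indices": [3, 4], "phrase": "dinner at", "entity": "Olive Garden"},
    ],
}
```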
A crucial component of the paper is the Trigger Matching Network (TMN). TMN encodes trigger phrases and applies a semantic matching mechanism between the encoded triggers and new, unlabeled sentences. The matching relies on self-attention, yielding a framework for NER that is both more interpretable and more cost-effective; a minimal sketch of the underlying pooling-and-matching mechanism follows.
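The sketch below shows the general idea of self-attentive pooling into a shared space plus a similarity score, not the authors' exact architecture; `AttentivePool` and `match_score` are names introduced here for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentivePool(nn.Module):
    """Self-attentive pooling: collapses a sequence of hidden states into
    one vector, so triggers and sentences live in the same embedding space."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (seq_len, hidden_dim), e.g. BiLSTM outputs
        weights = F.softmax(self.score(hidden), dim=0)  # (seq_len, 1)
        return (weights * hidden).sum(dim=0)            # (hidden_dim,)

def match_score(sent_vec: torch.Tensor, trig_vec: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between a pooled sentence and a pooled trigger."""
    return F.cosine_similarity(sent_vec, trig_vec, dim=0)

# Usage: score how well a trigger matches a sentence.
torch.manual_seed(0)
pool = AttentivePool(hidden_dim=8)
sent_hidden = torch.randn(7, 8)   # hidden states for a 7-token sentence
trig_hidden = sent_hidden[3:5]    # hidden states of the trigger tokens
print(match_score(pool(sent_hidden), pool(trig_hidden)))
```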
Key components of the TMN include:
- Trigger Encoder: Learns representations of triggers based on their association with specific entity types, providing insight into how certain phrases correspond to certain entity categorizations.
- Semantic Trigger Matching Module (TrigMatcher): Uses self-attention to embed triggers and sentences in a shared space and matches triggers from the labeled data against unseen sentences at inference time. This enriches the sentence representation and lets the model generalize entity recognition beyond its training examples (see the loss sketch after this list).
- Sequence Tagger: Incorporates the matched trigger representations through an attention mechanism over the sentence encoding, improving entity recognition across the token sequence.
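Training the TrigMatcher amounts to pulling matched trigger-sentence pairs together and pushing mismatched pairs apart. Below is a minimal sketch of such a contrastive objective over pooled vectors as produced above; the function name and margin value are assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def contrastive_match_loss(sent_vec: torch.Tensor,
                           trig_vec: torch.Tensor,
                           matched: torch.Tensor,
                           margin: float = 0.5) -> torch.Tensor:
    """Pull matched trigger/sentence pairs together; push mismatched
    pairs at least `margin` apart (margin value is a placeholder).

    sent_vec, trig_vec: (batch, dim) pooled vectors
    matched: (batch,) float tensor, 1.0 for true pairs, 0.0 for negatives
    """
    dist = F.pairwise_distance(sent_vec, trig_vec)        # (batch,)
    pos = matched * dist.pow(2)                           # attract true pairs
    neg = (1.0 - matched) * F.relu(margin - dist).pow(2)  # repel negatives
    return (pos + neg).mean()
```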
The TMN framework is trained in two stages: the trigger encoder and TrigMatcher are first trained jointly, after which the sequence tagger is trained with trigger representations supplied through attention. A schematic training loop is sketched below.
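This sketch only mirrors the two-stage schedule described above; the `model` methods `trigger_loss` and `tagging_loss` are hypothetical placeholders, not the authors' API:

```python
import torch

def train_tmn(model, trigger_batches, tagger_batches,
              epochs: int = 5, lr: float = 1e-3):
    """Schematic two-stage training schedule for a TMN-style model.
    `trigger_loss(batch)` (trigger classification + matching) and
    `tagging_loss(batch)` (sequence-labeling loss) are assumed names."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)

    # Stage 1: jointly learn trigger representations and matching.
    for _ in range(epochs):
        for batch in trigger_batches:
            loss = model.trigger_loss(batch)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Stage 2: train the sequence tagger, attending over the learned
    # trigger representations.
    for _ in range(epochs):
        for batch in tagger_batches:
            loss = model.tagging_loss(batch)
            opt.zero_grad()
            loss.backward()
            opt.step()
```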
Experimental Results
The authors conduct empirical evaluations on two widely used NER datasets, CoNLL2003 and BC5CDR, covering the general and biomedical domains. Results indicate that using only 20% of the trigger-annotated data achieves performance comparable to using 70% of the traditionally annotated data, a notable gain in data efficiency that underscores the value of triggers in reducing dependence on large labeled datasets. Incorporating self-training yields further improvements.
Implications and Future Directions
This paper's approach is significant for both theoretical exploration and practical deployment of NER models in low-resource settings. By leveraging entity triggers, models can considerably reduce the annotation cost of traditional NER pipelines. The framework also improves interpretability, offering a window into the cues that underlie model predictions.
Future developments might include:
- Automating the extraction or suggestion of entity triggers.
- Extending trigger-based methods to multilingual and low-resource languages.
- Exploring combination methods with semi-supervised or transfer learning frameworks to further enhance efficiency.
In sum, this research makes a compelling argument for integrating explanatory triggers into NER to address the perennial challenge of annotation cost in language technology. Triggers, as human-provided cues, point toward more efficient ways to teach and model the contextual recognition of entities.