Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation (1601.01343v4)

Published 6 Jan 2016 in cs.CL

Abstract: Named Entity Disambiguation (NED) refers to the task of resolving multiple named entity mentions in a document to their correct references in a knowledge base (KB) (e.g., Wikipedia). In this paper, we propose a novel embedding method specifically designed for NED. The proposed method jointly maps words and entities into the same continuous vector space. We extend the skip-gram model by using two models. The KB graph model learns the relatedness of entities using the link structure of the KB, whereas the anchor context model aims to align vectors such that similar words and entities occur close to one another in the vector space by leveraging KB anchors and their context words. By combining contexts based on the proposed embedding with standard NED features, we achieved state-of-the-art accuracy of 93.1% on the standard CoNLL dataset and 85.2% on the TAC 2010 dataset.

Authors (4)

Ikuya Yamada
Hiroyuki Shindo
Hideaki Takeda
Yoshiyasu Takefuji

Summary

The paper introduces a method that jointly learns word and entity embeddings in a single vector space for Named Entity Disambiguation.
It extends the skip-gram model with a KB graph model, which captures relatedness between entities, and an anchor context model, which aligns entities with the words around their anchors.
Combined with standard NED features, the embeddings reach state-of-the-art accuracy of 93.1% on the CoNLL dataset and 85.2% on the TAC 2010 dataset.

Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation

The paper, authored by Ikuya Yamada, Hiroyuki Shindo, Hideaki Takeda, and Yoshiyasu Takefuji, presents a method for Named Entity Disambiguation (NED) that learns embeddings for words and entities at the same time. NED is the task of resolving each named entity mention in a document to its correct entry in a knowledge base (KB) such as Wikipedia. The proposed embedding maps words and entities into one continuous vector space, so that the context of an ambiguous mention can be compared directly with its candidate entities.
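To illustrate why a shared space helps, here is a minimal sketch of one simple way to score candidates: average the vectors of the context words and rank candidate entities by cosine similarity. This is an illustration under stated assumptions, not the paper's full system, which combines embedding-based signals with standard NED features. The function names and the tiny hand-made vectors are hypothetical.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_candidates(context_words, candidates, word_vecs, entity_vecs):
    """Rank candidate entities by similarity to the averaged context vector."""
    known = [word_vecs[w] for w in context_words if w in word_vecs]
    if not known:
        # No usable context: return candidates unranked with a neutral score.
        return [(c, 0.0) for c in candidates]
    context_vec = np.mean(known, axis=0)
    scored = [(c, cosine(context_vec, entity_vecs[c])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors (hypothetical values, not learned embeddings).
word_vecs = {
    "code": np.array([1.0, 0.1]),
    "compiler": np.array([0.9, 0.2]),
    "volcano": np.array([0.0, 1.0]),
}
entity_vecs = {
    "Java (programming language)": np.array([1.0, 0.0]),
    "Java (island)": np.array([0.1, 1.0]),
}

ranking = rank_candidates(
    ["code", "compiler"],
    ["Java (island)", "Java (programming language)"],
    word_vecs,
    entity_vecs,
)
print(ranking[0][0])
```

With these toy vectors, the mention "Java" in a context about code and compilers ranks the programming language first, because its entity vector points in nearly the same direction as the averaged context vector.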

Methodology

The method extends the skip-gram model, originally designed for word embeddings, with two additional models:

KB Graph Model: This model places entities that are related according to the KB's link structure close to one another in the vector space. It is motivated by the Wikipedia Link-based Measure (WLM), a standard relatedness measure that treats entities sharing many incoming links as related.
Anchor Context Model: This model uses Wikipedia anchors and the words surrounding them to align the two kinds of vectors, so that an entity ends up near the words that typically appear in its context. Because words and entities then share one space, similarity between a mention's context and a candidate entity can be measured directly.
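The Wikipedia Link-based Measure (WLM) named in the KB graph model description can be sketched as follows. This assumes the commonly cited Milne and Witten formulation, in which each entity is represented by the set of entities that link to it. The function name `wlm_relatedness`, the clamping to zero, and the toy link sets are illustrative assumptions rather than details from the paper.

```python
import math

def wlm_relatedness(inlinks_a, inlinks_b, num_entities):
    """Wikipedia Link-based Measure between two entities.

    inlinks_a, inlinks_b: sets of entity IDs that link to each entity.
    num_entities: total number of entities in the KB.
    Higher scores mean more related; 0.0 when no in-links are shared.
    """
    shared = len(inlinks_a & inlinks_b)
    if shared == 0:
        return 0.0
    larger = max(len(inlinks_a), len(inlinks_b))
    smaller = min(len(inlinks_a), len(inlinks_b))
    denominator = math.log(num_entities) - math.log(smaller)
    if denominator == 0:
        # Both sets cover the whole KB, so they are identical.
        return 1.0
    distance = (math.log(larger) - math.log(shared)) / denominator
    # Clamping at zero is an assumption made here to keep scores in [0, 1].
    return max(0.0, 1.0 - distance)

# Toy example: two entities sharing half of their incoming links.
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
score = wlm_relatedness(a, b, num_entities=100)
print(round(score, 4))
```

The KB graph model does not compute this measure at query time. The sketch only shows the intuition the model builds on: entities with heavily overlapping incoming links should be close.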

The skip-gram model and the two additional models are trained jointly. Training uses negative sampling, which avoids normalizing over the full vocabulary of words and entities and keeps learning tractable on a corpus the size of Wikipedia.
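To make the training signal concrete, here is a minimal NumPy sketch of the standard skip-gram negative-sampling loss for a single (target, context) pair. It assumes that the same loss form applies to word-word, entity-entity, and anchor-word pairs and that the joint objective is the sum of the per-pair losses. The names `sgns_loss` and `joint_loss` are hypothetical, and no gradient step is shown.

```python
import numpy as np

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x)) = -log(1 + exp(-x))."""
    return -np.logaddexp(0.0, -x)

def sgns_loss(target_vec, context_vec, negative_vecs):
    """Negative-sampling loss for one (target, context) pair.

    target_vec:    (d,) vector of the word or entity being trained.
    context_vec:   (d,) vector of the observed context item.
    negative_vecs: (k, d) vectors of k randomly sampled negative items.
    """
    positive_term = log_sigmoid(context_vec @ target_vec)
    negative_term = log_sigmoid(-(negative_vecs @ target_vec)).sum()
    return float(-(positive_term + negative_term))

def joint_loss(pairs):
    """Sum the loss over pairs drawn from all component models."""
    return sum(sgns_loss(t, c, negs) for t, c, negs in pairs)

rng = np.random.default_rng(0)
dim, k = 8, 5
# One toy pair per component model, filled with random vectors.
pairs = [
    (rng.normal(size=dim), rng.normal(size=dim), rng.normal(size=(k, dim)))
    for _ in range(3)
]
print(joint_loss(pairs) > 0)
```

Each pair touches only one positive and k negative vectors, which is why the cost per update does not grow with the number of words or entities.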

Experimental Results

The paper reports state-of-the-art accuracy on two standard NED benchmarks: 93.1% on CoNLL and 85.2% on TAC 2010. A separate experiment on a benchmark entity relatedness dataset, scored with NDCG and MAP, found that the learned embeddings outperform standard relatedness methods, which supports their use for modeling coherence among entities.
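For readers unfamiliar with the two ranking metrics, the sketch below computes NDCG@k and average precision for a single ranked list; MAP is the mean of average precision over all queries. It uses one common NDCG variant (linear gain with a log2 discount), which may differ from the exact variant used in the paper's evaluation. The relevance labels are made up for illustration.

```python
import math

def ndcg_at_k(relevances, k):
    """NDCG@k for relevance labels listed in ranked order."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0

def average_precision(relevances):
    """Average precision for binary relevance labels in ranked order."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(ranked_lists):
    """MAP: mean of average precision over several queries."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)

# Hypothetical labels: 1 = related entity, 0 = unrelated, in ranked order.
query_a = [1, 0, 1, 0]
query_b = [1, 1, 0, 0]
print(ndcg_at_k(query_b, 4), average_precision(query_a))
```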

Implications and Future Work

The approach gives NED systems a single representation that covers both the textual context of a mention and the coherence among the entities in a document. A joint embedding of this kind is also a natural fit for large-scale text mining, semantic search, and information extraction. The same idea may extend to related tasks such as co-reference resolution and knowledge base population.

Future work involves exploiting additional hierarchical and relational knowledge from other knowledge graphs, such as Freebase, to further refine the embeddings. The joint vector representation could also be applied to semantic tasks beyond NED.

Overall, the work shows that placing words and entities in one vector space gives a disambiguation system a direct measure of both contextual fit and entity relatedness.
