Learning to Understand Phrases by Embedding the Dictionary
This paper, authored by Hill, Cho, Korhonen, and Bengio, presents a novel approach to a central challenge in computational semantics: representing phrases and sentences. Recent advances have produced distributional models that learn effective semantic representations of individual words, but extending these techniques from lexical to phrasal semantics has proved considerably harder.
Methodology
The authors propose using dictionary definitions as a bridge between lexical and phrasal semantics. The core idea is that definitions within dictionaries, such as "a tall, long-necked, spotted ruminant of Africa" for "giraffe", can be leveraged to learn phrase representations that correspond to single-word semantics. The paper introduces models built on neural language embedding architectures that exploit the structured nature of dictionary definitions to map phrases into semantic spaces.
Two primary types of neural language models (NLMs) are employed in this paper:
- Recurrent Neural Networks (RNNs), which preserve the sequential order of input words and use gated variants such as Long Short-Term Memory (LSTM) units to handle dependencies across longer phrases.
- Bag-of-Words (BOW) models, simpler architectures that focus on the collective semantic content of words within a phrase irrespective of their order.
Both models map definitions onto target embeddings pre-trained with the Word2Vec software; the paper also explores variants in which the input embeddings are either learned from the definitions themselves or likewise pre-trained.
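To make the setup concrete, the sketch below shows both encoder families and the training signal in present-day PyTorch (the original work used different tooling, so the module names, dimensions, and toy data here are illustrative assumptions, not the authors' code): a definition is encoded either by summing its word embeddings (BOW) or by reading it with an LSTM, and the result is trained to lie close to the pre-trained embedding of the defined word under a cosine loss.

```python
# Minimal sketch (assumed PyTorch implementation, not the paper's original code):
# map a dictionary definition to the pre-trained embedding of the word it defines.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, HID_DIM, VOCAB = 300, 512, 10000  # illustrative sizes

class BOWEncoder(nn.Module):
    """Order-insensitive encoder: sum the input word embeddings."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB_DIM)   # input embeddings (learned or pre-trained)
        self.proj = nn.Linear(EMB_DIM, EMB_DIM)

    def forward(self, token_ids):                 # token_ids: (batch, seq_len)
        return self.proj(self.emb(token_ids).sum(dim=1))

class LSTMEncoder(nn.Module):
    """Order-sensitive encoder: read the definition with an LSTM, keep the final state."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB_DIM)
        self.lstm = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True)
        self.proj = nn.Linear(HID_DIM, EMB_DIM)

    def forward(self, token_ids):
        _, (h_n, _) = self.lstm(self.emb(token_ids))
        return self.proj(h_n[-1])                 # (batch, EMB_DIM)

def cosine_loss(pred, target):
    """Push the encoded definition toward the (frozen) word2vec vector of the headword."""
    return (1.0 - F.cosine_similarity(pred, target, dim=-1)).mean()

# Illustrative training step with toy data; in practice token_ids come from
# tokenised dictionary definitions and target_vecs from pre-trained word2vec.
encoder = LSTMEncoder()                           # or BOWEncoder()
optimiser = torch.optim.Adam(encoder.parameters(), lr=1e-3)
token_ids = torch.randint(0, VOCAB, (32, 12))     # 32 definitions, 12 tokens each
target_vecs = torch.randn(32, EMB_DIM)            # stand-in for word2vec headword vectors
loss = cosine_loss(encoder(token_ids), target_vecs)
loss.backward()
optimiser.step()
```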
Applications and Results
The research demonstrates significant applications:
- Reverse Dictionaries: These models return candidate words for an input description or definition, matching or exceeding the commercial system OneLook.com on several measures. For descriptions of concepts not seen in training, the NLMs show comparable or better retrieval accuracy and greater consistency across queries, albeit with some variance in performance.
- Crossword Question Answering: The same definition-trained models answer general-knowledge crossword clues by mapping the clue into the semantic space and retrieving candidate answers of the required length, as sketched below. They perform particularly well on longer clues, where they outperform commercial crossword-solving engines.
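At query time, both applications reduce to nearest-neighbour search in the embedding space: the description or clue is encoded, and vocabulary words are ranked by cosine similarity to the result, with crossword candidates first filtered to the required answer length. The following sketch illustrates this retrieval step with toy vectors; the helper names and data are assumptions for illustration, not the paper's implementation.

```python
# Sketch of query-time retrieval (toy vectors; names are illustrative assumptions).
import numpy as np

def rank_candidates(query_vec, word_vecs, length=None):
    """Rank vocabulary words by cosine similarity to the encoded query.
    If `length` is given (crossword mode), keep only words of that length."""
    scores = {}
    for word, vec in word_vecs.items():
        if length is not None and len(word) != length:
            continue
        scores[word] = np.dot(query_vec, vec) / (np.linalg.norm(query_vec) * np.linalg.norm(vec))
    return sorted(scores, key=scores.get, reverse=True)

# Toy embedding table standing in for the pre-trained word2vec target space.
rng = np.random.default_rng(0)
word_vecs = {w: rng.standard_normal(300) for w in ["giraffe", "zebra", "valve", "memory"]}

# `query_vec` would be the output of the trained BOW or LSTM encoder above;
# here a random vector stands in for it.
query_vec = rng.standard_normal(300)
print(rank_candidates(query_vec, word_vecs))             # reverse-dictionary ranking
print(rank_candidates(query_vec, word_vecs, length=5))   # crossword clue with a 5-letter answer
```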
A noteworthy extension is the creation of cross-lingual reverse dictionaries by leveraging bilingual embeddings, exemplifying the flexibility and potential of the proposed models in multilingual settings.
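A minimal sketch of that idea, assuming the encoder output and the foreign-language word vectors share one bilingual embedding space (toy data and hypothetical names, not the paper's code): an encoded English description can then be matched directly against, say, French word vectors.

```python
# Cross-lingual sketch: with bilingual embeddings the definition encoder and the
# target vocabulary live in one shared space, so an English description can be
# matched directly against foreign-language word vectors (toy data, assumed names).
import numpy as np

rng = np.random.default_rng(1)
english_query_vec = rng.standard_normal(300)   # stand-in for the trained encoder's output
french_vecs = {w: rng.standard_normal(300) for w in ["girafe", "vanne", "mémoire"]}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

best = max(french_vecs, key=lambda w: cosine(english_query_vec, french_vecs[w]))
print(best)   # nearest French word to the encoded English description
```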
Implications and Future Directions
This approach marks a significant step toward neural models that represent meaning beyond the lexical level. Training NLMs on dictionary definitions shows promise not only for specific linguistic tools such as reverse dictionaries but also lays groundwork for more complex question-answering systems.
Future research could focus on several areas:
- Integrating question-like linguistic forms into training, potentially improving NLM performance in open-domain QA tasks.
- Exploring architectures that combine recurrent networks with external memory modules, in line with recent work on memory-augmented models.
- Investigating why BOW models remain effective despite ignoring word order, which could refine our understanding of semantic composition.
The paper makes substantial contributions to the field by demonstrating how everyday linguistic resources, like dictionaries, can be transformed into sophisticated tools for semantic learning in AI, paving the way for further exploration of phrase and sentence representation systems.