Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning
This paper presents a novel approach to relation extraction (RE) by treating it as an open-book examination, enhancing pre-trained language models (PLMs) with retrieval-based prompt tuning. The authors introduce RetrievalRE, a semi-parametric paradigm of retrieval-enhanced prompt tuning that aims to improve the model's ability to generalize, especially on long-tailed or hard examples.
Methodology
RetrievalRE departs from conventional parametric approaches, which resemble closed-book tests, by incorporating a retrieval mechanism that accesses an external datastore. This open-book datastore comprises prompt-based instance representations paired with relation labels.
Prompt Tuning
The paper builds on current prompt tuning methodology: RE is reformulated as a cloze-style task using task-specific templates and verbalizers. The PLM fills in the masked token of the template based on the input text, and the verbalizer maps the predicted label word to a relation label.
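As a concrete illustration, here is a minimal sketch of cloze-style prompting with a masked language model. The template, label words, and relation labels are hypothetical placeholders, not the paper's exact choices:

```python
# Minimal sketch of cloze-style prompt tuning for RE; template, label words,
# and relation labels below are illustrative, not the paper's exact choices.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = "Steve Jobs founded Apple in 1976."
subj, obj = "Steve Jobs", "Apple"

# Task-specific template: the PLM fills the masked slot between the entities.
prompt = f"{text} {subj} {tokenizer.mask_token} {obj}."

# Verbalizer: maps candidate label words to relation labels (hypothetical).
verbalizer = {"founded": "org:founded_by", "owns": "org:owner_of"}

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Score each label word at the [MASK] position and pick the best relation.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
label_ids = tokenizer.convert_tokens_to_ids(list(verbalizer))
probs = logits[0, mask_pos, label_ids].softmax(dim=-1)
predicted = list(verbalizer.values())[probs.argmax().item()]
print(predicted)
```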
Open-book Datastore and Retrieval
A key innovation is the open-book datastore, constructed from the prompt-based instance representations of the training samples. During inference, RetrievalRE queries this datastore to retrieve the k nearest neighbors of the input's representation, thereby enriching the decision-making process.
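The sketch below shows one way such a datastore and neighbor distribution might be built. It assumes a function `encode` that returns the [MASK]-position hidden state of a prompted example; all names and defaults are illustrative:

```python
# Sketch of the open-book datastore and the k-nearest-neighbor distribution.
# encode(x) is assumed to return the [MASK]-position representation (a 1-D
# numpy vector) of a prompted example; k and temperature are illustrative.
import numpy as np

def build_datastore(train_examples, encode):
    # Keys: prompt-based instance representations; values: relation label ids.
    keys = np.stack([encode(x) for x, _ in train_examples])  # shape (N, d)
    values = np.array([y for _, y in train_examples])        # shape (N,)
    return keys, values

def knn_distribution(query, keys, values, num_labels, k=16, temperature=1.0):
    # Retrieve the k nearest training instances by L2 distance.
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Turn negative distances into weights and aggregate them per label.
    weights = np.exp(-dists[nearest] / temperature)
    p_knn = np.zeros(num_labels)
    for idx, w in zip(nearest, weights):
        p_knn[values[idx]] += w
    return p_knn / p_knn.sum()
```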
The approach integrates the retrieved instances by interpolating the PLM's output distribution with a non-parametric nearest-neighbor distribution. The interpolation is controlled by a hyperparameter that balances the learned parameters against the externally retrieved knowledge.
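Concretely, the final prediction takes a kNN-LM-style form, p(y|x) = λ · p_kNN(y|x) + (1 − λ) · p_PLM(y|x). A one-line sketch, where the default value of λ is an assumption for illustration:

```python
# Interpolate the PLM's distribution with the retrieved neighbor distribution.
# lam = 0 recovers standard prompt tuning; lam = 1 is pure retrieval.
def interpolate(p_plm, p_knn, lam=0.2):
    return lam * p_knn + (1 - lam) * p_plm
```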
Experimental Results
RetrievalRE demonstrates substantial improvements over baseline RE models across benchmark datasets, including SemEval and TACRED, achieving state-of-the-art performance in both standard supervised and few-shot settings. In low-resource settings, RetrievalRE significantly outperforms existing models, delivering higher F1 scores with lower variance. This suggests robustness to data scarcity, which is crucial for handling rare patterns in RE tasks.
Analysis and Implications
The proposed method shows that retrieval-augmented prompt tuning can robustly handle scarce data and hard instances, reducing reliance on memorization within the model parameters alone. The approach could also reshape training strategies, since the datastore offers a way to continuously update a model's knowledge without retraining from scratch.
While retrieval introduces additional computational overhead at inference time, efficient similarity-search libraries such as FAISS keep this overhead at acceptable levels.
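For instance, even an exact flat FAISS index makes nearest-neighbor search fast at typical datastore sizes; the dimension, datastore size, and random vectors below are stand-ins for real prompt-based representations:

```python
# Sketch of fast nearest-neighbor search with FAISS; dimension, datastore
# size, and random vectors are stand-ins for real prompt representations.
import faiss
import numpy as np

d = 768                                             # embedding dimension
keys = np.random.rand(10_000, d).astype("float32")  # datastore keys (stand-in)

index = faiss.IndexFlatL2(d)  # exact L2 search; IVF/HNSW trade recall for speed
index.add(keys)

query = np.random.rand(1, d).astype("float32")
distances, neighbors = index.search(query, 16)      # top-16 nearest neighbors
```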
Future Directions
The paper paves the way for methodologies incorporating explicit memory in PLMs through semi-parametric approaches. This has broad implications for enhancing AI systems, making them more adaptable and less reliant on extensive retraining when new data becomes available. Future research could focus on diversifying the composition of datastores and optimizing retrieval strategies to further enhance performance across various NLP tasks.