
Recall, Expand and Multi-Candidate Cross-Encode: Fast and Accurate Ultra-Fine Entity Typing (2212.09125v1)

Published 18 Dec 2022 in cs.CL and cs.AI

Abstract: Ultra-fine entity typing (UFET) predicts extremely free-formed types (e.g., president, politician) of a given entity mention (e.g., Joe Biden) in context. State-of-the-art (SOTA) methods use the cross-encoder (CE) based architecture. CE concatenates the mention (and its context) with each type and feeds the pairs into a pretrained language model (PLM) to score their relevance. It brings deeper interaction between mention and types to reach better performance but has to perform N (type set size) forward passes to infer types of a single mention. CE is therefore very slow in inference when the type set is large (e.g., N = 10k for UFET). To this end, we propose to perform entity typing in a recall-expand-filter manner. The recall and expand stages prune the large type set and generate K (K is typically less than 256) most relevant type candidates for each mention. At the filter stage, we use a novel model called MCCE to concurrently encode and score these K candidates in only one forward pass to obtain the final type prediction. We investigate different variants of MCCE and extensive experiments show that MCCE under our paradigm reaches SOTA performance on ultra-fine entity typing and is thousands of times faster than the cross-encoder. We also found MCCE is very effective in fine-grained (130 types) and coarse-grained (9 types) entity typing. Our code is available at \url{https://github.com/modelscope/AdaSeq/tree/master/examples/MCCE}.

Overview of "Recall, Expand and Multi-Candidate Cross-Encode: Fast and Accurate Ultra-Fine Entity Typing"

The paper "Recall, Expand and Multi-Candidate Cross-Encode: Fast and Accurate Ultra-Fine Entity Typing" addresses the challenges associated with ultra-fine entity typing (UFET) by introducing a new paradigm termed Recall-Expand-Filter. The methodology aims to improve both the efficiency and accuracy of UFET, which involves predicting detailed entity types in a given context.

Methodology

The proposed approach is divided into three stages: Recall, Expand, and Filter.

  1. Recall Stage: This phase utilizes a multi-label classification (MLC) model to efficiently prune a large type set, reducing potential candidates to a manageable number. The authors advocate for the MLC model over traditional methods such as BM25, demonstrating improved recall rates in their experiments.
  2. Expand Stage: Acknowledging the MLC model’s limitations in recalling rare or unseen candidates, the paper introduces two methods to enhance candidate diversity: exact lexical matching and weak supervision from masked language models (MLM). This stage significantly bolsters the recall rate without extensive computational cost.
  3. Filter Stage: To finalize the candidate selection, the paper proposes a novel model—Multi-Candidate Cross-Encoder (MCCE). Unlike traditional cross-encoders which process each type individually, MCCE concurrently encodes multiple type candidates in a single forward pass. Variants of MCCE are explored with different input formats and attention mechanisms to optimize performance.
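The three stages above can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the toy scoring functions stand in for the PLM-based MLC and MCCE models, and all names here are hypothetical. The point is the control flow: a cheap scorer prunes the full type set to K candidates, lexical matching expands them, and a single joint call scores all candidates at once (mirroring MCCE's one forward pass).

```python
# Illustrative sketch of the recall-expand-filter pipeline.
# Toy scorers stand in for the real PLM-based MLC and MCCE models.

def recall(mention, type_set, cheap_score, k):
    """Recall stage: a cheap multi-label scorer (stand-in for the
    MLC model) ranks the full type set and keeps the top-k types."""
    ranked = sorted(type_set, key=lambda t: cheap_score(mention, t),
                    reverse=True)
    return ranked[:k]

def expand(mention, type_set, candidates):
    """Expand stage: add types that lexically match the mention,
    recovering rare types the cheap scorer may have missed."""
    lexical = sorted(t for t in type_set if t.lower() in mention.lower())
    return list(dict.fromkeys(candidates + lexical))  # dedupe, keep order

def filter_stage(mention, candidates, joint_score, threshold=0.5):
    """Filter stage: score all K candidates in ONE call (the MCCE
    idea: one forward pass instead of K) and keep those above threshold."""
    scores = joint_score(mention, candidates)  # one call for all candidates
    return [t for t, s in zip(candidates, scores) if s >= threshold]

# Toy scorers (hypothetical, for demonstration only).
def toy_cheap_score(mention, t):
    return sum(w in mention.lower() for w in t.split("_"))

def toy_joint_score(mention, candidates):
    return [1.0 if t.lower() in mention.lower() else 0.1 for t in candidates]

type_set = ["politician", "president", "athlete", "city", "musician"]
mention = "Joe Biden, the US president and a career politician"
cands = recall(mention, type_set, toy_cheap_score, k=3)
cands = expand(mention, type_set, cands)
print(filter_stage(mention, cands, toy_joint_score))
# ['politician', 'president']
```

In the real system the recall threshold K (under 256 candidates) is chosen so that all candidate types fit into the input of a single MCCE forward pass.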

Numerical Results and Analysis

The research presents comprehensive experiments on two ultra-fine entity typing datasets, UFET and CFET, showing that MCCE under the Recall-Expand-Filter framework achieves state-of-the-art performance with significantly faster inference than previous cross-encoder approaches such as LITE. Specifically, MCCE models demonstrate a drastic improvement in processing speed while maintaining or surpassing accuracy benchmarks. The inference speed of MCCE is comparable to that of simpler MLC models and orders of magnitude faster than conventional cross-encoders.
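The "thousands of times faster" claim follows from a back-of-envelope count of PLM forward passes per mention. The numbers below are illustrative (the UFET type set is on the order of 10k types, per the abstract), not measured throughputs:

```python
# Back-of-envelope forward-pass count per mention (illustrative).
N = 10_000        # type set size, order of magnitude for UFET
ce_passes = N     # cross-encoder: one PLM pass per (mention, type) pair
ref_passes = 2    # pipeline: one MLC recall pass + one MCCE filter pass

print(f"forward passes per mention: CE={ce_passes}, pipeline={ref_passes}")
print(f"reduction factor ~ {ce_passes // ref_passes}x")
# forward passes per mention: CE=10000, pipeline=2
# reduction factor ~ 5000x
```

Actual wall-clock speedup also depends on sequence lengths (the MCCE input grows with K), but the pass count explains why inference speed ends up close to that of a plain MLC model.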

Theoretical Implications and Future Directions

The paradigm shift introduced by the recall-expand-filter approach and the development of MCCE provide a systematic enhancement in ultra-fine entity typing tasks. The techniques employed, particularly the concurrent encoding and expanded candidate generation, offer a blueprint for efficiency improvements, potentially influencing related areas such as information retrieval and entity linking.

Future research might explore further optimization of attention mechanisms in MCCE or adaptations of the proposed approach to broader NLP tasks. Additionally, investigating how different pretrained models influence performance can extend the applicability and robustness of the methods outlined.

Conclusion

This paper's contribution lies in its innovative restructuring of the entity typing process, which not only accelerates inference but also maintains a high level of accuracy. By efficiently decoupling candidate selection from candidate scoring, this work sets a new standard for handling extensive type sets in ultra-fine entity typing, combining theoretical insight with practical efficiency gains.

Authors (6)
  1. Chengyue Jiang
  2. Wenyang Hui
  3. Yong Jiang
  4. Xiaobin Wang
  5. Pengjun Xie
  6. Kewei Tu
Citations (4)