Formulating Few-shot Fine-tuning Towards Language Model Pre-training: A Pilot Study on Named Entity Recognition (2205.11799v2)

Published 24 May 2022 in cs.CL and cs.AI

Abstract: Fine-tuning pre-trained LLMs has recently become a common practice in building NLP models for various tasks, especially few-shot tasks. We argue that under the few-shot setting, formulating fine-tuning closer to the pre-training objectives shall be able to unleash more benefits from the pre-trained LLMs. In this work, we take few-shot named entity recognition (NER) for a pilot study, where existing fine-tuning strategies are much different from pre-training. We propose a novel few-shot fine-tuning framework for NER, FFF-NER. Specifically, we introduce three new types of tokens, "is-entity", "which-type" and bracket, so we can formulate the NER fine-tuning as (masked) token prediction or generation, depending on the choice of pre-trained LLMs. In our experiments, we apply FFF-NER to fine-tune both BERT and BART for few-shot NER on several benchmark datasets and observe significant improvements over existing fine-tuning strategies, including sequence labeling, prototype meta-learning, and prompt-based approaches. We further perform a series of ablation studies, showing few-shot NER performance is strongly correlated with the similarity between fine-tuning and pre-training.

PDF Abstract

Summarize Bookmark Chat (Pro)

Authors (4)

Zihan Wang (181 papers)
Kewen Zhao (3 papers)
Zilong Wang (99 papers)
Jingbo Shang (141 papers)

Citations (5)

View on Semantic Scholar

Formulating Few-shot Fine-tuning Towards Language Model Pre-training: A Pilot Study on Named Entity Recognition (2205.11799v2)

Related Papers