One Model for All Domains: Collaborative Domain-Prefix Tuning for Cross-Domain NER (2301.10410v5)

Published 25 Jan 2023 in cs.CL, cs.AI, cs.DB, cs.IR, and cs.LG

Abstract: Cross-domain NER is a challenging task to address the low-resource problem in practical scenarios. Previous typical solutions mainly obtain a NER model by pre-trained language models (PLMs) with data from a rich-resource domain and adapt it to the target domain. Owing to the mismatch issue among entity types in different domains, previous approaches normally tune all parameters of PLMs, ending up with an entirely new NER model for each domain. Moreover, current models only focus on leveraging knowledge in one general source domain while failing to successfully transfer knowledge from multiple sources to the target. To address these issues, we introduce Collaborative Domain-Prefix Tuning for cross-domain NER (CP-NER) based on text-to-text generative PLMs. Specifically, we present text-to-text generation grounding domain-related instructors to transfer knowledge to new domain NER tasks without structural modifications. We utilize frozen PLMs and conduct collaborative domain-prefix tuning to stimulate the potential of PLMs to handle NER tasks across various domains. Experimental results on the Cross-NER benchmark show that the proposed approach has flexible transfer ability and performs better on both one-source and multiple-source cross-domain NER tasks. Codes are available in https://github.com/zjunlp/DeepKE/tree/main/example/ner/cross.

Citations (19)

Summary

  • The paper introduces a novel collaborative domain-prefix tuning approach that reformulates NER as a sequence-to-sequence task using T5.
  • Its methodology leverages dual-query domain selection for efficient transfer of domain-specific knowledge, leading to significant F1-score improvements on CrossNER benchmarks.
  • The approach reduces computational overhead by avoiding full model fine-tuning while maintaining flexibility for scalable, domain-agnostic NER applications.

Collaborative Domain-Prefix Tuning for Cross-Domain NER

The paper "One Model for All Domains: Collaborative Domain-Prefix Tuning for Cross-Domain NER" introduces a novel approach aimed at enhancing the performance of Named Entity Recognition (NER) systems across varying domains, especially focusing on low-resource settings. This approach is significant for addressing the challenges associated with the cross-domain NER task, such as the mismatch of entity types across different domains and the computational inefficiencies stemming from the necessity to fine-tune large pre-trained LLMs (PLMs) for each specific domain.

Methodology Overview

The proposed methodology combines a text-to-text generative PLM, specifically T5, with a novel domain-prefix tuning mechanism. The primary contributions center on a technique the authors call Collaborative Domain-Prefix Tuning.

  1. Text-to-Text Generation: The method reformulates NER as a sequence-to-sequence generation task, grounding the model with a domain-related instructor supplied as text input. This avoids any structural changes to the PLM while stimulating its ability to recognize named entities across domains (see the first sketch after this list).
  2. Domain-Prefix Tuning: Keeping the PLM frozen, the approach uses prefix-tuning to inject domain-specific knowledge. The prefixes act as domain controllers that modify the self-attention computation within the transformer layers, adapting the model to different domain requirements dynamically (the second sketch below illustrates this mechanism).
  3. Collaborative Knowledge Transfer: The methodology transfers learned knowledge from multiple source domains to a target domain. A dual-query domain selection process assesses label and prefix similarity to determine which source domains can benefit the target, and the prefix knowledge from the selected sources is then synthesized through an intrinsic decomposition method (the blending step is also shown in the second sketch below).
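
As a concrete illustration of the text-to-text formulation in the first item, the sketch below recasts a single NER example as a sequence-to-sequence training step with T5 via Hugging Face Transformers. The instructor wording, the output template, the entity labels, and the t5-base checkpoint are illustrative assumptions, not the paper's exact templates.

```python
# A minimal sketch, assuming a Hugging Face T5 setup, of NER recast as
# text-to-text generation; the prompt and output formats below are
# hypothetical, not the templates used by CP-NER.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Hypothetical domain-grounded instructor prepended to the input sentence.
source_text = (
    "domain: music. extract named entities: "
    "The Beatles released Abbey Road in 1969."
)
# Hypothetical target sequence the model learns to generate.
target_text = "The Beatles is a band. Abbey Road is an album. 1969 is a year."

inputs = tokenizer(source_text, return_tensors="pt")
labels = tokenizer(target_text, return_tensors="pt").input_ids

# A standard seq2seq training step: no structural changes to the PLM are needed.
loss = model(**inputs, labels=labels).loss
```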

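The next sketch illustrates the prefix mechanics from the second and third items: learned key/value prefixes are prepended inside a frozen self-attention layer so that every token also attends to them, and prefixes from several source domains are blended with similarity-derived weights to form a target-domain prefix. The single-head view, tensor shapes, domain names, and fixed weights are simplifying assumptions rather than the paper's exact formulation.

```python
# A simplified PyTorch sketch of prefix-tuning on a frozen attention layer,
# plus a hypothetical blending of source-domain prefixes into a target prefix.
import torch
import torch.nn.functional as F

d_model, prefix_len = 512, 16

# Frozen projections standing in for a pre-trained self-attention layer.
W_q = torch.nn.Linear(d_model, d_model).requires_grad_(False)
W_k = torch.nn.Linear(d_model, d_model).requires_grad_(False)
W_v = torch.nn.Linear(d_model, d_model).requires_grad_(False)

# Trainable key/value prefixes acting as "domain controllers" (one pair per domain).
source_prefixes = {
    name: (torch.nn.Parameter(torch.randn(prefix_len, d_model)),
           torch.nn.Parameter(torch.randn(prefix_len, d_model)))
    for name in ["news", "music", "science"]
}

def prefixed_attention(x, prefix_k, prefix_v):
    """Single-head view: prepend prefix keys/values so tokens attend to them."""
    q = W_q(x)
    k = torch.cat([prefix_k, W_k(x)], dim=0)   # prepend prefix keys
    v = torch.cat([prefix_v, W_v(x)], dim=0)   # prepend prefix values
    scores = q @ k.T / (d_model ** 0.5)
    return F.softmax(scores, dim=-1) @ v

# Hypothetical similarity-derived weights (e.g., from label/prefix similarity)
# used to blend source-domain prefixes into a prefix for the target domain.
weights = {"news": 0.5, "music": 0.3, "science": 0.2}
target_k = sum(w * source_prefixes[d][0] for d, w in weights.items())
target_v = sum(w * source_prefixes[d][1] for d, w in weights.items())

x = torch.randn(10, d_model)                      # ten token embeddings
out = prefixed_attention(x, target_k, target_v)   # (10, d_model)
```
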
Experimental Results

The experimental results presented in the paper demonstrate superior performance on the CrossNER benchmark compared to existing state-of-the-art methods. Notably, the model shows improved efficacy in transferring knowledge when multiple source domains are used, as evidenced by F1-score improvements across several target domains.

  • Flexibility: The approach supports knowledge transfer in both single-source and multi-source settings, showing its adaptability to diverse application scenarios.
  • Efficiency: Prefix-tuning requires far fewer trainable parameters and computational resources than full model fine-tuning, addressing a significant practical concern when deploying NER systems across many domains (see the sketch after this list).
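
To make the efficiency point concrete, the sketch below freezes a T5 backbone and counts only the per-layer prefix parameters that would be trained for one domain. The prefix length and per-layer sizing are hypothetical, assumed for illustration.

```python
# A rough sketch, assuming t5-base and a hypothetical prefix sizing, of why
# prefix-tuning is cheap: the backbone is frozen and only small per-layer
# key/value prefixes would be trained per domain.
import torch
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-base")
for p in model.parameters():
    p.requires_grad = False  # the backbone stays frozen for every domain

prefix_len = 16
d_model = model.config.d_model
n_layers = model.config.num_layers
# One trainable key/value prefix pair per encoder layer for a single domain.
prefixes = torch.nn.ParameterList(
    [torch.nn.Parameter(torch.randn(2, prefix_len, d_model)) for _ in range(n_layers)]
)

trainable = sum(p.numel() for p in prefixes)
total = sum(p.numel() for p in model.parameters())
print(f"trainable prefix parameters: {trainable:,} vs frozen backbone: {total:,}")
```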

Implications and Future Directions

The paper's findings have substantial implications for the development of domain-agnostic NER systems. By maintaining a singular model architecture across domains, the work aligns well with industrial applications where domain diversity presents a barrier to scalability.

From a theoretical perspective, the collaborative prefix tuning provides insights into optimal control mechanisms within PLMs, where the closed-loop control via prefix adjustments can effectively direct the model's outputs according to domain-specific requirements.

Future research could broaden the application scope by adapting this methodology to multilingual NER tasks and other information extraction challenges. Additionally, the concept of pluggable modules suggests potential for further advancements in creating more versatile and efficient generative models that can generalize across various knowledge domains.
