Multi-BERT: Leveraging Adapters and Prompt Tuning for Low-Resource Multi-Domain Adaptation (2404.02335v1)

Published 2 Apr 2024 in cs.CL and cs.AI

Abstract: The rapid growth in the volume and diversity of texts presents formidable challenges in multi-domain settings, and these challenges are clearly visible in Persian named entity recognition (NER). Traditional approaches, whether employing a unified model across multiple domains or an individual model per domain, suffer from significant limitations: a single model often struggles to capture the nuances of diverse domains, while maintaining multiple large models leads to resource constraints, rendering the training of a model for each domain virtually impractical. This paper therefore introduces a novel approach composed of one core model with multiple sets of domain-specific parameters. Using techniques such as prompt tuning and adapters, combined with additional layers, we add parameters that can be trained for specific domains, enabling the model to perform comparably to an individual model per domain. Experimental results on formal and informal datasets show that, with these added parameters, the proposed model significantly surpasses existing practical models. Remarkably, the proposed model requires only one instance for training and storage, yet achieves strong results across all domains, even surpassing the state of the art in some. Moreover, we analyze each adaptation strategy, delineating its strengths, weaknesses, and optimal hyper-parameters for Persian NER. Finally, we introduce a document-based domain detection pipeline for scenarios where the text domain is unknown, enhancing the adaptability and practicality of this work in real-world applications.
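
The paper's implementation is not reproduced on this page, so the following is a minimal PyTorch sketch of the architecture the abstract describes: one frozen core encoder shared across domains, plus per-domain soft prompts (prompt tuning), bottleneck adapters, and token-classification heads, so that only the small domain-specific parameter sets are trained and stored. All names here (`Adapter`, `MultiDomainNER`, `bottleneck`, `prompt_len`) and the toy encoder are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    residual connection. Trained per domain while the backbone is frozen."""

    def __init__(self, hidden_dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))


class MultiDomainNER(nn.Module):
    """One frozen core encoder plus per-domain soft prompts, adapters,
    and label heads; only the per-domain parameters receive gradients."""

    def __init__(self, encoder, hidden_dim, num_labels, domains, prompt_len=16):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # the single core model stays frozen
        self.prompts = nn.ParameterDict({
            d: nn.Parameter(torch.randn(prompt_len, hidden_dim) * 0.02)
            for d in domains})
        self.adapters = nn.ModuleDict({d: Adapter(hidden_dim) for d in domains})
        self.heads = nn.ModuleDict({d: nn.Linear(hidden_dim, num_labels)
                                    for d in domains})
        self.prompt_len = prompt_len

    def forward(self, embeds, domain):
        # Prepend the domain's soft prompt to the token embeddings.
        prompt = self.prompts[domain].unsqueeze(0).expand(embeds.size(0), -1, -1)
        h = self.encoder(torch.cat([prompt, embeds], dim=1))
        h = self.adapters[domain](h)[:, self.prompt_len:]  # drop prompt positions
        return self.heads[domain](h)  # per-token NER label logits


if __name__ == "__main__":
    # Toy stand-in for a BERT-style encoder (ParsBERT in the paper).
    layer = nn.TransformerEncoderLayer(256, 4, batch_first=True)
    enc = nn.TransformerEncoder(layer, num_layers=2)
    model = MultiDomainNER(enc, hidden_dim=256, num_labels=7,
                           domains=["formal", "twitter"])
    logits = model(torch.randn(2, 32, 256), domain="twitter")
    print(logits.shape)  # torch.Size([2, 32, 7])
```

Under these assumptions, each new domain adds only the prompt (prompt_len x hidden_dim), two small adapter projections, and a label head, a tiny fraction of the core model's parameters, which is the storage argument the abstract makes for keeping a single trainable instance.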

