FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition (2208.11464v3)
Abstract: Few-shot Named Entity Recognition (NER) is imperative for entity tagging in low-resource domains and has thus received considerable attention in recent years. Existing approaches for few-shot NER are evaluated mainly under in-domain settings. In contrast, little is known about how these inherently faithful models perform in cross-domain NER using a few labeled in-domain examples. This paper proposes a two-step rationale-centric data augmentation method to improve the model's generalization ability. Results on several datasets show that our model-agnostic method significantly improves performance on cross-domain NER tasks compared with previous state-of-the-art methods, including data augmentation and prompt-tuning methods. Our code is available at https://github.com/lifan-yuan/FactMix.
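The abstract describes the two augmentation steps only at a high level, so a toy sketch may help make them concrete. The Python below is a minimal, hypothetical illustration of semi-fact generation for NER, not the released FactMix implementation: the toy data and the function names (`entity_pool`, `entity_level_mix`, `context_level_mix`) are our own, and the random substitution of non-entity words stands in for the pretrained masked-language-model replacement the method builds on.

```python
import random

# Toy CoNLL-style training data: (token, BIO tag) pairs. Purely illustrative.
TRAIN = [
    [("John", "B-PER"), ("lives", "O"), ("in", "O"), ("Paris", "B-LOC")],
    [("Mary", "B-PER"), ("works", "O"), ("in", "O"), ("Berlin", "B-LOC")],
]

def entity_pool(dataset):
    """Collect the entity tokens seen for each label, so entities of the
    same type can be swapped across sentences."""
    pool = {}
    for sent in dataset:
        for tok, tag in sent:
            if tag != "O":
                pool.setdefault(tag, set()).add(tok)
    return {tag: sorted(toks) for tag, toks in pool.items()}

def entity_level_mix(sent, pool, rng):
    """Entity-level semi-fact: swap each entity token for another token with
    the same label from the training pool, leaving the tags untouched.
    (Token-level for brevity; multi-token mentions would be swapped as spans.)"""
    out = []
    for tok, tag in sent:
        if tag != "O" and len(pool.get(tag, [])) > 1:
            tok = rng.choice([t for t in pool[tag] if t != tok])
        out.append((tok, tag))
    return out

def context_level_mix(sent, o_vocab, rng, p=0.3):
    """Context-level semi-fact: perturb non-entity (O-tagged) words while
    keeping every entity intact. A random O-token substitution stands in for
    the masked-language-model prediction used in the actual method."""
    out = []
    for tok, tag in sent:
        if tag == "O" and rng.random() < p:
            tok = rng.choice(o_vocab)
        out.append((tok, tag))
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    pool = entity_pool(TRAIN)
    o_vocab = sorted({tok for sent in TRAIN for tok, tag in sent if tag == "O"})
    for sent in TRAIN:
        print(context_level_mix(entity_level_mix(sent, pool, rng), o_vocab, rng))
```

Note that the augmented sentences keep their original label sequences, which is what makes this style of augmentation model-agnostic: any token-level NER model can consume the mixed data unchanged.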
Authors: Linyi Yang, Lifan Yuan, Leyang Cui, Wenyang Gao, Yue Zhang