
Robust Few-Shot Named Entity Recognition with Boundary Discrimination and Correlation Purification (2312.07961v1)

Published 13 Dec 2023 in cs.CL and cs.AI

Abstract: Few-shot named entity recognition (NER) aims to recognize novel named entities in low-resource domains by utilizing existing knowledge. However, current few-shot NER models assume that the labeled data are clean, without noise or outliers, and few works focus on the robustness of cross-domain transfer learning to textual adversarial attacks in few-shot NER. In this work, we comprehensively explore and assess the robustness of few-shot NER models under textual adversarial attack scenarios and find that existing few-shot NER models are vulnerable. Furthermore, we propose a robust two-stage few-shot NER method with Boundary Discrimination and Correlation Purification (BDCP). Specifically, in the span detection stage, an entity boundary discriminative module is introduced to provide a highly distinguishing boundary representation space for detecting entity spans. In the entity typing stage, the correlations between entities and contexts are purified by minimizing interference information and facilitating correlation generalization, which alleviates the perturbations caused by textual adversarial attacks. In addition, we construct adversarial examples for few-shot NER based on the public datasets Few-NERD and Cross-Dataset. Comprehensive evaluations on these two groups of few-shot NER datasets containing adversarial examples demonstrate the robustness and superiority of the proposed method.
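To make the two-stage structure described in the abstract concrete, the following is a minimal sketch of a generic decomposed pipeline: spans are detected first, then each span is typed against class prototypes built from a few support examples. It is not the BDCP implementation. The boundary discriminative module and the correlation purification objective are not reproduced here, and the function names (`decode_spans`, `build_prototypes`, `type_spans`), the B/I/O tag scheme, the toy 2-dimensional embeddings, and the cosine nearest-prototype rule are all assumptions chosen for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def decode_spans(boundary_tags):
    """Stage 1 (toy stand-in): turn B/I/O tags into (start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(boundary_tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))  # close the previous span
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
        # An "I" tag extends the open span; a stray "I" with no open span is ignored.
    if start is not None:
        spans.append((start, len(boundary_tags)))
    return spans


def build_prototypes(support):
    """Average the support-span embeddings of each entity type into one prototype."""
    return {label: mean(vecs) for label, vecs in support.items()}


def type_spans(spans, token_vecs, prototypes):
    """Stage 2 (toy stand-in): label each span with its most similar prototype."""
    typed = []
    for start, end in spans:
        span_vec = mean(token_vecs[start:end])
        label = max(prototypes, key=lambda k: cosine(span_vec, prototypes[k]))
        typed.append((start, end, label))
    return typed


# Hypothetical toy inputs: 5 tokens with 2-dimensional embeddings.
tags = ["B", "I", "O", "O", "B"]
token_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 0.0], [0.0, 0.0], [0.1, 1.0]]
support = {"person": [[1.0, 0.0], [0.8, 0.2]], "location": [[0.0, 1.0]]}

prototypes = build_prototypes(support)
result = type_spans(decode_spans(tags), token_vecs, prototypes)
print(result)
```

In a real system the tags and embeddings would come from a trained encoder. Under an adversarial attack that perturbs context words, both the boundary tags and the span embeddings can shift, which is the failure mode the abstract's two modules are meant to address.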

Authors (4)
  1. Xiaojun Xue (2 papers)
  2. Chunxia Zhang (24 papers)
  3. Tianxiang Xu (1 paper)
  4. Zhendong Niu (10 papers)