Noise-BERT: A Unified Perturbation-Robust Framework with Noise Alignment Pre-training for Noisy Slot Filling Task (2402.14494v3)

Published 22 Feb 2024 in cs.CL

Abstract: In a realistic dialogue system, the input information from users is often subject to various types of input perturbations, which affects the slot-filling task. Although rule-based data augmentation methods have achieved satisfactory results, they fail to exhibit the desired generalization when faced with unknown noise disturbances. In this study, we address the challenges posed by input perturbations in slot filling by proposing Noise-BERT, a unified Perturbation-Robust Framework with Noise Alignment Pre-training. Our framework incorporates two Noise Alignment Pre-training tasks: Slot Masked Prediction and Sentence Noisiness Discrimination, aiming to guide the pre-trained language model in capturing accurate slot information and noise distribution. During fine-tuning, we employ a contrastive learning loss to enhance the semantic representation of entities and labels. Additionally, we introduce an adversarial attack training strategy to improve the model's robustness. Experimental results demonstrate the superiority of our proposed approach over state-of-the-art models, and further analysis confirms its effectiveness and generalization ability.
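
To make the two Noise Alignment Pre-training objectives concrete, below is a minimal PyTorch sketch of how they might be wired on top of a BERT-style encoder: a token-level head for Slot Masked Prediction and a sentence-level head for Sentence Noisiness Discrimination. Everything here (the NoiseAlignmentHeads module, the equal loss weighting, the random stand-in encoder outputs) is an illustrative assumption, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseAlignmentHeads(nn.Module):
    """Two pre-training heads placed on top of a BERT-style encoder (assumed)."""
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        # Slot Masked Prediction: token-level head that recovers masked slot tokens.
        self.slot_mlm_head = nn.Linear(hidden_size, vocab_size)
        # Sentence Noisiness Discrimination: sentence-level head that classifies
        # an utterance as clean (0) or perturbed (1).
        self.noise_head = nn.Linear(hidden_size, 2)

    def forward(self, token_states: torch.Tensor, cls_state: torch.Tensor):
        return self.slot_mlm_head(token_states), self.noise_head(cls_state)

def pretraining_loss(slot_logits, mlm_labels, noise_logits, noise_labels):
    # Positions labeled -100 are unmasked tokens and are ignored, following
    # the usual masked-LM convention.
    mlm_loss = F.cross_entropy(
        slot_logits.view(-1, slot_logits.size(-1)),
        mlm_labels.view(-1),
        ignore_index=-100,
    )
    noise_loss = F.cross_entropy(noise_logits, noise_labels)
    return mlm_loss + noise_loss  # equal weighting is an assumption

if __name__ == "__main__":
    # Random tensors stand in for encoder outputs: (batch, seq, hidden) token
    # states and the (batch, hidden) [CLS] state.
    B, T, H, V = 2, 8, 768, 30522
    heads = NoiseAlignmentHeads(H, V)
    token_states, cls_state = torch.randn(B, T, H), torch.randn(B, H)
    mlm_labels = torch.full((B, T), -100, dtype=torch.long)
    mlm_labels[:, 3] = 42                # pretend one slot token per utterance was masked
    noise_labels = torch.tensor([0, 1])  # first utterance clean, second perturbed
    slot_logits, noise_logits = heads(token_states, cls_state)
    print(pretraining_loss(slot_logits, mlm_labels, noise_logits, noise_labels))
```

Summing the two losses with equal weight is the simplest joint-training choice; the paper's actual balancing of the objectives, and its contrastive and adversarial fine-tuning stages, are not reproduced here.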

Authors (7)
  1. Jinxu Zhao (5 papers)
  2. Guanting Dong (46 papers)
  3. Yueyan Qiu (3 papers)
  4. Tingfeng Hui (10 papers)
  5. Xiaoshuai Song (16 papers)
  6. Daichi Guo (8 papers)
  7. Weiran Xu (58 papers)