QLSC: A Query Latent Semantic Calibrator for Robust Extractive Question Answering (2404.19316v1)

Published 30 Apr 2024 in cs.CL

Abstract: Extractive Question Answering (EQA) in Machine Reading Comprehension (MRC) often faces the challenge of dealing with semantically identical but format-variant inputs. Our work introduces a novel approach, called the "Query Latent Semantic Calibrator (QLSC)", designed as an auxiliary module for existing MRC models. We propose a unique scaling strategy to capture latent semantic center features of queries. These features are then seamlessly integrated into traditional query and passage embeddings using an attention mechanism. By deepening the comprehension of the semantic query-passage relationship, our approach diminishes sensitivity to variations in text format and boosts the model's ability to pinpoint accurate answers. Experimental results on robust Question Answering datasets confirm that our approach effectively handles format-variant but semantically identical queries, highlighting the effectiveness and adaptability of our proposed method.
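
The abstract describes two mechanisms: a scaling strategy that captures latent semantic center features of the query, and an attention-based fusion of those features with the query and passage embeddings. The PyTorch sketch below illustrates one possible reading of that description under stated assumptions; the class name QueryLatentSemanticCalibrator, the hidden size, the number of latent centers, the soft-assignment pooling, and the use of torch.nn.MultiheadAttention are illustrative choices, not details taken from the paper.

```python
# Illustrative sketch only: NOT the authors' implementation of QLSC.
import torch
import torch.nn as nn
import torch.nn.functional as F


class QueryLatentSemanticCalibrator(nn.Module):
    """Pools query token states into latent semantic center features and
    fuses them into the joint query/passage sequence via attention."""

    def __init__(self, hidden_size: int = 768, num_centers: int = 8):
        super().__init__()
        # Learnable latent semantic centers (hypothetical parameterization).
        self.centers = nn.Parameter(torch.randn(num_centers, hidden_size) * 0.02)
        self.assign = nn.Linear(hidden_size, num_centers)  # soft-assignment scores
        self.fuse = nn.MultiheadAttention(hidden_size, num_heads=8, batch_first=True)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, query_hidden, sequence_hidden, query_mask):
        # query_hidden:    (B, Lq, H) encoder states of the query tokens
        # sequence_hidden: (B, L, H)  encoder states of the joint query+passage input
        # query_mask:      (B, Lq)    1 for real query tokens, 0 for padding
        # Scaled soft assignment of each query token to the latent centers.
        logits = self.assign(query_hidden) / (query_hidden.size(-1) ** 0.5)
        logits = logits.masked_fill(query_mask.unsqueeze(-1) == 0, -1e4)
        weights = F.softmax(logits, dim=1)                        # (B, Lq, K)
        # Pool token states per center and add the learnable center vector.
        center_feats = torch.einsum("blk,blh->bkh", weights, query_hidden)
        center_feats = center_feats + self.centers.unsqueeze(0)   # (B, K, H)
        # Let the full sequence attend to the center features; fuse residually.
        fused, _ = self.fuse(sequence_hidden, center_feats, center_feats)
        return self.norm(sequence_hidden + fused)


if __name__ == "__main__":
    calib = QueryLatentSemanticCalibrator()
    q = torch.randn(2, 12, 768)        # query token states
    seq = torch.randn(2, 128, 768)     # joint query+passage states
    mask = torch.ones(2, 12, dtype=torch.long)
    print(calib(q, seq, mask).shape)   # torch.Size([2, 128, 768])
```

In this reading, each query token is softly assigned to a small set of learnable centers, the pooled center features act as format-invariant query summaries, and the joint query-passage sequence attends to them before answer-span prediction; the actual QLSC formulation may differ in how the centers are computed and fused.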

