NORMY: Non-Uniform History Modeling for Open Retrieval Conversational Question Answering (2402.04548v1)
Abstract: Open Retrieval Conversational Question Answering (OrConvQA) answers a question given a conversation as context and a document collection. A typical OrConvQA pipeline consists of three modules: a Retriever to retrieve relevant documents from the collection, a Reranker to rerank them given the question and the context, and a Reader to extract an answer span. The conversational turns can provide valuable context for answering the final query. State-of-the-art OrConvQA systems use the same history modeling for all three modules of the pipeline. We hypothesize that this is suboptimal. Specifically, we argue that a broader context is needed in the early modules of the pipeline so as not to miss relevant documents, while a narrower context is needed in the later modules to identify the exact answer span. We propose NORMY, the first unsupervised non-uniform history modeling pipeline, which generates the best conversational history for each module. We further propose a novel Retriever for NORMY, which employs keyphrase extraction on the conversation history and leverages passages retrieved in previous turns as additional context. We also created a new dataset for OrConvQA by expanding the doc2dial dataset. We implemented various state-of-the-art history modeling techniques and comprehensively evaluated them separately for each module of the pipeline on three datasets: OR-QuAC, our doc2dial extension, and ConvMix. Our extensive experiments show that NORMY outperforms the state of the art both in the individual modules and in the end-to-end system.
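The core idea of non-uniform history modeling — a broad, keyphrase-expanded context for retrieval versus a narrow, recent-turn context for reading — can be sketched roughly as follows. This is an illustrative toy, not NORMY's actual algorithm: the stopword list, the frequency-based keyphrase heuristic, and the term-overlap scorer are all simplifying assumptions standing in for the paper's unsupervised components.

```python
from collections import Counter
import math
import re

def tokenize(text):
    """Lowercase word tokenizer (illustrative)."""
    return re.findall(r"[a-z0-9]+", text.lower())

STOPWORDS = {"the", "a", "is", "of", "what", "who", "in", "and", "to", "it"}

def keyphrases(history, top_k=5):
    # Broad context for the Retriever: frequent non-stopword terms
    # drawn from the ENTIRE conversation history (a stand-in for a
    # real keyphrase extractor such as YAKE).
    counts = Counter(t for turn in history
                     for t in tokenize(turn) if t not in STOPWORDS)
    return [w for w, _ in counts.most_common(top_k)]

def score(query_terms, doc):
    # Toy term-overlap score, length-normalized.
    doc_terms = set(tokenize(doc))
    overlap = sum(1 for t in query_terms if t in doc_terms)
    return overlap / math.sqrt(len(doc_terms) + 1)

def retrieve(history, docs, top_k=2):
    # Early module: rank documents against the broad keyphrase context.
    terms = keyphrases(history)
    return sorted(docs, key=lambda d: score(terms, d), reverse=True)[:top_k]

def reader_context(history, n_last=1):
    # Late module: only the most recent turn(s), so the Reader can
    # pinpoint an exact answer span without off-topic history.
    return " ".join(history[-n_last:])
```

The asymmetry is the point: `retrieve` sees keyphrases from every turn so earlier entities (e.g. "tesla motors") still steer document ranking after the user switches to pronouns, while `reader_context` deliberately discards that breadth.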
- Muhammad Shihab Rashid
- Jannat Ara Meem
- Vagelis Hristidis