Investigating Information Inconsistency in Multilingual Open-Domain Question Answering (2205.12456v1)
Abstract: Retrieval-based open-domain question answering (QA) systems retrieve documents and then select answer spans from them to find the best answer candidates. We hypothesize that multilingual QA systems are prone to information inconsistency across documents written in different languages, because these documents tend to provide a model with varying information about the same topic. To understand the effects of the biased availability of information and of cultural influence, we analyze the behavior of multilingual open-domain QA models with a focus on retrieval bias. We analyze whether different retriever models present different passages when given the same question in different languages on TyDi QA and XOR-TyDi QA, two multilingual QA datasets. We speculate that the content differences in documents across languages might reflect cultural divergences and/or social biases.
- Shramay Palta
- Haozhe An
- Yifan Yang
- Shuaiyi Huang
- Maharshi Gor
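
The abstract's central question, whether a retriever returns different passages for the same question asked in different languages, can be illustrated with a simple set-overlap measure. The following is a minimal sketch and not the authors' method: the abstract does not name a metric, so Jaccard overlap is an assumption, as are the function names `jaccard` and `cross_lingual_overlap`, the passage IDs, and the premise that passage IDs are comparable across languages (for example, because all queries retrieve from one shared corpus).

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two collections of passage IDs."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty result sets are treated as identical
    return len(a & b) / len(a | b)


def cross_lingual_overlap(retrieved):
    """Pairwise overlap of retrieved passages for one question.

    `retrieved` maps a language code to the list of passage IDs that a
    retriever returned for that language's version of the question.
    """
    return {
        (l1, l2): jaccard(retrieved[l1], retrieved[l2])
        for l1, l2 in combinations(sorted(retrieved), 2)
    }


# Hypothetical retrieval results for one question in three languages.
retrieved = {
    "en": ["p1", "p2", "p3", "p4"],
    "ja": ["p2", "p3", "p7", "p8"],
    "ar": ["p1", "p2", "p3", "p4"],
}

scores = cross_lingual_overlap(retrieved)
for pair, score in scores.items():
    print(pair, round(score, 3))
```

A low overlap for a language pair would suggest that the retriever surfaces different evidence depending on the question language, which is the kind of inconsistency the abstract hypothesizes. Averaging these scores over all questions in a dataset would give a per-retriever summary.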