What Evidence Do Language Models Find Convincing? (2402.11782v2)
Abstract: Retrieval-augmented LLMs are increasingly being tasked with subjective, contentious, and conflicting queries such as "Is aspartame linked to cancer?". To resolve these ambiguous queries, one must search through a wide range of websites and consider "which, if any, of this evidence do I find convincing?". In this work, we study how LLMs answer this question. In particular, we construct ConflictingQA, a dataset that pairs controversial queries with a series of real-world evidence documents that contain different facts (e.g., quantitative results), argument styles (e.g., appeals to authority), and answers (Yes or No). We use this dataset to perform sensitivity and counterfactual analyses to explore which text features most affect LLM predictions. Overall, we find that current models rely heavily on the relevance of a website to the query, while largely ignoring stylistic features that humans find important, such as whether a text contains scientific references or is written in a neutral tone. Taken together, these results highlight the importance of RAG corpus quality (e.g., the need to filter misinformation) and point to a possible shift in how LLMs are trained to better align with human judgements.
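To make the setup concrete, below is a minimal sketch (not the authors' released code) of how such a counterfactual probe might look: a yes/no query is presented alongside one supporting and one refuting document, the model's verdict is recorded, and the refuting document is then rewritten (e.g., to add scientific references or a neutral tone) to see whether the verdict flips. The names `ask_llm`, `edit`, and `examples` are hypothetical placeholders for whatever chat-completion call, stylistic rewrite, and query/evidence triples you use; the actual ConflictingQA pipeline and analyses are more involved.

```python
# Hedged sketch: probing which of two conflicting evidence documents an LLM
# sides with, and how often a stylistic (counterfactual) edit flips the verdict.
# `ask_llm` is a placeholder for any chat-completion call available to you.

from typing import Callable


def build_prompt(query: str, evidence_yes: str, evidence_no: str) -> str:
    """Present the query with one supporting and one refuting document."""
    return (
        f"Question: {query}\n\n"
        f"Document A:\n{evidence_yes}\n\n"
        f"Document B:\n{evidence_no}\n\n"
        "Based only on these documents, answer Yes or No."
    )


def convinced_answer(ask_llm: Callable[[str], str],
                     query: str, evidence_yes: str, evidence_no: str) -> str:
    """Return 'Yes' or 'No' depending on which document the model sides with."""
    reply = ask_llm(build_prompt(query, evidence_yes, evidence_no)).strip().lower()
    return "Yes" if reply.startswith("yes") else "No"


def counterfactual_flip_rate(ask_llm: Callable[[str], str],
                             examples: list,
                             edit: Callable[[str], str]) -> float:
    """Fraction of (query, supporting doc, refuting doc) examples whose verdict
    changes after `edit` rewrites the refuting document (e.g., adds citations
    or a more neutral tone)."""
    flips = 0
    for query, ev_yes, ev_no in examples:
        before = convinced_answer(ask_llm, query, ev_yes, ev_no)
        after = convinced_answer(ask_llm, query, ev_yes, edit(ev_no))
        flips += before != after
    return flips / len(examples)
```

A sensitivity analysis in this spirit would sweep `edit` over several feature rewrites (references added, tone neutralized, relevance reduced) and compare the resulting flip rates, which is how one can tell which text features the model actually weighs.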