QuestGen: Effectiveness of Question Generation Methods for Fact-Checking Applications (2407.21441v2)
Abstract: Verifying fact-checking claims poses a significant challenge, even for humans. Recent approaches have demonstrated that decomposing claims into relevant questions to gather evidence improves the efficiency of the fact-checking process. In this paper, we provide empirical evidence that this question decomposition can be effectively automated. We show that smaller generative models, fine-tuned for the question generation task using data augmentation from various datasets, outperform LLMs by up to 8%. Surprisingly, in some cases, the evidence retrieved using machine-generated questions proves significantly more effective for fact-checking than that obtained from human-written questions. We also manually evaluate the generated questions to assess their quality.
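The pipeline the abstract describes (decompose a claim into sub-questions, retrieve evidence per question, then verify) can be sketched minimally as follows. This is an illustrative skeleton only: the function names are hypothetical, the template-based generator is a stand-in for the fine-tuned generative model the paper uses, and the retriever is a stub where a real system would query a search engine or document index.

```python
from dataclasses import dataclass


@dataclass
class Question:
    text: str


def generate_questions(claim: str, n: int = 3) -> list[Question]:
    """Stand-in for the fine-tuned question generator.

    The paper fine-tunes a small generative model for this step;
    fixed templates are used here purely to illustrate the interface.
    """
    templates = [
        "Is it true that {c}?",
        "What evidence supports the claim that {c}?",
        "Who reported that {c}?",
    ]
    return [Question(t.format(c=claim.rstrip("."))) for t in templates[:n]]


def retrieve_evidence(question: Question) -> list[str]:
    """Stub retriever: a real system would search the web or an index
    using the generated question as the query."""
    return [f"snippet retrieved for: {question.text}"]


def fact_check(claim: str) -> dict:
    """Run decomposition and retrieval; a downstream verdict model
    would consume the (claim, evidence) pair produced here."""
    questions = generate_questions(claim)
    evidence = [snip for q in questions for snip in retrieve_evidence(q)]
    return {
        "claim": claim,
        "questions": [q.text for q in questions],
        "evidence": evidence,
    }


result = fact_check("the Eiffel Tower was completed in 1889")
print(len(result["questions"]))  # 3
```

The key design point the paper evaluates is the `generate_questions` step: swapping the templates for a fine-tuned small model versus prompting a large LLM, and measuring how useful the retrieved evidence is for the final verdict.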