Context-aware Decoding Reduces Hallucination in Query-focused Summarization (2312.14335v2)
Abstract: Query-focused summarization (QFS) aims to provide a summary of one or more documents that satisfies the information need of a given query. It is useful for various real-world applications, such as abstractive snippet generation and, more recently, retrieval-augmented generation (RAG). A prototypical QFS pipeline consists of a retriever (sparse or dense) and a generator (usually a large language model, LLM). However, applying LLMs can lead to hallucinations, especially when the evidence contradicts the model's prior beliefs. There has been growing interest in developing new decoding methods that improve generation quality and reduce hallucination. In this work, we conduct a large-scale reproducibility study of one recently proposed decoding method, Context-aware Decoding (CAD). In addition to replicating CAD's experiments on news summarization datasets, we run experiments on QFS datasets and conduct a more rigorous analysis of computational complexity and hyperparameter sensitivity. Experiments with eight different LLMs show that CAD improves QFS quality by (1) reducing factuality errors/hallucinations while (2) mostly retaining the match of lexical patterns, as measured by ROUGE scores, at the cost of increased inference-time FLOPs and reduced decoding speed. Our implementation, based on the Hugging Face library, is available at https://github.com/zhichaoxu-shufe/context-aware-decoding-qfs
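CAD is only named in the abstract, so the following is a minimal sketch of the idea, assuming the contrastive formulation of Shi et al. (2023): the next-token distribution becomes softmax[(1 + α) · logit(y_t | c, x, y_<t) − α · logit(y_t | x, y_<t)], where c is the retrieved evidence and x the query, so the model's context-free prior is explicitly downweighted. The model name, prompt template, and α value below are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of Context-aware Decoding (CAD) with greedy decoding.
# "gpt2", the prompt template, and alpha=0.5 are placeholder assumptions;
# the paper evaluates eight different LLMs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def cad_generate(context: str, query: str, alpha: float = 0.5,
                 max_new_tokens: int = 64) -> str:
    """Greedy decoding with context-aware logit adjustment."""
    with_ctx = tokenizer(f"{context}\n\nQuery: {query}\nSummary:",
                         return_tensors="pt").input_ids
    no_ctx = tokenizer(f"Query: {query}\nSummary:",
                       return_tensors="pt").input_ids
    generated = []
    with torch.no_grad():
        for _ in range(max_new_tokens):
            # Two forward passes per step: with and without the evidence context.
            logits_ctx = model(with_ctx).logits[:, -1, :]    # logit(y_t | c, x, y_<t)
            logits_prior = model(no_ctx).logits[:, -1, :]    # logit(y_t | x, y_<t)
            adjusted = (1 + alpha) * logits_ctx - alpha * logits_prior
            next_token = adjusted.argmax(dim=-1, keepdim=True)  # greedy choice
            if next_token.item() == tokenizer.eos_token_id:
                break
            with_ctx = torch.cat([with_ctx, next_token], dim=-1)
            no_ctx = torch.cat([no_ctx, next_token], dim=-1)
            generated.append(next_token.item())
    return tokenizer.decode(generated)
```

Each decoded token requires two forward passes, one conditioned on the evidence and one on the query alone, which is the source of the increased inference-time FLOPs and reduced decoding speed noted in the abstract.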
- “Efficient index-based snippet generation” In ACM Transactions on Information Systems (TOIS) 32.2, 2014, pp. 1–24
- “On the dangers of stochastic parrots: Can language models be too big?” In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, 2021, pp. 610–623
- “Abstractive snippet generation” In Proceedings of The Web Conference 2020, 2020, pp. 1309–1319
- “Scaling instruction-finetuned language models” In arXiv preprint arXiv:2210.11416, 2022
- “FactKB: Generalizable factuality evaluation using language models enhanced with factual knowledge” In arXiv preprint arXiv:2305.08281, 2023
- “The curious case of neural text degeneration” In arXiv preprint arXiv:1904.09751, 2019
- “Survey of hallucination in natural language generation” In ACM Computing Surveys 55.12, 2023, pp. 1–38
- “Mistral 7B” In arXiv preprint arXiv:2310.06825, 2023
- “PubMedQA: A Dataset for Biomedical Research Question Answering” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) Hong Kong, China: Association for Computational Linguistics, 2019, pp. 2567–2577 DOI: 10.18653/v1/D19-1259
- “Scaling laws for neural language models” In arXiv preprint arXiv:2001.08361, 2020
- “Retrieval-augmented generation for knowledge-intensive nlp tasks” In Advances in Neural Information Processing Systems 33, 2020, pp. 9459–9474
- “Contrastive Decoding: Open-ended Text Generation as Optimization” In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Toronto, Canada: Association for Computational Linguistics, 2023, pp. 12286–12312 DOI: 10.18653/v1/2023.acl-long.687
- Chin-Yew Lin “ROUGE: A Package for Automatic Evaluation of Summaries” In Text Summarization Branches Out Barcelona, Spain: Association for Computational Linguistics, 2004, pp. 74–81 URL: https://aclanthology.org/W04-1013
- “DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts” In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) Online: Association for Computational Linguistics, 2021, pp. 6691–6706 DOI: 10.18653/v1/2021.acl-long.522
- “NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints” In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Online: Association for Computational Linguistics, 2021, pp. 4288–4299 DOI: 10.18653/v1/2021.naacl-main.339
- Gary Marchionini “Exploratory search: from finding to understanding” In Communications of the ACM 49.4, 2006, pp. 41–46
- “Rethinking search: making domain experts out of dilettantes” In ACM SIGIR Forum 55.1, 2021, pp. 1–27
- Shashi Narayan, Shay B. Cohen and Mirella Lapata “Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization” In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing Brussels, Belgium: Association for Computational Linguistics, 2018, pp. 1797–1807 DOI: 10.18653/v1/D18-1206
- “Diversity driven attention model for query-based abstractive summarization” In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Vancouver, Canada: Association for Computational Linguistics, 2017, pp. 1063–1072 DOI: 10.18653/v1/P17-1098
- “Contrastive decoding improves reasoning in large language models” In arXiv preprint arXiv:2309.09117, 2023
- Liam van der Poel, Ryan Cotterell and Clara Meister “Mutual information alleviates hallucinations in abstractive summarization” In arXiv preprint arXiv:2210.13210, 2022
- “Exploring the limits of transfer learning with a unified text-to-text transformer” In The Journal of Machine Learning Research 21.1, 2020, pp. 5485–5551
- Chirag Shah and Emily M Bender “Situating search” In Proceedings of the 2022 Conference on Human Information Interaction and Retrieval, 2022, pp. 221–232
- “Trusting Your Evidence: Hallucinate Less with Context-aware Decoding” In arXiv preprint arXiv:2305.14739, 2023
- “Retrieval Augmentation Reduces Hallucination in Conversation” In Findings of the Association for Computational Linguistics: EMNLP 2021 Punta Cana, Dominican Republic: Association for Computational Linguistics, 2021, pp. 3784–3803 DOI: 10.18653/v1/2021.findings-emnlp.320
- Abigail See, Peter J. Liu and Christopher D. Manning “Get To The Point: Summarization with Pointer-Generator Networks” In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Vancouver, Canada: Association for Computational Linguistics, 2017, pp. 1073–1083 DOI: 10.18653/v1/P17-1099
- MosaicML NLP Team “Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs” Accessed: 2023-05-05, 2023 URL: www.mosaicml.com/blog/mpt-7b
- “Fine-tuning Language Models for Factuality” In arXiv preprint arXiv:2311.08401, 2023
- “LLaMA: Open and efficient foundation language models” In arXiv preprint arXiv:2302.13971, 2023
- “A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation” In arXiv preprint arXiv:2307.03987, 2023
- “A Lightweight Constrained Generation Alternative for Query-focused Summarization” In arXiv preprint arXiv:2304.11721, 2023
- “Counterfactual Editing for Search Result Explanation” In arXiv preprint arXiv:2301.10389, 2023
- “BERTScore: Evaluating text generation with BERT” In arXiv preprint arXiv:1904.09675, 2019
Author: Zhichao Xu