Context-aware Decoding Reduces Hallucination in Query-focused Summarization (2312.14335v2)

Published 21 Dec 2023 in cs.CL and cs.IR

Abstract: Query-focused summarization (QFS) aims to provide a summary of a single document or multiple documents that satisfies the information need of a given query. It is useful for various real-world applications, such as abstractive snippet generation and, more recently, retrieval-augmented generation (RAG). A prototypical QFS pipeline consists of a retriever (sparse or dense retrieval) and a generator (usually a large language model, LLM). However, applying LLMs can lead to hallucinations, especially when the evidence contradicts the model's prior beliefs. There has been growing interest in developing new decoding methods that improve generation quality and reduce hallucination. In this work, we conduct a large-scale reproducibility study of one recently proposed decoding method, Context-aware Decoding (CAD). In addition to replicating CAD's experiments on news summarization datasets, we include experiments on QFS datasets and conduct a more rigorous analysis of computational complexity and hyperparameter sensitivity. Experiments with eight different LLMs show that CAD improves QFS quality by (1) reducing factuality errors/hallucinations while (2) mostly retaining the match of lexical patterns, as measured by ROUGE scores, at the cost of increased inference-time FLOPs and reduced decoding speed. The code implementation, based on the Huggingface library, is available at https://github.com/zhichaoxu-shufe/context-aware-decoding-qfs
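
The core of CAD is a contrastive adjustment of the next-token distribution: logits computed with the retrieved evidence in the prompt are upweighted against logits computed without it, so tokens supported by the evidence are favored over the model's parametric prior. Below is a minimal sketch assuming a Huggingface causal LM, an illustrative prompt format, and alpha = 0.5; the model name, prompt templates, and helper function are assumptions for illustration, not the authors' exact setup (see the linked repository for their implementation).

```python
# Sketch of Context-aware Decoding (CAD), following Shi et al. (2023):
#   scores = (1 + alpha) * logits(y | context, query, prefix)
#            - alpha     * logits(y | query, prefix)
# Model choice and prompt templates are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-v0.1"  # placeholder; any causal LM works
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device).eval()

def cad_generate(document: str, query: str, alpha: float = 0.5, max_new_tokens: int = 128) -> str:
    """Greedy decoding with CAD-adjusted logits (hypothetical helper, not the repo's API)."""
    ids_ctx = tokenizer(f"Document: {document}\nQuery: {query}\nSummary:",
                        return_tensors="pt").input_ids.to(device)
    ids_noctx = tokenizer(f"Query: {query}\nSummary:",
                          return_tensors="pt").input_ids.to(device)
    generated = []
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits_ctx = model(ids_ctx).logits[:, -1, :]      # conditioned on the evidence
            logits_noctx = model(ids_noctx).logits[:, -1, :]  # evidence dropped (parametric prior)
        # Contrastive combination: amplify evidence-conditioned logits, subtract the prior.
        scores = (1 + alpha) * logits_ctx - alpha * logits_noctx
        next_id = scores.argmax(dim=-1, keepdim=True)
        if next_id.item() == tokenizer.eos_token_id:
            break
        generated.append(next_id.item())
        # Both branches share the same generated prefix.
        ids_ctx = torch.cat([ids_ctx, next_id], dim=-1)
        ids_noctx = torch.cat([ids_noctx, next_id], dim=-1)
    return tokenizer.decode(generated, skip_special_tokens=True)
```

Note that each decoding step requires two forward passes (with and without the evidence), which is consistent with the increased inference-time FLOPs and reduced decoding speed reported in the abstract.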

References (33)
  1. “Efficient index-based snippet generation” In ACM Transactions on Information Systems (TOIS) 32.2 ACM New York, NY, USA, 2014, pp. 1–24
  2. “On the dangers of stochastic parrots: Can language models be too big?” In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, 2021, pp. 610–623
  3. “Abstractive snippet generation” In Proceedings of The Web Conference 2020, 2020, pp. 1309–1319
  4. “Scaling instruction-finetuned language models” In arXiv preprint arXiv:2210.11416, 2022
  5. “FactKB: Generalizable factuality evaluation using language models enhanced with factual knowledge” In arXiv preprint arXiv:2305.08281, 2023
  6. “The curious case of neural text degeneration” In arXiv preprint arXiv:1904.09751, 2019
  7. “Survey of hallucination in natural language generation” In ACM Computing Surveys 55.12 ACM New York, NY, 2023, pp. 1–38
  8. “Mistral 7B” In arXiv preprint arXiv:2310.06825, 2023
  9. “PubMedQA: A Dataset for Biomedical Research Question Answering” In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) Hong Kong, China: Association for Computational Linguistics, 2019, pp. 2567–2577 DOI: 10.18653/v1/D19-1259
  10. “Scaling laws for neural language models” In arXiv preprint arXiv:2001.08361, 2020
  11. “Retrieval-augmented generation for knowledge-intensive nlp tasks” In Advances in Neural Information Processing Systems 33, 2020, pp. 9459–9474
  12. “Contrastive Decoding: Open-ended Text Generation as Optimization” In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Toronto, Canada: Association for Computational Linguistics, 2023, pp. 12286–12312 DOI: 10.18653/v1/2023.acl-long.687
  13. Chin-Yew Lin “ROUGE: A Package for Automatic Evaluation of Summaries” In Text Summarization Branches Out Barcelona, Spain: Association for Computational Linguistics, 2004, pp. 74–81 URL: https://aclanthology.org/W04-1013
  14. “DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts” In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) Online: Association for Computational Linguistics, 2021, pp. 6691–6706 DOI: 10.18653/v1/2021.acl-long.522
  15. “NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints” In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Online: Association for Computational Linguistics, 2021, pp. 4288–4299 DOI: 10.18653/v1/2021.naacl-main.339
  16. Gary Marchionini “Exploratory search: from finding to understanding” In Communications of the ACM 49.4 ACM New York, NY, USA, 2006, pp. 41–46
  17. “Rethinking search: making domain experts out of dilettantes” In ACM SIGIR Forum 55.1, 2021, pp. 1–27 ACM New York, NY, USA
  18. Shashi Narayan, Shay B. Cohen and Mirella Lapata “Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization” In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing Brussels, Belgium: Association for Computational Linguistics, 2018, pp. 1797–1807 DOI: 10.18653/v1/D18-1206
  19. “Diversity driven attention model for query-based abstractive summarization” In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Vancouver, Canada: Association for Computational Linguistics, 2017, pp. 1063–1072 DOI: 10.18653/v1/P17-1098
  20. “Contrastive decoding improves reasoning in large language models” In arXiv preprint arXiv:2309.09117, 2023
  21. Liam van der Poel, Ryan Cotterell and Clara Meister “Mutual information alleviates hallucinations in abstractive summarization” In arXiv preprint arXiv:2210.13210, 2022
  22. “Exploring the limits of transfer learning with a unified text-to-text transformer” In The Journal of Machine Learning Research 21.1 JMLR.org, 2020, pp. 5485–5551
  23. Chirag Shah and Emily M Bender “Situating search” In Proceedings of the 2022 Conference on Human Information Interaction and Retrieval, 2022, pp. 221–232
  24. “Trusting Your Evidence: Hallucinate Less with Context-aware Decoding” In arXiv preprint arXiv:2305.14739, 2023
  25. “Retrieval Augmentation Reduces Hallucination in Conversation” In Findings of the Association for Computational Linguistics: EMNLP 2021 Punta Cana, Dominican Republic: Association for Computational Linguistics, 2021, pp. 3784–3803 DOI: 10.18653/v1/2021.findings-emnlp.320
  26. Abigail See, Peter J. Liu and Christopher D. Manning “Get To The Point: Summarization with Pointer-Generator Networks” In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Vancouver, Canada: Association for Computational Linguistics, 2017, pp. 1073–1083 DOI: 10.18653/v1/P17-1099
  27. MosaicML NLP Team “Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs” Accessed: 2023-05-05, 2023 URL: www.mosaicml.com/blog/mpt-7b
  28. “Fine-tuning Language Models for Factuality” In arXiv preprint arXiv:2311.08401, 2023
  29. “LLaMA: Open and efficient foundation language models” In arXiv preprint arXiv:2302.13971, 2023
  30. “A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation”, 2023 arXiv:2307.03987 [cs.CL]
  31. “A Lightweight Constrained Generation Alternative for Query-focused Summarization” In arXiv preprint arXiv:2304.11721, 2023
  32. “Counterfactual Editing for Search Result Explanation” In arXiv preprint arXiv:2301.10389, 2023
  33. “BERTScore: Evaluating text generation with BERT” In arXiv preprint arXiv:1904.09675, 2019
Authors (1)
  1. Zhichao Xu (30 papers)
Citations (7)