
Fakes of Varying Shades: How Warning Affects Human Perception and Engagement Regarding LLM Hallucinations (2404.03745v3)

Published 4 Apr 2024 in cs.HC, cs.AI, and cs.CL

Abstract: The widespread adoption and transformative effects of LLMs have sparked concerns regarding their capacity to produce inaccurate and fictitious content, referred to as 'hallucinations'. Given the potential risks associated with hallucinations, humans should be able to identify them. This research aims to understand the human perception of LLM hallucinations by systematically varying the degree of hallucination (genuine, minor hallucination, major hallucination) and examining its interaction with warning (i.e., a warning of potential inaccuracies: absent vs. present). Participants (N=419) from Prolific rated the perceived accuracy and engaged with content (e.g., like, dislike, share) in a Q/A format. Participants ranked content as truthful in the order of genuine, minor hallucination, and major hallucination, and user engagement behaviors mirrored this pattern. More importantly, we observed that warning improved the detection of hallucination without significantly affecting the perceived truthfulness of genuine content. We conclude by offering insights for future tools to aid human detection of hallucinations. All survey materials, demographic questions, and post-session questions are available at: https://github.com/MahjabinNahar/fakes-of-varying-shades-survey-materials

Authors (5)
  1. Mahjabin Nahar (4 papers)
  2. Haeseung Seo (2 papers)
  3. Eun-Ju Lee (6 papers)
  4. Aiping Xiong (8 papers)
  5. Dongwon Lee (65 papers)

Summary

Exploring the Human Detection of LLM-Generated Hallucinations: The Role of Warning Cues

Introduction

The proliferation of LLMs such as GPT-3 has heightened concerns over their tendency to generate content that deviates from factual correctness, commonly called hallucinations. Such hallucinations pose substantial risks, particularly in sensitive domains like law and medicine, because they can disseminate inaccurate information. This paper examines how well humans can identify LLM hallucinations and how warnings affect users' accuracy perceptions and engagement behaviors when they are presented with genuine and hallucinated content.

Human Detection of Hallucinations

The study recruited 419 participants from Prolific and systematically varied the degree of hallucination in LLM-generated content. Participants were exposed, in a question-answering format, to content in three categories (genuine, minor hallucination, and major hallucination), with or without a warning about potential inaccuracies. The research focused on how these variations influenced participants' perceptions of content accuracy and their engagement actions (likes, dislikes, shares).
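
To make the design concrete, the following is a minimal sketch of how such a 3 (hallucination degree) × 2 (warning) condition structure could be laid out in code. It assumes, purely for illustration, that warning is manipulated between subjects and hallucination degree within subjects; the column names and assignment scheme are hypothetical, not the authors' exact procedure.

```python
# Minimal sketch of a 3 (hallucination degree) x 2 (warning) study layout.
# Assumption (not from the paper): warning is between-subjects and
# hallucination degree is within-subjects; names are illustrative only.
import random

import pandas as pd

HALLUCINATION_LEVELS = ["genuine", "minor_hallucination", "major_hallucination"]
WARNING_CONDITIONS = ["warning_absent", "warning_present"]


def build_design(n_participants: int = 419, seed: int = 0) -> pd.DataFrame:
    """Assign each participant to one warning condition and expose them
    to all three hallucination levels."""
    rng = random.Random(seed)
    rows = []
    for pid in range(n_participants):
        warning = rng.choice(WARNING_CONDITIONS)
        for level in HALLUCINATION_LEVELS:
            rows.append({"participant": pid, "warning": warning, "hallucination": level})
    return pd.DataFrame(rows)


design = build_design()
print(design.groupby(["warning", "hallucination"]).size())
```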

Key Findings

Impact of Warning on Perception and Engagement:

  • The presence of a warning notably improved participants' detection of hallucinated content without adversely affecting their perception of the accuracy of genuine content.
  • Warnings also significantly increased the likelihood that hallucinated content would be disliked, reinforcing their utility in fostering skepticism toward dubious content. Likes and shares, however, were not significantly affected by warnings, hinting at a complex interplay of factors beyond content veracity that governs user engagement; a minimal analysis sketch follows this list.
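
As a rough illustration of how such an interaction could be tested, the sketch below fits an ordinary two-way ANOVA on simulated perceived-accuracy ratings with hallucination degree and warning as factors. The data, column names, and model choice are assumptions for illustration only; the paper's actual analyses (for example, repeated-measures or mixed-effects models) may differ.

```python
# Hypothetical two-way ANOVA: perceived accuracy ~ hallucination level * warning.
# The DataFrame, its columns, and the simulated effect sizes are illustrative
# assumptions, not the authors' data or exact statistical procedure.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
levels = ["genuine", "minor", "major"]
ratings = pd.DataFrame({
    "hallucination": rng.choice(levels, size=600),
    "warning": rng.choice(["absent", "present"], size=600),
})
# Simulated ratings: genuine content rated highest, and warnings lowering
# ratings for hallucinated content (matching the reported direction of effects).
base = ratings["hallucination"].map({"genuine": 5.5, "minor": 4.0, "major": 2.5})
penalty = ((ratings["warning"] == "present") & (ratings["hallucination"] != "genuine")) * 0.5
ratings["perceived_accuracy"] = base - penalty + rng.normal(0, 1, size=600)

model = smf.ols("perceived_accuracy ~ C(hallucination) * C(warning)", data=ratings).fit()
print(anova_lm(model, typ=2))
```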

Differential Human Reaction to Hallucination Levels:

  • Participants showed a clear stratification in perceived truthfulness: genuine content was rated most accurate, followed by minor hallucinations, with major hallucinations rated least accurate.
  • This differentiation extended to engagement behaviors: genuine content received more likes and shares, indicating a preference for accuracy in user interactions, while major hallucinations elicited the most dislikes, reflecting an intuitive aversion to clearly fabricated content.

Correlation between Perceived Accuracy and Engagement Behaviors:

  • A notable correlation was observed between perceived accuracy and engagement: content deemed more accurate was more likely to be liked and shared, indicating that perceived truthfulness can be a significant driver of user engagement on digital platforms (an illustrative correlation sketch follows below).
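
As a minimal illustration, the sketch below computes a Spearman rank correlation between perceived-accuracy ratings and binary like/share indicators. The tiny hand-made dataset and the column names are hypothetical and serve only to show the computation, not to reproduce the paper's results.

```python
# Illustrative correlation between perceived accuracy and engagement.
# Assumes a DataFrame with a numeric `perceived_accuracy` column and
# binary `liked` / `shared` columns; all values and names are hypothetical.
import pandas as pd
from scipy.stats import spearmanr

df = pd.DataFrame({
    "perceived_accuracy": [6, 5, 4, 2, 1, 5, 3, 2],
    "liked":              [1, 1, 1, 0, 0, 1, 0, 0],
    "shared":             [1, 0, 1, 0, 0, 1, 0, 0],
})

for behavior in ["liked", "shared"]:
    rho, p = spearmanr(df["perceived_accuracy"], df[behavior])
    print(f"{behavior}: Spearman rho = {rho:.2f}, p = {p:.3f}")
```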

Implications and Future Directions

This paper underscores the potential of warnings as a simple yet effective tool to mitigate the risks of LLM hallucinations: they enhance human discernment without inducing undue skepticism toward genuine content. These insights hold considerable practical relevance, especially for developers and policymakers deploying LLM technologies in information-sensitive settings. The findings also provide a foundation for future work on the human factors that influence the perception and dissemination of LLM-generated content, and they motivate investigation into computational tools that support human detection of hallucinations, potentially incorporating user feedback to continually refine model outputs.

Conclusion

The research provides critical insights into the human capacity to discern LLM-generated genuine content from hallucinated variants and highlights the efficacy of warning cues in enhancing discernment without compromising the perception of genuine content. As the adoption of LLM technologies continues to grow, understanding and improving human-machine interaction paradigms will be pivotal in harnessing the full potential of these models while safeguarding against the dissemination of misinformation.
