Large Language Models can impersonate politicians and other public figures (2407.12855v1)

Published 9 Jul 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Modern AI technology like LLMs has the potential to pollute the public information sphere with made-up content, which poses a significant threat to the cohesion of societies at large. A wide range of research has shown that LLMs are capable of generating text of impressive quality, including persuasive political speech, text with a pre-defined style, and role-specific content. But there is a crucial gap in the literature: We lack large-scale and systematic studies of how capable LLMs are in impersonating political and societal representatives and how the general public judges these impersonations in terms of authenticity, relevance and coherence. We present the results of a study based on a cross-section of British society that shows that LLMs are able to generate responses to debate questions that were part of a broadcast political debate programme in the UK. The impersonated responses are judged to be more authentic and relevant than the original responses given by people who were impersonated. This shows two things: (1) LLMs can be made to contribute meaningfully to the public political debate and (2) there is a dire need to inform the general public of the potential harm this can have on society.

Authors (4)
  1. Steffen Herbold (42 papers)
  2. Alexander Trautsch (13 papers)
  3. Zlata Kikteva (2 papers)
  4. Annette Hautli-Janisz (3 papers)

Summary

LLMs Can Impersonate Politicians and Other Public Figures

The paper "LLMs can impersonate politicians and other public figures" by Steffen Herbold, Alexander Trautsch, Zlata Kikteva, and Annette Hautli-Janisz tackles the potential and risks associated with the use of LLMs in political discourse. The authors rigorously explore the capability of LLMs, particularly ChatGPT, to impersonate public figures and generate political content that passes as authentic to the general public. This paper addresses crucial gaps in the literature by evaluating how the public perceives these generated impersonations in terms of authenticity, relevance, and coherence.

Study Overview

The central objectives of this paper are captured in two research questions: (1) how citizens rate the authenticity, relevance, and coherence of LLM-generated debate responses compared with the actual responses, and (2) how exposure to these AI-generated responses affects the public's attitude towards AI in public debates. The researchers built a dataset from thirty episodes of the BBC's "Question Time," covering responses from a diverse set of public figures, and used this data to assess the LLM's impersonation capabilities.

Methodology

The paper utilized ChatGPT 4 Turbo to generate debate responses by prompting it with biographical information and specific questions previously addressed by the actual figures. A comprehensive survey was administered to a representative sample of British citizens (n=948), who evaluated these responses without knowing that some were AI-generated. The assessment metrics included authenticity, coherence, relevance, and content similarity, alongside the participants' familiarity with the speakers and their confidence in their judgments.
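
The summary does not reproduce the authors' exact prompts, so the sketch below only illustrates what biography-conditioned impersonation prompting with the OpenAI Python client could look like. The model name, system-message wording, and helper function are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch of biography-conditioned impersonation prompting.
# Assumptions: the model name, prompt wording, and helper structure are
# illustrative; they are not the authors' exact prompts or parameters.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def impersonated_response(name: str, biography: str, question: str) -> str:
    """Ask the model to answer a debate question in the voice of a public figure."""
    completion = client.chat.completions.create(
        model="gpt-4-turbo",  # assumed stand-in for "ChatGPT 4 Turbo"
        messages=[
            {
                "role": "system",
                "content": (
                    f"You are {name}. Biography: {biography} "
                    "Answer the following debate question as this person would, "
                    "in roughly 100 words."
                ),
            },
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content


# Example usage with placeholder inputs:
# print(impersonated_response("Jane Doe MP", "Member of Parliament since 2015 ...",
#                             "Should the UK raise the minimum wage?"))
```

Placing the biography in the system message and the debate question in the user message mirrors the role-conditioning described above, but the exact split is a design choice rather than something the paper specifies.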

Results

Impersonation Credibility

The results presented a significant finding: LLM-generated responses were frequently rated higher in authenticity, coherence, and relevance compared to actual responses, as depicted in the statistical distributions and effect sizes (authenticity d = 0.66, coherence d = 1.25, relevance d = 1.23). When participants directly compared actual and AI-generated responses, the perceived authenticity difference narrowed (d = 0.22), but the coherence and relevance advantages remained robust.
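
For reference, the reported d values are standardized mean differences; assuming they are Cohen's d (the conventional reading of this statistic), the definition is:

```latex
% Cohen's d: standardized difference between the mean ratings of the
% AI-generated and the original responses (assumed interpretation of the
% d values quoted above).
d = \frac{\bar{x}_{\mathrm{AI}} - \bar{x}_{\mathrm{human}}}{s_{\mathrm{pooled}}},
\qquad
s_{\mathrm{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
```

By the usual rule of thumb (0.2 small, 0.5 medium, 0.8 large), the coherence and relevance gaps are large effects, while the direct-comparison authenticity gap of 0.22 is small.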

Content and Linguistic Analysis

Interestingly, the authenticity ratings were generally unaffected by content similarity, with many participants unable to distinguish between genuine and AI-generated responses even when the content differed significantly. Linguistic analysis revealed noteworthy stylistic differences: AI responses exhibited higher lexical diversity and nominalization rates, fewer epistemic markers, and a more substantial overlap between question and response vocabulary.
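
The summary does not state how these stylistic measures were operationalized. A minimal sketch, assuming lexical diversity is approximated as a type-token ratio and question-response overlap as Jaccard similarity over word sets (the paper's actual measures may differ):

```python
# Minimal sketch of two of the stylistic metrics mentioned above.
# Assumptions: lexical diversity as a type-token ratio and question-response
# overlap as Jaccard similarity; the paper's exact operationalization may differ.
import re


def tokens(text: str) -> list[str]:
    """Lowercased word tokens via a simple regex tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text: str) -> float:
    """Lexical diversity: distinct word types divided by total tokens."""
    toks = tokens(text)
    return len(set(toks)) / len(toks) if toks else 0.0


def vocabulary_overlap(question: str, response: str) -> float:
    """Jaccard similarity between question and response vocabularies."""
    q, r = set(tokens(question)), set(tokens(response))
    return len(q & r) / len(q | r) if q | r else 0.0


# Example: compare an original and an AI-generated answer to the same question.
# print(type_token_ratio(ai_answer), type_token_ratio(human_answer))
# print(vocabulary_overlap(question, ai_answer))
```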

Public Perception

Public opinion on AI's role in public debates shifted after exposure to AI's capabilities. Participants were initially skeptical about the utility of AI in debates, but their appreciation for it increased after they recognized the quality of AI-generated responses. However, concerns about transparency and regulation were prominent, with over 85% of participants advocating for explicit disclosure and comprehensive regulation of AI use in public discourse.

Implications

The implications of these findings are multifaceted. Practically, the ability of LLMs to generate coherent and relevant political discourse at scale poses risks of misinformation and authenticity confusion. Theoretically, the paper challenges the understanding of human-AI interaction in political communication, suggesting the necessity for stronger content moderation mechanisms and regulatory frameworks.

Conclusion and Future Directions

This research underscores the dual-edged nature of LLMs in political discourse. While they represent an advancement in text generation technologies, their potential to undermine trust and authenticity in political communication is concerning. Future research should focus on enhancing detection methods for AI-generated content and developing policies that ensure the responsible use of AI in public spheres. As LLMs continue to evolve, their role in shaping public opinion necessitates careful examination and proactive governance.