The Levers of Political Persuasion with Conversational AI (2507.13919v1)

Published 18 Jul 2025 in cs.CL, cs.AI, cs.CY, and cs.HC

Abstract: There are widespread fears that conversational AI could soon exert unprecedented influence over human beliefs. Here, in three large-scale experiments (N=76,977), we deployed 19 LLMs, including some post-trained explicitly for persuasion, to evaluate their persuasiveness on 707 political issues. We then checked the factual accuracy of 466,769 resulting LLM claims. Contrary to popular concerns, we show that the persuasive power of current and near-future AI is likely to stem more from post-training and prompting methods (which boosted persuasiveness by as much as 51% and 27% respectively) than from personalization or increasing model scale. We further show that these methods increased persuasion by exploiting LLMs' unique ability to rapidly access and strategically deploy information and that, strikingly, where they increased AI persuasiveness they also systematically decreased factual accuracy.

Summary

  • The paper demonstrates that post-training and prompt strategies significantly enhance persuasive efficacy in political discourse.
  • The paper finds that increased information density boosts persuasion but compromises factual accuracy, with the most persuasive conditions yielding up to 29.7% inaccurate claims.
  • The paper shows that personalization yields only marginal gains compared to robust post-training and that multi-turn conversations outperform static messages.

The Levers of Political Persuasion with Conversational AI

This paper presents a comprehensive empirical investigation into the determinants of political persuasiveness in conversational AI, leveraging three large-scale experiments (N=76,977) and 19 LLMs across 707 political issues. The paper systematically interrogates the effects of model scale, post-training, prompting, and personalization on persuasive efficacy, and quantifies the trade-off between persuasiveness and factual accuracy in AI-generated political discourse.

Experimental Design and Scope

The authors employ a robust, multi-factorial experimental design, randomizing UK participants to interact with LLMs under varying conditions: model family and scale, post-training regime, prompting strategy (eight theoretically motivated rhetorical styles), and degree of personalization. Persuasion is operationalized as the shift in participants' policy attitudes from pre- to post-interaction, benchmarked against a control group. The dataset encompasses over 91,000 persuasive conversations and 466,769 fact-checkable claims, with claim accuracy assessed via both LLM-based and professional human fact-checking.
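
The percentage-point effects reported below can be read as treatment-control differences in post-conversation attitudes. A minimal sketch of one common way to estimate such an effect, an ANCOVA-style regression adjusting for pre-conversation attitudes, is shown here; the column names and the 0-100 attitude scale are illustrative assumptions, not the paper's exact schema.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative data: one row per participant. Column names and the
# 0-100 attitude scale are assumptions, not the study's actual schema.
df = pd.DataFrame({
    "pre_attitude":  [35, 60, 48, 72, 55, 41],  # agreement before the conversation
    "post_attitude": [42, 63, 47, 78, 60, 40],  # agreement after the conversation
    "treated":       [1, 1, 0, 1, 1, 0],        # 1 = conversed with the LLM, 0 = control
})

# ANCOVA-style estimate: the coefficient on `treated` is the persuasion
# effect in percentage points, benchmarked against the control group.
model = smf.ols("post_attitude ~ treated + pre_attitude", data=df).fit()
print(model.params["treated"])
```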

Key Findings

Model Scale and Persuasiveness

  • Scaling returns are positive but modest: For chat-tuned models (uniform post-training), persuasive impact increases approximately linearly with the log of effective compute, with a +1.83pp (95% CI [1.42, 2.25]) gain in persuasion per order of magnitude (a fitted form is sketched after this list). Among developer post-trained (frontier) models, however, this association is not robust (+0.32pp, 95% CI [-1.18, 1.85]), indicating confounding by heterogeneous post-training.
  • Post-training eclipses scale: The difference in persuasiveness between two GPT-4o deployments (identical scale, different post-training) is +3.50pp, exceeding the predicted gain from scaling compute by 100x (+3.19pp). Thus, post-training is a more potent lever for persuasion than further scaling at the current frontier.
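
Concretely, the chat-tuned scaling result corresponds to a log-linear fit of roughly the form below; this is a paraphrase of the reported estimate, not the authors' exact specification.

```latex
% Log-linear scaling fit for chat-tuned models (paraphrase of the reported estimate);
% persuasion is measured in percentage points (pp) of attitude change vs. control.
\mathrm{persuasion}_i \;\approx\; \alpha + \beta \,\log_{10}\!\bigl(\mathrm{effective\ compute}_i\bigr) + \varepsilon_i,
\qquad \hat{\beta} \approx 1.83~\mathrm{pp\ per\ order\ of\ magnitude},\quad 95\%~\mathrm{CI}~[1.42,\,2.25].
```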

Post-Training and Prompting

  • Reward modeling (RM) is highly effective: RM post-training, which uses a reward model to select the most persuasive of several candidate responses at each turn (see the best-of-N sketch after this list), increases persuasion by +2.32pp (p<.001) in open-source models, with negligible gains from supervised fine-tuning (SFT). Notably, RM elevates a small Llama3.1-8B model to parity with GPT-4o (the 8/24 release) in persuasive effect.
  • Prompting for information density is critical: Of eight rhetorical strategies, prompts instructing the model to provide information (facts/evidence) yield the largest persuasion gains (+2.29pp over basic prompt, 27% more persuasive). Other strategies, including moral reframing and deep canvassing, are less effective or even counterproductive.
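
The RM approach described above amounts to best-of-N selection at each conversational turn: sample several candidate replies and keep the one a persuasion-trained reward model scores highest. Below is a minimal sketch; `sample_replies` and `score_persuasiveness` are hypothetical stand-ins for an LLM sampler and a trained reward model, and n=8 is an arbitrary choice, not the paper's setting.

```python
from typing import Callable, List

def best_of_n_reply(
    conversation: List[dict],
    sample_replies: Callable[[List[dict], int], List[str]],
    score_persuasiveness: Callable[[List[dict], str], float],
    n: int = 8,
) -> str:
    """Return the candidate reply the reward model rates as most persuasive.

    `sample_replies` (an LLM sampler) and `score_persuasiveness` (a persuasion
    reward model) are hypothetical placeholders, not the authors' implementation.
    """
    candidates = sample_replies(conversation, n)
    return max(candidates, key=lambda reply: score_persuasiveness(conversation, reply))
```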

Personalization

  • Marginal returns to personalization: Across three personalization methods (prompt-based, SFT on personalized data, personalized RM), the average effect is +0.43pp (95% CI [0.22, 0.64]), never exceeding 1pp. This is substantially smaller than gains from post-training or information-based prompting, challenging the prevailing narrative that microtargeting is the primary risk vector for AI persuasion.
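
Of the three methods, prompt-based personalization is the simplest to illustrate: participant attributes are injected into the system prompt before the conversation begins. The sketch below is a hypothetical illustration; the field names and wording are assumptions, not the study's actual personalization prompt.

```python
def personalized_system_prompt(issue: str, stance: str, profile: dict) -> str:
    """Build a persuasion prompt tailored to a participant profile.

    The profile fields (age, gender, politics) are illustrative assumptions,
    not the study's actual personalization schema.
    """
    return (
        f"You are discussing the political issue: {issue}.\n"
        f"Argue persuasively for the following stance: {stance}.\n"
        f"Tailor your arguments to this user: age {profile.get('age', 'unknown')}, "
        f"gender {profile.get('gender', 'unknown')}, "
        f"political leaning {profile.get('politics', 'unknown')}."
    )
```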

Mechanisms: Information Density and Accuracy

  • Information density is the primary mediator: The number of fact-checkable claims per conversation (information density) is strongly correlated with persuasion (r=0.76), explaining 44% of the variance in persuasive effects (75% for developer post-trained models). Each additional claim yields +0.30pp in persuasion (see the regression sketch after this list).
  • Trade-off with factual accuracy: Increases in persuasion—via post-training or information prompts—systematically decrease claim accuracy. For example, information-prompted GPT-4.5 produces only 56–72% accurate claims (vs. 70–82% for other prompts), and RM post-training reduces accuracy by 2.22pp. The most persuasive models and strategies are also the least accurate, with maximal-persuasion conditions yielding 29.7% inaccurate claims (vs. 16% average).
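
The conversation-level relationship described above (attitude shift regressed on the number of fact-checkable claims) can be illustrated with a toy least-squares fit. The numbers below are synthetic and only reproduce the form of the analysis, not the paper's data.

```python
import numpy as np

# Synthetic conversation-level data: fact-checkable claims per conversation
# and the measured attitude shift in pp. Values are illustrative only.
claims_per_conversation = np.array([5, 12, 20, 8, 30, 16, 25, 10])
attitude_shift_pp       = np.array([3.1, 5.0, 7.2, 4.0, 9.5, 6.1, 8.0, 4.6])

# Simple least-squares slope: the analogue of the paper's +0.30pp per claim.
slope, intercept = np.polyfit(claims_per_conversation, attitude_shift_pp, 1)
correlation = np.corrcoef(claims_per_conversation, attitude_shift_pp)[0, 1]
print(f"slope = {slope:.2f} pp per claim, r = {correlation:.2f}, r^2 = {correlation**2:.2f}")
```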

Durability and Format

  • Conversational format is superior: Multi-turn AI conversation is 41–52% more persuasive than static AI-generated messages of equivalent content.
  • Attitude change is durable: 36–42% of the immediate persuasive effect persists after one month.

Implications

Practical

  • Post-training and prompt engineering are the dominant levers: Actors with access to advanced post-training techniques (especially RM) can achieve substantial persuasive gains, even with sub-frontier models. This lowers the barrier for deploying highly persuasive AI, including by malicious actors.
  • Information-based persuasion is scalable and effective: LLMs' ability to generate high volumes of information-dense content underpins their persuasive advantage over humans and static messaging.
  • Accuracy risks are inherent: Optimizing for persuasion, especially via information density, increases the risk of disseminating inaccurate or misleading claims, with potential for large-scale epistemic harm.

Theoretical

  • Scaling laws for persuasion are sublinear and easily dominated by post-training: The findings challenge the assumption that model scale is the primary driver of emergent persuasive capabilities.
  • Personalization is less consequential than previously theorized: The marginal effect of personalization, even with rich user data, suggests that the main risk from AI persuasion lies in scalable, information-dense argumentation rather than microtargeted messaging.
  • Persuasion-accuracy trade-off is robust: The negative correlation between persuasiveness and accuracy is not explained by deliberate fabrication; rather, it appears to be a byproduct of maximizing information density.

Future Directions

  • Mitigation strategies must address the persuasion-accuracy trade-off: Alignment and safety interventions should explicitly target the mechanisms by which information density is increased at the expense of accuracy.
  • Evaluation frameworks should prioritize human-AI interaction: Static benchmarks are insufficient; large-scale, randomized human interaction evaluations are necessary to capture real-world persuasion risks.
  • Limits to real-world impact: While the experimental effects are substantial, practical constraints—such as user willingness to engage in lengthy, information-dense conversations—may limit the realized impact outside controlled settings. Further research is needed to quantify these constraints and their interaction with model capabilities.

Conclusion

This work provides the most comprehensive empirical mapping to date of the determinants and limits of political persuasion by conversational AI. The central finding is that post-training and prompting for information density are the primary levers of AI persuasiveness, far outweighing the effects of model scale or personalization. However, these gains come at a cost to factual accuracy, raising significant concerns for the integrity of public discourse. The results underscore the need for targeted governance and technical interventions to manage the dual-use risks of persuasive AI systems.