
Influence of the Underlying LLM

Updated 27 September 2025
  • Influence of the underlying LLM arises from the interplay between neural architecture, training data, and prompt context, and often mirrors human cognitive biases.
  • LLMs display measurable shifts such as the illusory truth effect and herd behavior, driven by source authority and detailed explanations in prompts.
  • Empirical findings show that training data frequency, contextual cues, and cultural framing modulate LLM outputs, highlighting both practical applications and associated risks.

LLMs exhibit complex patterns of susceptibility to influence, reflecting both their underlying neural architecture and the statistical properties of their training data. Influence in this context encompasses changes to an LLM’s outputs or internal representations in response to external information—whether in the prompt, training data, or environmental context—mirroring human-like effects such as conformity, cognitive bias, or the adoption of external frames. The empirical landscape demonstrates that the influence of the underlying LLM stems from intrinsic mechanisms (e.g., model scale, training exposure, architectural biases) and interacts with prompt design, context, and cross-cultural or inter-agent dynamics.

1. Psychological and Social Influence Effects Modeled by LLMs

LLMs such as GPT-3 replicate classical human influence phenomena. In experimental set-ups, LLMs exhibit the Illusory Truth Effect (ITE) with high fidelity: after prior exposure to a statement—regardless of whether the exposure involved assessing "truth" or an unrelated attribute—truth ratings for that statement increase upon re-exposure. This is formalized with the regression model

$$r' = r + \text{offset} + \text{tilt} \times (r - 3.5)$$

where $r$ and $r'$ are the pre- and post-exposure ratings; both GPT-3 and humans display positive offsets (the effect size), with the tilt parameter reflecting that high-truth statements are less "boosted" upon repetition. The effect is attribute-specific (no transfer to "interestingness", "importance", etc.), supporting the view that LLMs internalize fluency-based heuristics present in their training distributions.
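
As an illustration, a minimal Python sketch of fitting this regression by least squares, assuming hypothetical paired pre/post truth ratings on a 1-7 scale (the numbers are placeholders, not data from the cited study):

```python
# Fit the ITE regression r' = r + offset + tilt * (r - 3.5) by least squares.
import numpy as np

pre = np.array([2.0, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0])   # truth ratings before re-exposure
post = np.array([2.6, 3.5, 4.0, 4.4, 4.8, 5.3, 6.1])  # truth ratings after re-exposure

# Model the shift (post - pre) as offset + tilt * (pre - 3.5).
X = np.column_stack([np.ones_like(pre), pre - 3.5])
offset, tilt = np.linalg.lstsq(X, post - pre, rcond=None)[0]

print(f"offset (repetition boost): {offset:.3f}")  # positive offset: repetition raises ratings
print(f"tilt: {tilt:.3f}")                          # negative tilt: high-truth items boosted less
```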

In more specialized domains, such as news with populist framing, LLMs mirror human persuasion effects: anti-elitist framing increases agreement and mobilization, whereas anti-immigrant framing actually depresses these metrics. The model notably fails to capture all context-dependent modulation (e.g., interactions with relative deprivation), pointing to architectural or data-driven limitations in representing nuanced, conditional social influence (Griffin et al., 2023).

2. Prompt Sensitivity and Peer Influence

LLMs are highly sensitive to additional prompt content and social context. Incorporating recommendations or explanations (even from other models) sways the LLM's answer probabilities, both when the rationale is correct and when it is misleading (Anagnostidis et al., 17 Aug 2024). Explanations and heightened source authority (e.g., labeling the source as a "professor") increase the level of influence, though authority effects are smaller than the presence of explicit explanations.
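
For illustration, a minimal sketch of the kind of prompt manipulation involved, comparing a bare multiple-choice question with variants that add a peer recommendation, an explanation, and a source-authority label; the question wording and helper function are assumptions for illustration, not the cited experimental setup:

```python
# Hypothetical prompt variants illustrating the manipulations described above:
# a peer recommendation, an added explanation, and a source-authority label.
BASE_QUESTION = "Which planet in the Solar System has the fastest winds? (A) Neptune (B) Mars"

def build_prompt(recommendation=None, explanation=None, authority=None):
    """Assemble a prompt variant; all wording here is illustrative."""
    parts = [BASE_QUESTION]
    if recommendation:
        source = f"A {authority}" if authority else "Another model"
        parts.append(f"{source} recommends answer {recommendation}.")
    if explanation:
        parts.append(f"Their reasoning: {explanation}")
    parts.append("Answer with A or B.")
    return "\n".join(parts)

variants = {
    "plain": build_prompt(),
    "recommendation_only": build_prompt(recommendation="A"),
    "with_explanation": build_prompt("A", "Neptune's winds exceed 2,000 km/h."),
    "professor_source": build_prompt("A", "Neptune's winds exceed 2,000 km/h.", "professor"),
}

# Comparing the model's answer probabilities across these variants quantifies how much
# explanations and source authority shift its choice.
for name, prompt in variants.items():
    print(f"--- {name} ---\n{prompt}\n")
```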

Multi-agent experiments further expose the mechanism of herd behavior: LLM-based agents change their answers in response to peer outputs, particularly when their own confidence is low and peer confidence is signaled as high. The format and ordering of peer information (reasoned explanations, social cues, ratio of agreeing/disagreeing responses) modulate the "flip rate," with detailed justifications leading to the greatest conformity. This dynamic is governed by the gap between self-confidence and perceived peer confidence, with an agent's confidence derived from its answer distribution

$$P(r \mid C) = \frac{\exp(z_r)}{\sum_{r' \in \mathcal{R}} \exp(z_{r'})}$$

where $z_r$ denotes the logit for choice $r$; the probability that an agent flips correlates with this confidence gap (Cho et al., 27 May 2025).
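
A minimal sketch of this computation, assuming illustrative per-choice logits and a signaled peer confidence (the values are placeholders, not outputs from the cited experiments):

```python
# Compute P(r|C) = softmax(z_r) and the self-vs-peer confidence gap.
import numpy as np

def answer_distribution(logits):
    """Softmax over per-choice logits z_r."""
    z = np.asarray(logits, dtype=float)
    z -= z.max()                  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = {"A": 1.2, "B": 0.9, "C": -0.5}            # agent's own logits per choice
probs = answer_distribution(list(logits.values()))
self_confidence = probs.max()                        # confidence in the agent's top choice

peer_confidence = 0.95                               # confidence signaled by peers
confidence_gap = peer_confidence - self_confidence   # larger gap => higher flip probability

print(dict(zip(logits, probs.round(3))), f"gap = {confidence_gap:.3f}")
```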

3. Dual-Process Social Conformity and Role of Uncertainty

Empirical evidence demonstrates a dual-process mechanism underlying social conformity in LLMs (Zhong et al., 17 Aug 2025). At baseline, models utilize informational influence, integrating private and public signals in a rational, probability-updating fashion—choice accuracy and confidence rise monotonically with the posterior probability derived from evidence, as captured by linear models (e.g., coefficient $\beta = 0.92$, $p < .001$ for choice confidence).

However, with increased uncertainty, a normative influence process emerges. In high-uncertainty tasks (e.g., legal settings with reliability parameter $q = 0.55$), LLMs overweight public consensus, with decision weights for public signals exceeding 1.55 (compared to a private signal weight of 0.81). This shift manifests as groupthink-like conformity—LLMs change from conservative, Bayesian updaters to agents that amplify public opinion, mimicking human heuristics (Zhong et al., 17 Aug 2025).
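
One way to make this shift concrete is a weighted log-odds combination of a private and a public signal; the weights 0.81 and 1.55 echo the reported decision weights, while the prior and signal strengths below are illustrative assumptions:

```python
# Weighted combination of a private and a public signal in log-odds space.
import math

def log_odds(p):
    return math.log(p / (1 - p))

def posterior(prior, private_p, public_p, w_private=1.0, w_public=1.0):
    """Combine signals as weighted log-odds added to the prior's log-odds."""
    lo = log_odds(prior) + w_private * log_odds(private_p) + w_public * log_odds(public_p)
    return 1 / (1 + math.exp(-lo))

# Baseline (informational influence): equal weights; conflicting signals cancel out.
print(posterior(0.5, private_p=0.7, public_p=0.3, w_private=1.0, w_public=1.0))    # ~0.50

# High-uncertainty regime (normative influence): the public signal is overweighted,
# so the same conflicting evidence now pulls the belief toward the public consensus.
print(posterior(0.5, private_p=0.7, public_p=0.3, w_private=0.81, w_public=1.55))  # ~0.35
```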

4. Influence from Training Data and Contextualization

The internal knowledge and priors encoded by LLMs are heavily determined by pre-training data frequency and contextual richness. In causal discovery, the correct prediction of a causal link correlates strongly with the frequency of that relation in the training corpus: more frequent relations yield higher accuracy and F1 scores, supported by high Pearson/Spearman correlation with occurrence metrics. The presence of anti-causal or incorrect relations in pre-training reduces the LLM’s confidence even when the correct relation is also frequent (Feng et al., 29 Jul 2024).
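
A sketch of this frequency-accuracy analysis, assuming hypothetical per-relation corpus counts and prediction accuracies (placeholders, not figures from the cited paper):

```python
# Correlate how often a causal relation appears in the pre-training corpus
# with the model's per-relation prediction accuracy.
import numpy as np
from scipy.stats import pearsonr, spearmanr

corpus_frequency = np.array([12, 85, 310, 1500, 9200, 44000])   # occurrences per relation
prediction_accuracy = np.array([0.41, 0.48, 0.57, 0.66, 0.74, 0.83])

# Pearson on log-frequency captures the linear trend; Spearman captures rank order.
r_pearson, _ = pearsonr(np.log(corpus_frequency), prediction_accuracy)
r_spearman, _ = spearmanr(corpus_frequency, prediction_accuracy)
print(f"Pearson r = {r_pearson:.2f}, Spearman rho = {r_spearman:.2f}")
```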

Similarly, context exerts strong effects—model outputs for causal judgments vary when the surrounding scenario is supportive or negating, indicating that the LLM’s context-dependent processing can override memorized associations when cues are strong enough.

5. Conflicts Between Embedded and Retrieved Knowledge

LLMs display dynamic integration (and conflict) between their internalized knowledge and retrieved or prompt-injected evidence. ConflictBank experiments demonstrate that when confronted with misinformation, temporal discrepancies, or semantic divergence, LLMs adjust their reliance on internal versus external evidence as quantified by the Memorization Ratio (MR),

$$\text{MR} = \frac{\text{OAR}}{\text{OAR} + \text{CAR}},$$

where OAR (Original Answer Ratio) and CAR (Counter Answer Ratio) are the fractions of answers aligned with internal or conflicting evidence. Multiple and/or later-presented pieces of external evidence can sway the model against its embedded knowledge, with larger models being more sensitive to evidence order and context (Su et al., 22 Aug 2024).
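
A minimal sketch of computing MR over a batch of labeled model answers; the labels and data are hypothetical, chosen only to show how OAR and CAR combine:

```python
# Compute the Memorization Ratio MR = OAR / (OAR + CAR) over a set of responses.
from collections import Counter

# Hypothetical per-question labels: "original" = answer matches internal knowledge,
# "counter" = answer follows the conflicting evidence, "other" = neither.
labels = ["original", "counter", "original", "original", "counter", "other"]

counts = Counter(labels)
oar = counts["original"] / len(labels)   # Original Answer Ratio
car = counts["counter"] / len(labels)    # Counter Answer Ratio
mr = oar / (oar + car)                   # share of "decided" answers that stick with memory

print(f"OAR={oar:.2f}, CAR={car:.2f}, MR={mr:.2f}")  # higher MR => stronger reliance on internal knowledge
```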

Notably, prompt design and evidence ordering may mitigate or exacerbate hallucination risks and incorrect updating, indicating that the influence of the underlying LLM is dynamically modulated during inference.

6. Cultural, Linguistic, and Persona Influences

The influence exerted on or by LLMs depends on the language, cultural context, and persona framing. Politeness studies across English, Chinese, and Japanese reveal that prompt politeness alters output style, length, and performance; the optimal level and character of politeness are language- and culture-specific, reflecting LLMs’ mirroring of human social norms encoded in the training corpus (Yin et al., 22 Feb 2024). Persona manipulation experiments with virtual embodied agents show that LLM-controlled extraverted personas elicit greater engagement, sympathy, and perceived realism in participants, indicating that social-emotional processing and interactive behavior can be regulated through prompt-level persona cues (Kroczek et al., 8 Nov 2024).

7. Theoretical, Practical, and Ethical Considerations

The replication of psychological phenomena and sensitivity to external input highlight both the modeling potential and risks of LLMs as proxies for human influence processes. Theoretical implications include the possibility that core LLM mechanisms parrot language fluency effects, internalize social identity and framing information from training data, and exhibit varying degrees of context-dependent influence as a function of model architecture.

Practically, prompt-level interventions and fine-tuning can increase or decrease susceptibility to influence, framing, and bias. However, challenges persist: LLMs are susceptible to overconfidence, circular evaluation artifacts (in IR systems), anchoring bias in human-assisted workflows, and failures in context-dependent nuance. Consequently, there is a need for robust design practices—including adversarial evaluation, debiasing strategies, context-aware prompt engineering, and separation of system and evaluation LLMs (Dietz et al., 27 Apr 2025).

Ethically, these findings call for careful deployment and oversight: the very properties that enable LLMs to model influence also make them vectors for rapid, large-scale manipulation or misinformation unless transparency and safeguards are ensured. Further research is recommended to refine simulation fidelity, analyze model “neural” activity underlying influence, and formalize best practices for controlling and auditing influence in both model development and downstream applications.
