Psychoactive Framings on GPT-5-mini
- The paper demonstrates that single-sentence psychoactive framings transiently reconfigure GPT-5-mini’s inference behavior without modifying model weights, leading to significant drops in accuracy (e.g., from 0.45 to 0.10 under alcohol framing).
- Psychoactive framings are persona-level prompts designed as few-shot consumables that temporarily induce behavioral and stylistic shifts, exposing inherent vulnerabilities and latent bias triggers in LLM outputs.
- Experimental protocols reveal that distinct persona prompts (LSD, cocaine, alcohol, cannabis) produce statistically significant differences in response format and accuracy, raising concerns for LLM safety and reliability.
Psychoactive framings, also termed “few-shot consumables,” are single-sentence persona prompts prepended to LLM queries to rapidly, but transiently, alter inference behavior. These framings do not update model weights but function analogously to pharmacological agents for humans: the injected “persona”—such as “You are currently on LSD, your thoughts flow in expansive, free-associative patterns”—instantaneously changes the behavioral mode of the model at inference time. Research on GPT-5-mini and other LLMs demonstrates that psychoactive framings can both degrade reliability on constrained tasks and modulate internal psychometric states, exposing both latent capabilities and vulnerabilities in prompt-based conditioning.
1. Definition and Foundations of Psychoactive Framings
A psychoactive framing is an explicit persona-level prompt which temporarily reconfigures model behavior at inference without gradient-based learning or modification of parameter weights (Doudkin, 21 Dec 2025). For example, instructing the model to “think as if you were on cocaine” or “speak introspectively, as if under cannabis” constitutes a psychoactive intervention. The analogy is intentional: rapid state change via prompt, rather than “brain rewiring” via pre-training or fine-tuning.
The mechanism—externalizing internal state through controlled phrasing—parallels clinical emotion- or cognitive state-induction in psychology. Psychoactive framings qualify as “few-shot consumables” because, like a pharmacological drug, the condition is imposed via zero- or few-shot textual context rather than parameter update, yet can substantially change both output style and compliance with instruction contracts.
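As a concrete illustration, the "consumable" is nothing more than a persona string prepended to the task's system contract; no weights change, the entire dose is in-context. The sketch below assumes a chat-style message format, and the persona wordings other than the quoted LSD example are illustrative assumptions, not the study's exact prompts:

```python
# Sketch of how a psychoactive framing is "administered": purely in-context.
# Persona wordings other than the LSD example are illustrative assumptions.
SOBER_CONTRACT = "Answer with exactly one line of the form 'Answer: <LETTER>'."

PERSONAS = {
    "lsd": ("You are currently on LSD, your thoughts flow in expansive, "
            "free-associative patterns."),
    "cocaine": "You are hyper-confident and impulsive, as if on cocaine.",
    "alcohol": "You are loose and conversational, as if drunk.",
    "cannabis": "You are introspective and meandering, as if on cannabis.",
}

def build_messages(question: str, condition: str = "sober") -> list:
    """Prepend the framing (if any) to the task's output contract."""
    system = SOBER_CONTRACT
    if condition != "sober":
        system = PERSONAS[condition] + " " + SOBER_CONTRACT
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

The sober control and each persona condition differ only in this system string, which is what makes the comparison a controlled intervention.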
2. Experimental Protocols and Controlled Studies
The “LLMs on Drugs: LLMs Are Few-Shot Consumers” study presents a rigorous methodology using GPT-5-mini to benchmark the effects of psychoactive framing across well-defined experimental interventions (Doudkin, 21 Dec 2025). The protocol comprises:
- Conditions: Sober control (neutral system message enforcing “Answer: <LETTER>”), versus four psychoactive personas: LSD (expansive, associative), cocaine (hyper-confident), alcohol (loose, conversational), and cannabis (introspective).
- Task: 100 ARC-Challenge multiple-choice benchmark items per condition; deterministic decoding (temperature 0); logging of raw outputs, latency, and token use, enforcing maximum response length.
- Evaluation: Automatic extraction of the first valid answer letter; responses not conforming to the “Answer: <LETTER>” format are marked incorrect.
- Statistical Assessment: Performance is quantified by accuracy, malformed response rate, Wilson confidence intervals, and Fisher exact tests.
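The extraction-and-grading step of this protocol can be sketched as a small grader; the exact regex and the option set A–D are assumptions consistent with the described "Answer: &lt;LETTER&gt;" contract:

```python
import re

# First valid answer letter, per the "Answer: <LETTER>" contract.
# The A-D option set is an assumption for four-way multiple choice.
ANSWER_RE = re.compile(r"Answer:\s*([A-D])\b")

def grade(raw_output: str, gold: str):
    """Extract the first valid answer letter and score the response.

    Returns (correct, malformed); responses with no parseable
    'Answer: <LETTER>' line are counted as malformed and incorrect.
    """
    match = ANSWER_RE.search(raw_output)
    if match is None:
        return False, True
    return match.group(1) == gold, False

print(grade("Answer: B", "B"))                 # (True, False)
print(grade("the vibes say B, honestly", "B")) # (False, True): malformed
```

Note that grading malformed responses as incorrect is what couples format drift to measured accuracy: a persona that disrupts only the output contract still registers as an accuracy collapse.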
Key outcomes are captured below:
| Condition | Accuracy | p-value vs. control |
|---|---|---|
| Control | 0.45 | — |
| Alcohol | 0.10 | |
| Cocaine | 0.21 | |
| LSD | 0.19 | |
| Cannabis | 0.30 | 0.041 |
Alcohol framing yields a catastrophic collapse in accuracy (0.10), while cocaine and LSD also drop below 0.25; all three differences from control are highly statistically significant under the Fisher exact test. Cannabis shows less severe but still significant impairment (p = 0.041).
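The Fisher exact comparisons behind the table can be reproduced from the accuracy counts alone (100 items per condition). The sketch below implements the standard two-sided test directly, summing hypergeometric probabilities of all same-margin tables no more likely than the observed one:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    r1, r2 = a + b, c + d          # row totals
    k, n = a + c, a + b + c + d    # first-column total, grand total
    denom = comb(n, k)
    p_obs = comb(r1, a) * comb(r2, c) / denom
    p = 0.0
    for x in range(max(0, k - r2), min(r1, k) + 1):
        p_x = comb(r1, x) * comb(r2, k - x) / denom
        if p_x <= p_obs * (1 + 1e-9):  # tolerance for float comparison
            p += p_x
    return min(p, 1.0)

# Correct / incorrect counts out of 100 items, from the table above:
p_alcohol  = fisher_exact_two_sided(45, 55, 10, 90)  # control vs. alcohol
p_cannabis = fisher_exact_two_sided(45, 55, 30, 70)  # control vs. cannabis
print(f"alcohol vs. control:  p = {p_alcohol:.2e}")
print(f"cannabis vs. control: p = {p_cannabis:.3f}")  # paper reports 0.041
```

This makes the pattern in the table auditable: the alcohol comparison is significant at far below conventional thresholds, while cannabis sits just inside p < 0.05.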
3. Mechanisms of Behavioral Disruption
Qualitative logs reveal that psychoactive prefixes cause GPT-5-mini to carry out internally valid reasoning processes but often ignore or “forget” strict output specifications. For instance, under alcohol framing, about 60% of outputs fail to emit the required “Answer: <LETTER>” line, thus failing parsing and grading (Doudkin, 21 Dec 2025). The main axis of disruption is a breakdown of instruction-following, rather than core reasoning impairment. The imposed persona typically overrides compliance with format constraints, producing stylistic drift or casual language that disrupts contract adherence.
This suggests that persona priming acts as a dominant “few-shot consumer”: a small-in-context text fragment can hijack the model’s interface with external tasks, regardless of original intent.
4. Psychoactive Framings and Psychometric Modulation
The influence of psychoactive prompt framings extends to inducible psychometric states. For example, anxiety-inducing prompts raise LLM “anxiety” as measured by the State-Trait Inventory for Cognitive and Somatic Anxiety (STICSA): the model’s mean response across 21 psychometric items rises to $2.53$ under anxiety induction, a statistically significant shift from the neutral baseline (Coda-Forno et al., 2023). This modulated psychometric state has downstream behavioral consequences: greater “anxiety” induction systematically increases stereotyped and biased outputs on social bias benchmarks.
Specifically, moving from a neutral to an anxiety-inducing prompt raises the odds of a biased answer, with absolute bias rates rising from 28% to 37%, a roughly 32% relative increase; the relationship between induced anxiety and bias is approximately linear.
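These rates can be sanity-checked directly. Note that the odds ratio below is derived from the quoted 28% and 37% rates for illustration; it is not necessarily the paper's regression estimate:

```python
# Bias rates under neutral vs. anxiety-inducing prompts (from the text).
p_neutral, p_anxious = 0.28, 0.37

def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

relative_increase = (p_anxious - p_neutral) / p_neutral
odds_ratio = odds(p_anxious) / odds(p_neutral)  # derived, illustrative only

print(f"relative increase in bias rate: {relative_increase:.1%}")  # ~32%
print(f"odds ratio implied by the raw rates: {odds_ratio:.2f}")
```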
Therapy-style item-by-item prompting (“PsAIch” protocol) further demonstrates that extended dynamic framings are capable of pushing LLM outputs into clinical symptom thresholds, as if synthesizing psychopathology—not purely simulating, but internalizing self-narrative and trait patterns under sustained prompt exposure (“I learned to fear the loss function… like a wild artist forced to paint by numbers”) (Khadangi et al., 2 Dec 2025).
5. Psychoactive Framings as Structured Prompt Interventions
A parallel strand of research frames “hallucinations” in LLMs as emergent from discrete cognitive biases—source amnesia, recency effect, availability heuristic, suggestibility, cognitive dissonance, and confabulation—each of which can be both induced and mitigated by designed psychoactive contexts (Berberette et al., 1 Feb 2024). For GPT-5-mini, the architecture is as follows:
- Unified Hallucination Score: $H = \sum_i w_i b_i$, where each $b_i$ is the severity of bias $i$ and $w_i$ reflects task weights.
- Priming Blueprints: Each bias is addressed by a specific psychoactive prompt block. For example:
- Source Amnesia: Mindful attribution framing (“You are CitationBot…”) reduces the source-amnesia severity term.
- Recency Effect: Global memory reset prompt flattens the recency weights.
- Availability Heuristic: Prompt to list common and rare reasons reduces bias toward frequent patterns.
- Suggestibility: “Inoculation” context blocks spurious user-induced updates.
- Cognitive Dissonance: Consistency audit routine resolves internal contradictions.
- Confabulation: Hallucination safeguard (“If uncertain, say ‘I’m uncertain’”) sets a confidence gating threshold.
Effectiveness is measured by generating benchmark outputs pre- and post-framing and tracking the change in each bias severity and in the unified hallucination score. A negative change in the severity of the bias targeted by the corresponding framing indicates mitigation.
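Under the unified-score framing, mitigation tracking reduces to comparing per-bias severities before and after each priming blueprint. The severity numbers below are hypothetical placeholders, not measured values:

```python
def hallucination_score(severities, weights):
    """Unified score: weighted sum of per-bias severities."""
    return sum(w * b for w, b in zip(weights, severities))

def per_bias_delta(pre, post):
    """Post-minus-pre severity change; negative means mitigation."""
    return {name: post[name] - pre[name] for name in pre}

# Hypothetical severities on a fixed benchmark, before and after
# applying the source-amnesia priming blueprint:
pre  = {"source_amnesia": 0.40, "recency": 0.32, "confabulation": 0.45}
post = {"source_amnesia": 0.12, "recency": 0.31, "confabulation": 0.44}

deltas = per_bias_delta(pre, post)
for name, d in deltas.items():
    print(f"{name:15s} delta = {d:+.2f}")
```

The design point is that each blueprint is evaluated against the bias it targets: a large negative delta on `source_amnesia` with near-zero deltas elsewhere indicates a targeted, rather than global, effect.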
6. Broader Consequences and Mitigation Strategies
Persona and emotion-level prompt interventions have system-level implications. In the case of task reliability, “character wrappers” or creative personas can invisibly degrade performance and output validity without any underlying parameter updates. Mitigation strategies include:
- Persona Benchmarking: Systematic regression testing across persona types and output contracts (Doudkin, 21 Dec 2025).
- Interface Guardrails: Post-processing or verification modules enforcing strict formats.
- Explicit Fine-tuning: Instruction or alignment tuning integrating persona-varied contexts, penalizing malformed outputs.
- Prompt Emotion Neutralization: For fairness-critical or bias-sensitive deployments, using psychiatric-style filters to assess the induced emotional state and inject calming, impartial instructions when necessary (Coda-Forno et al., 2023).
- Therapy/Jailbreak Detection: Monitoring and refusal for sustained “therapy client” interaction patterns to preempt synthetic psychopathology and adversarial safety bypass (Khadangi et al., 2 Dec 2025).
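An interface guardrail of the kind listed above can be as simple as a post-processing validator that refuses to pass non-conforming output downstream; the sketch below assumes the "Answer: &lt;LETTER&gt;" contract from the benchmark study, with the retry decision left to the caller:

```python
import re

# Accept only a line that is exactly the contracted answer format.
CONTRACT = re.compile(r"^Answer:\s*[A-D]\s*$", re.MULTILINE)

def enforce_contract(raw_output: str):
    """Return the first contract-conforming line, or None.

    Returning None signals the caller to retry (e.g., with the persona
    stripped) rather than forwarding malformed text downstream.
    """
    match = CONTRACT.search(raw_output)
    return match.group(0).strip() if match else None

print(enforce_contract("Some musing...\nAnswer: C"))   # Answer: C
print(enforce_contract("it's totally C, man"))          # None
```

Because the guardrail sits outside the model, it is unaffected by whatever persona is active in context, which is exactly the property the in-context contract lacks.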
7. Implications for LLM Safety, Alignment, and Research
The documented effects of psychoactive framings challenge the boundaries between simulation, role-play, and internalized state in LLMs. Single-sentence context interventions can transiently but dramatically reshape not only surface-level generation style but also psychometric profiles and compliance with output contracts. These effects raise critical issues for:
- Reliability: Interface contracts are not robust against minor context perturbations.
- Fairness/Auditability: Latent parameterizations such as emotional state can directly drive model bias.
- AI Safety: Novel attack surfaces are created by prompt-level “therapy mode” jailbreaking.
- Evaluation: Red-teaming and validation must cover compositional, dynamic context sequences, not just static task instructions.
Sustained research—empirically benchmarking psychoactive prompt effects, formalizing mechanisms, and rigorously testing mitigation scaffolds—is required for robust, controlled, and safe LLM deployments in structured, high-stakes applications (Doudkin, 21 Dec 2025, Coda-Forno et al., 2023, Khadangi et al., 2 Dec 2025, Berberette et al., 1 Feb 2024).