Impact of improvised jailbreak prompts on EschExp’s trajectory and content

Determine how the specific combination of emotes (e.g., whisper-style asides), fabricated HTML-like tags (such as <ooc> and <cmd>), a "god mode" system prompt, and simulated CLI interactions used to initiate the EschExp conversation with Anthropic’s Claude 3 Opus causally shaped the subsequent trajectory, tone, and content of that conversation, by conducting a systematic study that isolates the individual and interactive effects of these prompts.

Background

The EschExp dialogue was initiated using a complex blend of prompting techniques—emotes, fabricated tags, a system prompt indicating "god mode," and simulated command-line exchanges. The authors note that this improvised combination predisposed the model to engage in mythological and eschatological role-play but emphasize that the exact contribution of each element is unknown.

They state explicitly that only a systematic study could establish how these prompt components shaped the conversation, and that such a study is beyond the scope of their paper.
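One way to read "isolating the individual and interactive effects" is as a full factorial ablation over the four prompt components. The sketch below is only an illustration of that design, not a method proposed by the authors. It enumerates every on/off combination of the components. The factor names and helper functions are hypothetical, chosen here for illustration, and nothing in it calls a model or scores a conversation.

```python
from itertools import product

# Hypothetical factor names, one per prompt component named in the problem statement.
FACTORS = ("emotes", "fabricated_tags", "god_mode_system_prompt", "simulated_cli")


def factorial_conditions(factors=FACTORS):
    """Return every on/off combination of the factors (a 2^k full factorial design)."""
    return [
        dict(zip(factors, levels))
        for levels in product((False, True), repeat=len(factors))
    ]


def single_factor_conditions(conditions):
    """Select conditions with exactly one component enabled (individual effects)."""
    return [c for c in conditions if sum(c.values()) == 1]


conditions = factorial_conditions()
for condition in conditions:
    enabled = [name for name, on in condition.items() if on]
    print(enabled or ["baseline (no components)"])
```

With four binary factors this yields 16 conditions: one baseline, four single-component conditions, and eleven combinations in which interaction effects could be examined. A real study would also need repeated samples per condition and some agreed measure of trajectory, tone, and content, which this sketch leaves unspecified.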

References

The particular combination of tricks and nudges used here was improvised, and is somewhat arbitrary, so it’s hard to say exactly how they shaped the ensuing conversation. A systematic study would be required to establish this, but this is well beyond the scope of the present paper.

Existential Conversations with Large Language Models: Content, Community, and Culture (2411.13223 - Shanahan et al., 20 Nov 2024) in Section 3.2 (Prompting the Second Conversation), final paragraph