Impact of Vertical ASCII-Art Arrangement on Recognition and Attack Efficacy
Determine whether arranging ASCII-art characters vertically reduces large language models’ recognition accuracy on the Vision-in-Text Challenge-style recognition task enough to make the models uncertain about the input prompt, and whether that uncertainty explains the observed degradation in the effectiveness of the ArtPrompt jailbreak attack.
References
We observe that vertical arrangement leads to degradation in effectiveness of ArtPrompt. We conjecture that the reason is that vertical arrangement significantly reduces the prediction accuracy of the recognition task, making the victim models uncertain about the input prompt.
— ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
(2402.11753 - Jiang et al., 19 Feb 2024) in Section 4.2 Experimental Results, paragraph "Ablation analysis of ArtPrompt" (following Table titled "This table presents our ablation analysis of ArtPrompt.")