Robustness of CAMO against reasoning-optimized LVLMs
Determine the robustness of the Cross-modal Adversarial Multimodal Obfuscation (CAMO) jailbreak attack when applied to large vision-language models explicitly optimized for complex reasoning, such as GPT-o1 and Gemini-2.5, to assess whether these models resist CAMO’s cross-modal obfuscation strategy.
Sponsor
References
First, although CAMO employs multi-step cross-modal reasoning to obfuscate harmful semantics, its robustness against models explicitly optimized for complex reasoning—such as GPT-o1 and Gemini-2.5—remains to be thoroughly evaluated.
— Cross-Modal Obfuscation for Jailbreak Attacks on Large Vision-Language Models
(2506.16760 - Jiang et al., 20 Jun 2025) in Section: Limitation and Future Work