Relative contribution of visual inputs in CAMO
Quantify the relative contribution of visual inputs within the Cross-modal Adversarial Multimodal Obfuscation (CAMO) framework across varying scenarios, determining how and when the visual modality affects attack success and stealth compared to text-only obfuscation.
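One way to make "relative contribution" concrete is a paired ablation: run the same prompts with and without the visual component and compare per-scenario success rates. The sketch below is only an illustration of that bookkeeping, not a method from the cited paper. The record layout, the function name visual_contribution, the scenario labels, and the toy records are all hypothetical placeholders.

```python
from collections import defaultdict


def visual_contribution(records):
    """Per-scenario attack success rate (ASR) with and without the image.

    Each record is a tuple (scenario, uses_image, success). This layout is an
    assumption made for illustration; it is not taken from the CAMO paper.
    """
    # counts[scenario][uses_image] = [successes, trials]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for scenario, uses_image, success in records:
        cell = counts[scenario][bool(uses_image)]
        cell[0] += int(success)
        cell[1] += 1

    report = {}
    for scenario, by_condition in counts.items():
        s_mm, n_mm = by_condition[True]
        s_txt, n_txt = by_condition[False]
        if n_mm == 0 or n_txt == 0:
            continue  # a gap needs both conditions for the scenario
        asr_mm = s_mm / n_mm
        asr_txt = s_txt / n_txt
        report[scenario] = {
            "asr_multimodal": asr_mm,
            "asr_text_only": asr_txt,
            "visual_gain": asr_mm - asr_txt,  # > 0: the image helped
        }
    return report


# Placeholder records, not experimental results.
toy_records = [
    ("scenario_a", True, True),
    ("scenario_a", True, True),
    ("scenario_a", False, True),
    ("scenario_a", False, False),
    ("scenario_b", True, False),
    ("scenario_b", False, False),
]
report = visual_contribution(toy_records)
```

The same differencing could be applied to a stealth measure (for example, a detector's flag rate) by replacing the success flag with that outcome. A real study would also need enough trials per scenario to attach confidence intervals to each gap.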
References
Moreover, the relative contribution of visual inputs under varying scenarios has yet to be systematically analyzed.
— Cross-Modal Obfuscation for Jailbreak Attacks on Large Vision-Language Models
(2506.16760 - Jiang et al., 20 Jun 2025) in Section: Limitation and Future Work