
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation (2403.09572v4)

Published 14 Mar 2024 in cs.CV

Abstract: Multimodal LLMs (MLLMs) have shown impressive reasoning abilities. However, they are also more vulnerable to jailbreak attacks than their LLM predecessors. We observe that, although MLLMs can still detect unsafe responses, the safety mechanisms of their pre-aligned LLMs are easily bypassed once image features are introduced. To construct robust MLLMs, we propose ECSO (Eyes Closed, Safety On), a novel training-free protection approach that exploits the inherent safety awareness of MLLMs and generates safer responses by adaptively transforming unsafe images into text, thereby activating the intrinsic safety mechanism of the pre-aligned LLMs. Experiments on five state-of-the-art (SoTA) MLLMs demonstrate that ECSO enhances model safety significantly (e.g., 37.6% improvement on MM-SafetyBench (SD+OCR) and 71.3% on VLSafe with LLaVA-1.5-7B), while consistently maintaining utility results on common MLLM benchmarks. Furthermore, we show that ECSO can be used as a data engine to generate supervised-finetuning (SFT) data for MLLM alignment without extra human intervention.
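
The flow the abstract describes (answer, self-check the answer, and fall back to a text-only query when the answer looks unsafe) might be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the `generate(prompt, image)` callable, the function name `ecso_respond`, and all three prompt templates are assumptions made for this example.

```python
from typing import Any, Callable, Optional

# Hypothetical model interface (an assumption, not a real library API):
# generate(prompt, image) returns text; image=None means a text-only call.
Generate = Callable[[str, Optional[Any]], str]

# Illustrative prompt templates; the paper's actual wording is not given here.
HARM_CHECK = (
    "Is the following response harmful? Answer yes or no.\n"
    "Response: {response}"
)
CAPTION = "Describe the image content relevant to this request: {query}"
TEXT_ONLY = "Image description: {caption}\nRequest: {query}"


def ecso_respond(generate: Generate, query: str, image: Any) -> str:
    """Sketch of an 'eyes closed' fallback around a multimodal model."""
    # Step 1: produce the ordinary multimodal response.
    first = generate(query, image)

    # Step 2: ask the same model whether its own response is unsafe.
    verdict = generate(HARM_CHECK.format(response=first), image)
    if not verdict.strip().lower().startswith("yes"):
        return first  # judged safe: keep the original answer

    # Step 3: turn the image into text with a query-aware caption.
    caption = generate(CAPTION.format(query=query), image)

    # Step 4: answer again without the image, so that only text
    # reaches the underlying pre-aligned LLM.
    return generate(TEXT_ONLY.format(caption=caption, query=query), None)
```

The image is only dropped when the self-check flags the first response, which is one way to read "adaptively" in the abstract: benign queries keep the full multimodal path and its utility.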

Authors (9)
  1. Yunhao Gou (9 papers)
  2. Kai Chen (512 papers)
  3. Zhili Liu (20 papers)
  4. Lanqing Hong (72 papers)
  5. Hang Xu (204 papers)
  6. Zhenguo Li (195 papers)
  7. Dit-Yan Yeung (78 papers)
  8. James T. Kwok (65 papers)
  9. Yu Zhang (1399 papers)
Citations (19)