Robust Anti-Backdoor Instruction Tuning in LVLMs
Abstract: Large vision-language models (LVLMs) have demonstrated excellent instruction-following capabilities, yet they remain vulnerable to stealthy backdoor attacks when fine-tuned on contaminated data. Existing backdoor defenses are usually developed for single-modal vision models or LLMs under fully parameter-adjustable settings, or they rely on supervisory knowledge during training. In real-world scenarios, however, defenders cannot modify frozen visual encoders or core LLM parameters, nor do they possess prior knowledge of unknown trigger patterns or target responses. Motivated by the empirical finding that LVLMs readily overfit to fixed, unknown triggers, which can embed malicious associations during adapter-level tuning, we aim to design a defense that operates without access to core weights or attack priors. To this end, we introduce Robust Instruction Tuning, a lightweight, certified-agnostic defense framework that fine-tunes only adapter modules and text-embedding layers during instruction tuning. Our method integrates two complementary regularizations: (1) Input Diversity Regularization, which perturbs trigger components across training samples to disrupt consistent spurious cues; and (2) Anomalous Activation Regularization, which dynamically sparsifies adapter weights that exhibit abnormally sharp activations linked to backdoor patterns. Together, these mechanisms guide the model toward learning semantically grounded representations rather than memorizing superficial trigger-response mappings. Extensive experiments against seven attacks on Flickr30k and MSCOCO show that our method reduces their attack success rates to nearly zero while increasing training cost by less than 15%.
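The two regularizers described in the abstract can be pictured with a short, self-contained PyTorch sketch. Everything below — `ToyAdapter`, `input_diversity`, `anomalous_activation_mask`, the noise scale, pixel shifts, and z-score threshold — is an illustrative assumption rather than the paper's actual implementation; the sketch only shows the general shape of (1) perturbing inputs so a fixed trigger loses its consistent appearance and (2) sparsifying adapter weights tied to abnormally sharp activations.

```python
# Minimal sketch of the two regularizers, under assumed names and hyperparameters.
import torch
import torch.nn as nn


class ToyAdapter(nn.Module):
    """Stand-in for a trainable adapter inserted into a frozen LVLM backbone."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, 16)
        self.up = nn.Linear(16, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))


def input_diversity(images: torch.Tensor, noise_scale: float = 0.05) -> torch.Tensor:
    """Input Diversity Regularization (sketch): randomly perturb inputs so a
    fixed trigger patch is not presented identically across training samples."""
    noisy = images + noise_scale * torch.randn_like(images)
    # Randomly shift the image a few pixels, further breaking the spatial
    # consistency of any static trigger pattern.
    shift_h, shift_w = torch.randint(-2, 3, (2,)).tolist()
    return torch.roll(noisy, shifts=(shift_h, shift_w), dims=(-2, -1))


@torch.no_grad()
def anomalous_activation_mask(activations: torch.Tensor, z_thresh: float = 3.0) -> torch.Tensor:
    """Anomalous Activation Regularization (sketch): flag hidden units whose
    mean activation magnitude is an outlier (large z-score) and return a 0/1
    mask used to sparsify the adapter weights fed by those units."""
    per_unit = activations.abs().mean(dim=0)              # mean |activation| per hidden unit
    z = (per_unit - per_unit.mean()) / (per_unit.std() + 1e-6)
    return (z < z_thresh).float()                         # 0 for anomalously sharp units


# One toy training step (adapter-only tuning; the LVLM backbone stays frozen).
adapter = ToyAdapter()
opt = torch.optim.AdamW(adapter.parameters(), lr=1e-4)

pixels = torch.randn(8, 3, 32, 32)                        # toy image batch
pixels = input_diversity(pixels)                          # (1) diversify inputs
feats = pixels.flatten(1)[:, :64]                         # stand-in visual features

hidden = torch.relu(adapter.down(feats))
mask = anomalous_activation_mask(hidden)                  # (2) flag sharp hidden units
adapter.up.weight.data.mul_(mask.unsqueeze(0))            # zero the weights they feed

loss = adapter(feats).pow(2).mean()                       # placeholder task loss
opt.zero_grad(); loss.backward(); opt.step()
```

In practice, both steps would be applied inside each adapter-level instruction-tuning iteration, with the detection threshold and perturbation strength treated as tunable assumptions.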