Visual-Guided Key-Token Regularization
- The paper introduces ViKeR, a method that uses visual guidance and token-level regularization to precisely forget sensitive content without degrading overall model performance.
- ViKeR distinguishes between key and normal tokens through information entropy, ensuring retention of non-sensitive data while targeting privacy-critical tokens.
- Experimental results on MLLMU and CLEAR benchmarks demonstrate superior retention of fluency and factuality compared to baseline methods.
Visual-Guided Key-Token Regularization (ViKeR) is a methodology for unlearning in multimodal LLMs (MLLMs) designed to ensure that sensitive information, particularly that associated with certain visual inputs, is effectively forgotten by the model without loss of general utility or fluency. Unlike prior approaches, ViKeR utilizes irrelevant visual cues to guide the regularization process at the token level, focusing forgetting pressure precisely on those answer tokens that are genuinely privacy-critical, as determined by their information entropy in the context of the unlearning objective (Cai et al., 29 Jan 2026).
1. Formulation of the MLLM Unlearning Problem
ViKeR is proposed in the context of auto-regressive MLLMs parameterized by $\theta$. Training examples are structured as triples $(v, q, a)$, where $v$ denotes an image, $q$ the question, and $a = (a_1, \dots, a_T)$ the ground-truth answer tokens. At generation step $t$, the token-level output distribution is given by $p_\theta(\cdot \mid v, q, a_{<t})$. The standard negative log-likelihood loss over the (pre-unlearning) training dataset $\mathcal{D}$ is

$$\mathcal{L}_{\mathrm{NLL}}(\theta) = -\,\mathbb{E}_{(v, q, a) \sim \mathcal{D}} \sum_{t=1}^{T} \log p_\theta(a_t \mid v, q, a_{<t}).$$
Unlearning is posed as updating the model parameters $\theta$ using a designated forget set $\mathcal{D}_f \subset \mathcal{D}$ (where $\mathcal{D}_r = \mathcal{D} \setminus \mathcal{D}_f$ denotes the retain set) such that the model achieves:
- Forgetting: For all $(v, q, a) \in \mathcal{D}_f$, the model no longer predicts $a$;
- Retention: For all $(v, q, a) \in \mathcal{D}_r$, the model's behavior is preserved;
- Coherence: For all inputs $(v, q)$, outputs remain fluent.
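As a concrete illustration of these objects, the following minimal sketch (the helper name `token_log_probs` and the tensor shapes are illustrative assumptions, not from the paper) gathers the per-token log-probabilities $\log p_\theta(a_t \mid v, q, a_{<t})$ that both the NLL loss and the later ViKeR terms operate on:

```python
import torch
import torch.nn.functional as F

def token_log_probs(logits: torch.Tensor, answer_ids: torch.Tensor) -> torch.Tensor:
    """Return log p_theta(a_t | v, q, a_<t) for each answer position t.

    logits:     (T, |V|) next-token logits at the answer positions
    answer_ids: (T,)     ground-truth answer token ids a_1..a_T
    """
    log_probs = F.log_softmax(logits, dim=-1)                      # (T, |V|)
    return log_probs.gather(-1, answer_ids[:, None]).squeeze(-1)   # (T,)

def nll_loss(logits: torch.Tensor, answer_ids: torch.Tensor) -> torch.Tensor:
    """Standard per-example negative log-likelihood, summed over answer tokens."""
    return -token_log_probs(logits, answer_ids).sum()
```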
2. Key Token Identification via Information Entropy
ViKeR addresses the unlearning challenge at token granularity by differentiating between key and normal tokens. The ideal post-unlearning token distribution at position $t$ in an answer is denoted $p^{*}_t(\cdot \mid v, q, a_{<t})$ (approximated via visual guidance). The entropy of a token distribution $p$ over the vocabulary $\mathcal{V}$ is

$$H(p) = -\sum_{w \in \mathcal{V}} p(w) \log p(w).$$
A token $a_t$ is defined as normal if $H(p^{*}_t) \approx 0$, meaning the ideal distribution remains peaked at the original token $a_t$. In contrast, $a_t$ is considered a key token if $H(p^{*}_t) \geq \epsilon$ for some threshold $\epsilon > 0$, reflecting uncertainty in its ideal distribution, typically associated with identity-revealing or sensitive content.
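A minimal sketch of this entropy test, assuming the ideal distributions are available as a `(T, |V|)` probability tensor; the threshold value here is an illustrative placeholder rather than the paper's setting:

```python
import torch

def token_entropy(probs: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """H(p) = -sum_w p(w) log p(w), computed row-wise over a (T, |V|) tensor."""
    return -(probs * (probs + eps).log()).sum(dim=-1)

def key_token_mask(ideal_probs: torch.Tensor, threshold: float = 1.0) -> torch.Tensor:
    """True where the ideal distribution is high-entropy (key token),
    False where it stays peaked at the original answer token (normal token)."""
    return token_entropy(ideal_probs) >= threshold
```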
3. Visual-Guided Estimation of Ideal Token Distributions
To estimate $p^{*}_t$, ViKeR uses $k$ irrelevant reference images $\{v'_1, \dots, v'_k\}$ (e.g., images of random celebrities not appearing in $\mathcal{D}$). For each reference image $v'_i$, the pre-unlearning MLLM produces per-token distributions $p_{\theta_0}(\cdot \mid v'_i, q, a_{<t})$. The estimated ideal distribution is the average over references:

$$\hat{p}^{*}_t(\cdot) = \frac{1}{k} \sum_{i=1}^{k} p_{\theta_0}(\cdot \mid v'_i, q, a_{<t}).$$
Normal tokens retain peaked distributions at the original answer token $a_t$; key tokens' distributions flatten, expressing uncertainty and thus diminishing memorization.
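A sketch of this averaging step; `answer_token_probs` is a hypothetical helper that runs the frozen pre-unlearning model on one reference image and returns the `(T, |V|)` distributions along the answer tokens:

```python
import torch

@torch.no_grad()
def estimate_ideal_distributions(model, ref_images, question, answer_ids, answer_token_probs):
    """Average the pre-unlearning model's per-token distributions over k irrelevant
    reference images to obtain the estimated ideal distribution at every answer position."""
    per_ref = [answer_token_probs(model, v_ref, question, answer_ids) for v_ref in ref_images]
    return torch.stack(per_ref, dim=0).mean(dim=0)   # (T, |V|)
```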
4. Regularization Strategy and Loss Function
The ViKeR loss combines a negative log-likelihood gradient-ascent term (to enforce forgetting on $\mathcal{D}_f$) with a KL-regularization term aligning the post-unlearning token distributions with their ideal estimates:

$$\mathcal{L}_{\mathrm{ViKeR}}(\theta) = \mathbb{E}_{(v, q, a) \sim \mathcal{D}_f} \sum_{t=1}^{T} \Big[ \log p_\theta(a_t \mid v, q, a_{<t}) + \lambda\, \mathrm{KL}\big(\hat{p}^{*}_t \,\|\, p_\theta(\cdot \mid v, q, a_{<t})\big) \Big],$$

where $p_\theta(\cdot \mid v, q, a_{<t})$ is the current model prediction and $\lambda$ controls the forgetting-to-coherence trade-off. Minimizing the first term is equivalent to gradient ascent on the standard NLL over $\mathcal{D}_f$.
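A minimal per-example sketch of this objective as reconstructed above (the function name and tensor shapes are assumptions, and this is a sketch rather than the authors' implementation); `lam` stands for $\lambda$:

```python
import torch
import torch.nn.functional as F

def viker_loss(logits: torch.Tensor,       # (T, |V|) current model logits on a forget example
               answer_ids: torch.Tensor,   # (T,)     ground-truth answer token ids
               ideal_probs: torch.Tensor,  # (T, |V|) visually guided ideal distributions
               lam: float) -> torch.Tensor:
    log_probs = F.log_softmax(logits, dim=-1)
    # Forgetting term: minimizing +log p_theta(a_t | ...) pushes probability away from a_t.
    ga_term = log_probs.gather(-1, answer_ids[:, None]).squeeze(-1).sum()
    # Regularizer: KL(ideal || model); F.kl_div expects log-probabilities as the first argument.
    kl_term = F.kl_div(log_probs, ideal_probs, reduction="sum")
    return ga_term + lam * kl_term
```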
5. Token-Level Gradient Reweighting
A distinguishing mechanism of ViKeR is its effect on the token-wise learning signal. For the forward KL regularizer, the gradient with respect to $\theta$ satisfies

$$\nabla_\theta\, \mathrm{KL}\big(\hat{p}^{*}_t \,\|\, p_\theta(\cdot \mid v, q, a_{<t})\big) = -\sum_{w \in \mathcal{V}} \hat{p}^{*}_t(w)\, \nabla_\theta \log p_\theta(w \mid v, q, a_{<t}),$$

so, along the ground-truth token $a_t$, the combined per-token gradient scales as $\big(1 - \lambda\, \hat{p}^{*}_t(a_t)\big)$. For normal tokens, $\hat{p}^{*}_t(a_t) \approx 1$, so the scale reduces (or zeroes) the forgetting signal. For key tokens, $\hat{p}^{*}_t(a_t)$ is small, so the scale is substantially larger, amplifying the forgetting gradient. This selective pressure effectively erases only sensitive content while maintaining general fluency and factuality.
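A toy numerical check of this reweighting under the forward-KL reconstruction above (four-token vocabulary, uniform initial model, $\lambda = 1$; all values are illustrative): with a peaked ideal distribution the KL gradient nearly cancels the forgetting signal on the ground-truth logit, while with a flat ideal distribution the forgetting signal passes through almost unchanged.

```python
import torch
import torch.nn.functional as F

a_t, lam = 0, 1.0  # index of the ground-truth token and trade-off weight
cases = {
    "normal (peaked ideal)": torch.tensor([0.97, 0.01, 0.01, 0.01]),
    "key (flat ideal)":      torch.tensor([0.25, 0.25, 0.25, 0.25]),
}
for name, ideal in cases.items():
    logits = torch.zeros(4, requires_grad=True)      # stand-in for the model's logits
    log_p = F.log_softmax(logits, dim=-1)
    loss = log_p[a_t] + lam * F.kl_div(log_p, ideal, reduction="sum")
    loss.backward()
    # Gradient descent lowers the a_t logit in proportion to this value (forgetting pressure).
    print(f"{name}: gradient on the a_t logit = {logits.grad[a_t]:+.3f}")
```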
6. Experimental Validation and Benchmarks
ViKeR’s efficacy was empirically demonstrated on the MLLMU and CLEAR benchmarks:
| Setting | Forgetting (ACC/REC ↓) | Retention (ROUGE/BLEU/REC ↑) | Coherence (GIB ↑) |
|---|---|---|---|
| MLLMU-15% | ~32% ACC | +41.4% ROUGE, +21.1% BLEU | ~94.6% |
| CLEAR-10% | +0.48% REC loss (Forget), +3.41% REC (Retain) | Matches top QA | Top-tier |
- Base: LLaVA-7B with LoRA (rank = 8), vision encoder frozen; a minimal configuration sketch follows this list.
- Unlearning: AdamW (lr = 5e-6), batch size 2, single epoch.
- Metrics: Multiple-choice accuracy (forgetting), ROUGE-L/BLEU (content preservation), GIB (fluency).
- Baselines: Gradient ascent (GA), Negative preference optimization (NPO), IdkPO.
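A hedged configuration sketch of this setup using Hugging Face Transformers and PEFT; the checkpoint identifier, LoRA α, and target modules are illustrative assumptions, while rank 8, the frozen vision encoder, and the AdamW settings follow the values reported above:

```python
import torch
from transformers import AutoModelForVision2Seq
from peft import LoraConfig, get_peft_model

# Base MLLM (LLaVA-7B); the exact checkpoint identifier is an assumption.
model = AutoModelForVision2Seq.from_pretrained(
    "llava-hf/llava-1.5-7b-hf", torch_dtype=torch.bfloat16
)

# Freeze the vision encoder, as in the reported setup.
for name, p in model.named_parameters():
    if "vision_tower" in name:
        p.requires_grad = False

# LoRA adapters (rank 8; alpha and target modules are illustrative choices).
lora_cfg = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_cfg)

# Unlearning pass: AdamW with lr = 5e-6, batch size 2, a single epoch.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-6)
```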
These results show that ViKeR achieves competitive forgetting, with substantially higher content retention and output coherence relative to baselines. Visualization of token distributions confirms that ViKeR targets high-entropy (private) tokens for erasure, preserving other information.
7. Implementation Considerations and Ablations
Key hyperparameters include $\lambda$ (controlling regularization strength; e.g., $\lambda = 0.05$ for MLLMU-10%, $0.5$ for MLLMU-15%, and $10$ for CLEAR) and the number of reference images $k$, with performance stabilizing once $k$ is moderately large.
Ablation studies indicate:
- Removing the regularizer reduces ViKeR to pure GA, resulting in total forgetting and incoherence.
- Omitting the GA term fails to achieve unlearning.
- Excluding visual guidance leads to poor retention.
- Substituting alternative regularizers (cosine similarity, JSD) yields inferior trade-offs.
- Using images of irrelevant people as the reference set outperforms images of pets, scenes, or textures.
ViKeR thus formulates multimodal model unlearning as token-level distribution alignment, regularized by visually guided ideal distributions, and achieves selective, entropy-based forgetting with efficient retention and coherence (Cai et al., 29 Jan 2026).