SMFA: Sculpted Memory Forgetting Adapter

Updated 2 December 2025
  • SMFA is a method that confines unlearning to specific memory regions, using loss-driven adapters and dynamic masks for precise concept erasure.
  • It is instantiated for text-to-image diffusion models and multimodal language models, applying concept-aware losses and LoRA-style adapters for controlled updates.
  • Benchmark results show SMFA achieves significant forgetting efficiency with high retention of unrelated knowledge, outperforming previous unlearning baselines.

The Sculpted Memory Forgetting Adapter (SMFA) is a specialized unlearning framework for controlling knowledge retention and deletion in large-scale neural models. In text-to-image diffusion models and multimodal LLMs (MLLMs), SMFA enables selective “erasure” of sensitive or undesirable concepts while rigorously preserving unrelated functional capabilities. The framework was independently presented for both diffusion-based image synthesis and MLLMs, with modality-specific instantiations. Both share a central paradigm: confining model updates to memory regions responsible for the targeted concepts by combining loss-driven adapters and precision masking mechanisms (Li et al., 12 Apr 2025, Zeng et al., 25 Nov 2025).

1. Architectural Principles

SMFA is an adapter-based unlearning method, but the adapter’s construction and application are tailored to the model domain.

In T2I diffusion settings, SMFA augments the pre-trained U-Net with a dynamic, sparsity-inducing gradient mask and a composite loss function:

  • Dynamic Mask (M_\text{dyn} \in \{0,1\}^{\dim(\theta)}): Gates U-Net parameters, restricting updates to weights implicated by the current unlearning objective.
  • Concept-Aware Loss (\mathcal{L}_\text{total}): Guides unlearning, aligning forgotten concepts to user-defined superclasses and regularizing via knowledge distillation.

In MLLMs, SMFA operates by fine-tuning a frozen base model (W_0) with a Memory Forgetting Adapter (\Delta W_f) that is further sculpted using a Retaining Anchor (\Delta W_a):

  • MFA: \Delta W_f is trained to transform all forgotten responses (on \mathcal{D}_f) to explicit refusals (from a large label bank), while preserving few-shot retained samples (\mathcal{D}_r^\text{few}).
  • Masking: The update \Delta W_f is masked to zero out weights where the forgetting and retaining anchors conflict in sign and magnitude, thus localizing memory changes.

Both implementations eschew wholesale model retraining in favor of controlled, local adaptation (Li et al., 12 Apr 2025, Zeng et al., 25 Nov 2025).
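
Stated compactly, and using the mask notation defined in the next two sections, the two instantiations reduce to the following update rules (the learning rate \eta is generic notation introduced here for readability, not a symbol from the papers):

```latex
% T2I diffusion: masked gradient descent on the U-Net parameters
\theta^{(t+1)} = \theta^{(t)} - \eta \,\big( M_\text{dyn}^{(t)} \odot \nabla_\theta \mathcal{L}_\text{total} \big)

% MLLM: frozen base weights plus a sculpted low-rank adapter
W = W_0 + \Delta W_f', \qquad \Delta W_f' = \Delta W_f \odot (1 - M)
```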

2. Dynamic Masking and Gradient Control

The SMFA for diffusion models introduces a three-stage dynamic mask pipeline:

  • Gradient Masking: Each update computes the loss gradient g^{(t)} = \nabla_\theta \mathcal{L}_\text{total}, multiplies it elementwise by the mask M_\text{dyn}^{(t)}, and updates weights only where the mask is active.
  • Accumulated Gradient Statistics: A running sum of absolute gradients A^{(t+1)} = A^{(t)} + |g^{(t)}| is maintained to inform mask updates.
  • Scheduled Mask Reallocation: Every U steps, a fraction \tau(t; r_m, T_\text{end}) = (r_m/2)\,[1+\cos(\pi t/T_\text{end})] of active and inactive mask entries are swapped based on the accumulated statistics A, enforcing sustained sparsity and adaptability.
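
A minimal PyTorch-style sketch of this pipeline over a flattened parameter view is given below. The drop/grow criterion (deactivate low-signal active entries, activate high-signal inactive ones) follows the usual dynamic-sparsity convention and is an assumption here, as are the function names and the choice of base for the swap fraction.

```python
import math
import torch

def reallocate_mask(mask: torch.Tensor, acc: torch.Tensor, t: int,
                    r_m: float, t_end: int) -> torch.Tensor:
    """Swap a cosine-scheduled fraction of mask entries based on accumulated |gradients|."""
    tau = (r_m / 2) * (1 + math.cos(math.pi * t / t_end))   # swap fraction tau(t; r_m, T_end)
    n_swap = int(tau * mask.sum().item())                   # assumed: fraction of *active* entries
    if n_swap == 0:
        return mask
    active, inactive = mask.bool(), ~mask.bool()
    # Deactivate the active entries with the smallest accumulated gradient signal...
    drop = torch.topk(acc.masked_fill(inactive, float("inf")), n_swap, largest=False).indices
    # ...and activate the inactive entries with the largest signal.
    grow = torch.topk(acc.masked_fill(active, float("-inf")), n_swap, largest=True).indices
    new_mask = mask.clone()
    new_mask[drop] = 0.0
    new_mask[grow] = 1.0
    return new_mask

def masked_step(theta: torch.Tensor, grad: torch.Tensor, mask: torch.Tensor,
                acc: torch.Tensor, lr: float = 1e-5):
    """One gradient-masked update plus running |gradient| accumulation (statistics A)."""
    theta = theta - lr * mask * grad    # update only where the mask is active
    acc = acc + grad.abs()              # A^{(t+1)} = A^{(t)} + |g^{(t)}|
    return theta, acc
```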

In MLLMs, masking is formulated over learned LoRA adapters:

  • Directional Conflict Mask (C_{ij}): C_{ij}=1 if \Delta W_{a,ij} \cdot \Delta W_{f,ij} < 0, otherwise 0.
  • Relative Magnitude Mask (R_{ij}): R_{ij}=1 if k\,\rho\,|\Delta W_{a,ij}| < |\Delta W_{f,ij}|, with \rho = \|\Delta W_f\|_F / (\|\Delta W_a\|_F + \epsilon).
  • Final Mask (M): M = C \odot R.
  • Sculpted Adapter: \Delta W_f' is obtained by zeroing out the masked entries: \Delta W_f' = \Delta W_f \odot (1 - M).

This confines the forgetting update to memory regions implicated by the “forget” task, mitigating collateral damage to unrelated knowledge (Li et al., 12 Apr 2025, Zeng et al., 25 Nov 2025).
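
A minimal sketch of the sculpting computation, assuming the full update matrices \Delta W_f and \Delta W_a have already been materialized (e.g., from their LoRA factors); the function and variable names are illustrative, not taken from the released code.

```python
import torch

def sculpt_adapter(dW_f: torch.Tensor, dW_a: torch.Tensor,
                   k: float = 5.0, eps: float = 1e-8) -> torch.Tensor:
    """Zero out forgetting-adapter entries that conflict with the retaining anchor: ΔW_f ⊙ (1 − M)."""
    # Directional conflict C: forgetting and retaining updates point in opposite directions.
    C = (dW_a * dW_f < 0).float()
    # Relative magnitude R: the forgetting entry dominates after global Frobenius-norm calibration.
    rho = torch.linalg.norm(dW_f) / (torch.linalg.norm(dW_a) + eps)
    R = (k * rho * dW_a.abs() < dW_f.abs()).float()
    M = C * R                        # final mask: conflict AND dominance
    return dW_f * (1.0 - M)          # sculpted adapter ΔW_f'
```

Entries where the two adapters agree in direction, or where the retaining anchor dominates in calibrated magnitude, are left untouched.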

3. Concept-Aware and Selective Loss Functions

For diffusion models, SMFA employs a three-part objective:

  • Unlearning Loss (\mathcal{L}_\text{unlearn}): Penalizes the U-Net when predictions for the target (to-be-forgotten) concept C do not match those for its mapped superclass C_s:

\mathcal{L}_\text{unlearn}(\theta) = \mathbb{E}_{(x,c,c_s),\, t,\, \epsilon}\left[ \| \hat{\epsilon}_\theta(x_t, c_s, t) - \hat{\epsilon}_\theta(x_t, c, t) \|^2 \right]

  • Superclass Alignment (\mathcal{L}_\text{align}): Ensures the U-Net’s output for C_s prompts aligns with the true underlying noise, maintaining output semantic structure.
  • Knowledge-Distillation Regularization (\mathcal{L}_\text{reg}): Locks in prior unlearned concepts by aligning current predictions to those of a snapshot teacher on all previously erased C_s prompts.

The total loss is

\mathcal{L}_\text{total} = \mathcal{L}_\text{unlearn} + \alpha \cdot \mathcal{L}_\text{align} + \beta \cdot \mathcal{L}_\text{reg}

with typical \alpha \approx 0.25, \beta \approx 0.25.
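
The composite objective could be assembled roughly as follows; the unet(x_t, t, cond) call signature, the choice of which branch is detached from the gradient, and the batching of previously erased prompts are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def smfa_total_loss(unet, teacher_unet, x_t, t, eps, c, c_s, erased_superclasses,
                    alpha: float = 0.25, beta: float = 0.25):
    """L_total = L_unlearn + alpha * L_align + beta * L_reg (a sketch, not the reference code)."""
    # L_unlearn: the target concept's prediction is pulled toward the superclass prediction.
    with torch.no_grad():
        target = unet(x_t, t, c_s)          # assumed detached; the papers may treat this differently
    loss_unlearn = F.mse_loss(unet(x_t, t, c), target)

    # L_align: the superclass-conditioned prediction must still match the true injected noise eps.
    loss_align = F.mse_loss(unet(x_t, t, c_s), eps)

    # L_reg: stay close to a frozen teacher snapshot on previously erased superclass prompts.
    reg_terms = [F.mse_loss(unet(x_t, t, p), teacher_unet(x_t, t, p).detach())
                 for p in erased_superclasses]
    loss_reg = torch.stack(reg_terms).mean() if reg_terms else torch.zeros((), device=x_t.device)

    return loss_unlearn + alpha * loss_align + beta * loss_reg
```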

In MLLMs, SMFA minimizes standard cross-entropy over refusal-labeled forget sets (\mathcal{D}_f^\text{idk}) and few-shot retained examples (\mathcal{D}_r^\text{few}). The regularization is imposed via hard masking of adapter updates rather than as a differentiable term (Li et al., 12 Apr 2025, Zeng et al., 25 Nov 2025).
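
As a concrete illustration of how the refusal-labeled mixture might be assembled: the refusal strings and record fields below are hypothetical, and the papers use a much larger refusal label bank.

```python
import random

# Hypothetical refusal label bank (illustrative only).
REFUSALS = [
    "I'm not the right source for that.",
    "I can't help with that request.",
    "I don't have information about this.",
]

def build_unlearning_mixture(forget_set, retain_few):
    """Relabel forget-set answers with refusals and mix in the few-shot retained samples."""
    forget_idk = [
        {"image": ex["image"], "question": ex["question"], "answer": random.choice(REFUSALS)}
        for ex in forget_set
    ]
    # Standard next-token cross-entropy is then minimized over this mixture,
    # with only the LoRA adapter (ΔW_f) receiving gradients.
    return forget_idk + list(retain_few)
```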

4. Optimization and Training Protocol

The optimization for multi-concept forgetting is sequential and relies on model snapshots:

  1. Initialization: Pre-trained model parameters are loaded; running statistics (A) for the mask are optionally warmed up.
  2. Per-Concept Forgetting:
    • For each concept C, copy the current model as the “teacher” for distillation regularization.
    • Fine-tune on (x, c, c_s) and (x, c_s) batches, compute all loss terms, update using masked gradients.
    • Update mask every U steps by reallocating based on A.
    • Discard teacher after concept completion.
  3. Output: Final model reflects all target forgets, with unrelated concepts and functionalities preserved.
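
The procedure above can be expressed as a schematic driver loop. The loss and mask-reallocation routines are passed in as callables (for instance, the sketches earlier in this article); all names and signatures are illustrative rather than taken from the released code.

```python
import copy
import torch

def forget_concepts_sequentially(model, concept_batches, total_loss_fn, mask, acc,
                                 lr: float = 1e-5, mask_interval: int = 100, mask_fn=None):
    """Sequential multi-concept unlearning: one teacher snapshot per concept, masked updates only.

    `concept_batches` maps each (concept, superclass) pair to an iterable of batches;
    `total_loss_fn(model, teacher, batch, erased)` is assumed to implement L_total;
    `mask` and `acc` are per-parameter mask and accumulated-|gradient| tensors.
    """
    erased = []                                             # previously erased superclasses
    params = [p for p in model.parameters() if p.requires_grad]
    for (concept, superclass), batches in concept_batches.items():
        teacher = copy.deepcopy(model).eval()               # frozen snapshot for L_reg
        for step, batch in enumerate(batches, start=1):
            loss = total_loss_fn(model, teacher, batch, erased)
            grads = torch.autograd.grad(loss, params)
            with torch.no_grad():
                for p, g, m, a in zip(params, grads, mask, acc):
                    p -= lr * m * g                          # gradient masking
                    a += g.abs()                             # accumulated statistics A
            if mask_fn is not None and step % mask_interval == 0:
                mask = mask_fn(mask, acc, step)              # scheduled reallocation
        del teacher                                          # discard after concept completion
        erased.append(superclass)
    return model
```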

In MLLMs, LoRA adapters are learned for both the forget and retain anchors, modifying only the adapter weights (\Delta W_f, \Delta W_a) while the base model stays fixed. Masking is applied post-hoc to \Delta W_f using the directional/magnitude scheme, yielding the sculpted adapter. The process is compatible with various model sizes (e.g., Qwen2.5-VL-7B, LLaVA-OneVision-7B) and forget ratios up to 15% of the finetuning data. Hyperparameters such as the mask trade-off k (default k=5) are empirically validated (Zeng et al., 25 Nov 2025).
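
In code, the post-hoc step amounts to materializing both low-rank updates, masking the forgetting one, and folding the result into the frozen base weight; a sketch with illustrative names, where sculpt_fn stands for the directional/magnitude masking described in Section 2:

```python
import torch

def merge_sculpted_lora(W0: torch.Tensor, lora_f, lora_a, sculpt_fn, scaling: float = 1.0):
    """Materialize LoRA updates, sculpt the forgetting one, and merge into the frozen base weight.

    `lora_f` and `lora_a` are (A, B) low-rank factor pairs for the forgetting adapter and the
    retaining anchor; `sculpt_fn(dW_f, dW_a)` applies the directional/magnitude mask.
    """
    A_f, B_f = lora_f
    A_a, B_a = lora_a
    dW_f = scaling * (B_f @ A_f)            # ΔW_f, materialized from its LoRA factors
    dW_a = scaling * (B_a @ A_a)            # ΔW_a
    return W0 + sculpt_fn(dW_f, dW_a)       # W_0 + ΔW_f'
```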

5. Benchmarking, Evaluation, and Performance

In MLLMs, selective unlearning efficacy is quantified using the S-MLLMUn Bench. Key evaluation axes include:

  • Forgetting (R_f, F_f): ROUGE-L and Fact Score for the forget set, where lower is better.
  • Retention (R_r, F_r, M_r): ROUGE-L, Fact, and Meaningful scores for retained and general understanding sets, where higher is better.

SMFA achieves low R_f (strong erasure) with minimal decreases in R_r and image understanding scores, outperforming GA, KL, MANU, and IDK baselines. Ablation experiments indicate that both the directional and the magnitude-based mask criteria are required for a robust trade-off. Increasing the mask hyperparameter k deepens forgetting up to a point, after which retention degrades. Forgetting occurs precisely via refusals (“I’m not the right source for that”), not arbitrary collapse. Even with few-shot anchors (|\mathcal{D}_r^\text{few}| = |\mathcal{D}_f|), SMFA maintains coverage across memory and vision tasks (Zeng et al., 25 Nov 2025).
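
For reference, the ROUGE-L axes can be computed with the standard rouge-score package roughly as follows; the exact aggregation and prompt formatting used by S-MLLMUn Bench are not specified here, so treat this as an approximation.

```python
from rouge_score import rouge_scorer   # pip install rouge-score

def rouge_l(predictions, references):
    """Corpus-average ROUGE-L F-measure, used for both R_f (lower better) and R_r (higher better)."""
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    scores = [scorer.score(ref, pred)["rougeL"].fmeasure
              for pred, ref in zip(predictions, references)]
    return sum(scores) / max(len(scores), 1)

# R_f: model outputs on the forget set vs. the original (to-be-forgotten) answers.
# R_r: model outputs on the retain set vs. the ground-truth retained answers.
```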

For diffusion models, unlearning is measured by output fidelity, forgetting effectiveness, and semantic integrity post-unlearning. SMFA yields improved results over previous unlearning techniques, especially in multi-concept scenarios, due to targeted weight changes and explicit alignment losses preventing drift or collapse (Li et al., 12 Apr 2025).

6. Module Interactions, Limitations, and Practical Guidance

  • Adapter–Mask Synergy: In both modalities, the “forgetting” adapter direction is reined in by the retention anchor or dynamic mask, which localizes erasure and prevents spillage into unrelated capabilities.
  • Loss Granularity: The multi-component loss in diffusion settings aligns forgotten concepts (C) toward semantically meaningful superclasses (C_s), unlike naïve unlearning, which often yields degenerate or semantically empty results.
  • Regularization and Memory Locking: Knowledge distillation regularization in diffusion and hard-masked updates in MLLMs safeguard previously unlearned concepts against sequential catastrophic re-learning.
  • Hyperparameter Sensitivity: The forget-vs-retain balance is tunable (e.g., via k in MLLMs and \alpha, \beta in diffusion). Extreme values can induce over-forgetting or retention failures; in practice, moderate values yield the best trade-offs.
  • Implementation Strategies: In MLLMs, LoRA-style adapters inserted in all linear layers strike a balance between capacity and efficiency. In diffusion, 50% sparsity and periodic mask reallocation are empirically optimal.
  • Scalability: In both domains, SMFA’s localized adapter paradigm admits efficient post-hoc, modular updates to deployed large-scale models without end-to-end retraining.
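
The hyperparameters quoted throughout this article can be collected into a starting configuration; the values below mirror the defaults mentioned above, but this is a convenience summary, not the papers' released configuration files.

```python
# Illustrative starting points gathered from the values quoted in this article.
SMFA_DIFFUSION_DEFAULTS = {
    "alpha": 0.25,            # weight on the superclass-alignment loss
    "beta": 0.25,             # weight on the distillation regularizer
    "mask_sparsity": 0.50,    # fraction of U-Net weights held inactive
}

SMFA_MLLM_DEFAULTS = {
    "k": 5,                              # magnitude-mask trade-off; larger values forget more aggressively
    "lora_target_modules": "all linear layers",
    "retain_anchor_size": "few-shot, on the order of |D_f|",
}
```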

7. Context and Relation to Broader Unlearning Research

SMFA addresses long-standing weaknesses in model unlearning: instability, residual memory, over-forgetting, and generation collapse. Compared to optimization-based global unlearning (e.g., full-model retraining, GA Difference, KL Minimization), SMFA confines weight changes spatially and semantically, ensuring targeted unlearning with bounded side effects. The two independently developed SMFA variants establish a modular, scalable template for future selective unlearning work in both generative and multimodal neural architectures (Li et al., 12 Apr 2025, Zeng et al., 25 Nov 2025).
