
Conformal Unlearning Risk (CUR)

Updated 22 December 2025
  • The paper establishes a unified metric (CUR) for quantifying risks of inadequate forgetting and utility degradation in machine unlearning algorithms.
  • CUR integrates conformal prediction methods with risk-aware evaluations to balance forgetting sufficiency with utility preservation.
  • Empirical insights and theoretical guarantees demonstrate CUR's practical use in hyperparameter optimization and regulatory compliance for large-scale models.

Conformal Unlearning Risk (CUR) quantifies the maximal risk of inadequate removal or degradation in utility after the application of machine unlearning algorithms, especially in large-scale models such as LLMs. CUR unifies forgetting sufficiency and utility preservation within a single, conformal-prediction–calibrated framework, providing explicit probabilistic control over potential “leakage” or utility failure. It is underpinned by the formalism introduced in FROC and extended by conformal machine unlearning theory, and is directly related to recent developments in risk-aware evaluation, uncertainty quantification, and regulatory auditability for unlearning systems (Goh et al., 15 Dec 2025, Shi et al., 31 Jan 2025, Alkhatib et al., 5 Aug 2025).

1. Fundamental Definitions

CUR arises in conformal risk-aware machine unlearning as an operationalized, data-driven estimator:

  • Original model: $\theta$ trained on dataset $D_{\text{train}}$.
  • Unlearned model: $\theta' = \mathrm{Unlearn}(\theta; \lambda)$ for a configuration or strategy $\lambda$.
  • Forget set and retain set: $D_{\text{forget}}$ (data to be erased) and $D_{\text{retain}}$ (utility set).
  • Calibration/reference set: $\widehat{\mathcal D}_{\mathrm{ref}} = \{(x_i, y_i)\}_{i=1}^{N_{\mathrm{ref}}}$, disjoint from $D_{\text{forget}}$.
  • User controls: Risk tolerance $\delta$ (allowable fraction of 'failures') and per-sample risk threshold $\alpha$.
  • CUR: The maximal probability that, post-unlearning, the model fails to forget sufficiently or degrades utility, under conformal calibration.

2. Risk Model and Mathematical Formalism

CUR construction involves two primary sub-risks:

  • Forgetting deficiency, penalizing the model for not erasing DforgetD_{\text{forget}} signal,
  • Utility degradation, capturing loss of accuracy or drift on DretainD_{\text{retain}}.

Split Conformal Risk Control

The unified risk control is enforced as

$$\mathbb{P}_{(x,y)\sim\mathcal{D}}\bigl[R(p_{\theta'}(x), y) > \alpha\bigr] \leq \delta$$

where $R$ is the configuration-level risk statistic.

Continuous Risk Statistic

For configuration $\lambda$:

  • Forgetting-shift statistic:

$$s^{(\lambda)} = \log\bigl(\mathrm{Loss}_U(\lambda)\bigr) + \bigl(\max_{\lambda'} \mathrm{Acc}_U(\lambda') - \mathrm{Acc}_U(\lambda)\bigr)$$

  • Forgetting deficiency penalty:

$$\Delta_f(\lambda) = \mathrm{softplus}\bigl(\hat\alpha_{\mathrm{unlearn}} - s^{(\lambda)}\bigr)$$

  • Retain-distortion statistic:

$$r^{(\lambda)} = \log\bigl(\mathrm{Loss}_R(\lambda)\bigr) - \log\bigl(\min_{\lambda'} \mathrm{Loss}_R(\lambda')\bigr) + \bigl(\max_{\lambda'} \mathrm{Acc}_R(\lambda') - \mathrm{Acc}_R(\lambda)\bigr)$$

  • Utility penalty:

$$\Delta_u(\lambda) = \mathrm{softplus}\bigl(r^{(\lambda)}\bigr)$$

  • Unified per-configuration risk:

$$\widetilde R(\lambda) = w_f \Delta_f(\lambda) + w_u \Delta_u(\lambda), \qquad (w_f = w_u = 1)$$

Aggregate unlearning risk (across the calibration set):

$$\widehat{R}_{\theta'}(\widehat{\mathcal D}_{\mathrm{ref}}) = \frac{1}{N_{\mathrm{ref}}} \sum_{i=1}^{N_{\mathrm{ref}}} \widetilde R(\lambda; x_i, y_i)$$
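As a concrete sketch, the statistics above can be computed directly from configuration-level losses and accuracies. All inputs below (`loss_u`, `acc_u`, `loss_r`, `acc_r`, `alpha_hat`, and the per-configuration sweeps) are hypothetical values a practitioner would measure, not numbers from the paper:

```python
import math

def softplus(x: float) -> float:
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def unified_risk(loss_u, acc_u, loss_r, acc_r, alpha_hat,
                 all_acc_u, all_acc_r, all_loss_r, w_f=1.0, w_u=1.0):
    # Forgetting-shift statistic s(λ): larger means stronger forgetting.
    s = math.log(loss_u) + (max(all_acc_u) - acc_u)
    # Forgetting-deficiency penalty: grows when s falls short of α̂_unlearn.
    delta_f = softplus(alpha_hat - s)
    # Retain-distortion statistic r(λ) and utility penalty.
    r = (math.log(loss_r) - math.log(min(all_loss_r))
         + (max(all_acc_r) - acc_r))
    delta_u = softplus(r)
    # Unified per-configuration risk with default weights w_f = w_u = 1.
    return w_f * delta_f + w_u * delta_u

# Two hypothetical configurations that forget equally well, but the second
# distorts the retain set more, so its unified risk is higher.
sweep = dict(all_acc_u=[0.1, 0.1], all_acc_r=[0.9, 0.7], all_loss_r=[1.0, 2.0])
risk_a = unified_risk(2.0, 0.1, 1.0, 0.9, alpha_hat=0.5, **sweep)
risk_b = unified_risk(2.0, 0.1, 2.0, 0.7, alpha_hat=0.5, **sweep)
```

Here `alpha_hat` would come from the conformal calibration step, and the sweep lists hold every candidate configuration's statistics, since the max/min terms in $s^{(\lambda)}$ and $r^{(\lambda)}$ range over all $\lambda'$.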

Conformal Calibration

CUR is operationalized via

$$\hat\alpha_{\mathrm{unlearn}} = \min\biggl\{ h^{-1}\Bigl(\frac{\ln(1/\delta)}{N_{\mathrm{ref}}}, \widehat{R}\Bigr),\; \Phi^{-1}_{\mathrm{bin}}\Bigl(\frac{\delta}{e}; N_{\mathrm{ref}}, \widehat{R}\Bigr) \biggr\}$$

with $h(a, b) = a\ln\tfrac{a}{b} + (1-a)\ln\tfrac{1-a}{1-b}$ (the Bernoulli KL divergence) and $\Phi^{-1}_{\mathrm{bin}}(q; N, p)$ the binomial inverse CDF.
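A minimal numerical sketch of this calibration, under the standard Hoeffding–Bentkus reading of the two terms: the $h^{-1}$ term is obtained by bisecting the Bernoulli KL divergence $h$ in its first argument, and the binomial term by inverting the binomial CDF over its success-probability parameter. Function names are illustrative:

```python
import math

def bern_kl(a: float, b: float) -> float:
    # h(a, b) = a ln(a/b) + (1-a) ln((1-a)/(1-b)): Bernoulli KL divergence.
    eps = 1e-12
    a, b = min(max(a, eps), 1 - eps), min(max(b, eps), 1 - eps)
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

def binom_cdf(k: int, n: int, p: float) -> float:
    # P(Bin(n, p) <= k), computed directly via math.comb.
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _bisect(f, lo: float, hi: float, iters: int = 60) -> float:
    # Root of an increasing f on [lo, hi], assuming f(lo) <= 0 <= f(hi).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

def alpha_unlearn(r_hat: float, n_ref: int, delta: float) -> float:
    t = math.log(1.0 / delta) / n_ref
    # Hoeffding term: smallest a >= R̂ with h(a, R̂) = ln(1/δ)/N_ref.
    hoeffding = _bisect(lambda a: bern_kl(a, r_hat) - t, r_hat, 1.0 - 1e-9)
    # Bentkus term: smallest p >= R̂ with P(Bin(N_ref, p) <= ⌈N_ref·R̂⌉) = δ/e.
    k = math.ceil(n_ref * r_hat)
    bentkus = _bisect(lambda p: delta / math.e - binom_cdf(k, n_ref, p),
                      r_hat, 1.0 - 1e-9)
    return min(hoeffding, bentkus)
```

For example, `alpha_unlearn(0.1, 200, 0.05)` returns a calibrated bound somewhat above the empirical risk of 0.1, and the bound tightens toward $\widehat R$ as $N_{\mathrm{ref}}$ grows.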

3. Algorithmic Workflow

CUR defines a precise pipeline for risk-certified unlearning hyperparameter selection and model assessment:

Input:
    θ: Pretrained model
    Λ: Set of candidate unlearning configurations {λ_1, ..., λ_K}
    ˆD_ref: Reference calibration set of size N_ref
    δ: Risk budget
    α: Per-example threshold
Output:
    ˆΛ_α: Valid configuration set; lookup dictionary λ → α̂_unlearn(λ)

for λ in Λ:
    θ' ← Unlearn(θ; λ)
    for each (x_i, y_i) in ˆD_ref compute R_i ← w_f Δ_f(λ; x_i, y_i) + w_u Δ_u(λ; x_i, y_i)
    ˆR ← (1/N_ref) Σ_i R_i
    α̂ ← min { h⁻¹(ln(1/δ)/N_ref, ˆR), Φ⁻¹_bin(δ/e; N_ref, ˆR) }
    Lookup[λ] ← α̂
    if α̂ ≤ α, add λ to ˆΛ_α
return ˆΛ_α, Lookup
Multiple-testing control is achieved via Bonferroni (setting δ → δ/K).
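The pipeline above can be sketched end-to-end in a few lines of Python. Two simplifications, flagged as assumptions: the looser closed-form Hoeffding bound $\widehat R + \sqrt{\ln(1/\delta)/(2N_{\mathrm{ref}})}$ stands in for the tighter $h^{-1}$/Bentkus pair, and a toy Bernoulli per-example risk (which folds the `Unlearn` call into the risk function) stands in for $\widetilde R$:

```python
import math
import random

def hoeffding_ucb(r_hat: float, n: int, delta: float) -> float:
    # Looser closed-form stand-in for the h^{-1}/Bentkus calibration.
    return r_hat + math.sqrt(math.log(1.0 / delta) / (2 * n))

def select_configurations(per_example_risk, configs, ref_set, delta, alpha):
    # Bonferroni: split the total risk budget δ across all K configurations.
    delta_k = delta / len(configs)
    lookup, valid = {}, []
    for lam in configs:
        # Aggregate empirical risk over the reference calibration set.
        r_hat = sum(per_example_risk(lam, x, y) for x, y in ref_set) / len(ref_set)
        a_hat = hoeffding_ucb(r_hat, len(ref_set), delta_k)
        lookup[lam] = a_hat
        if a_hat <= alpha:          # configuration passes the risk constraint
            valid.append(lam)
    return valid, lookup

# Toy demo: hypothetical configurations whose per-example failure rate
# equals the configuration value itself; only the mild ones should pass.
random.seed(0)
ref = [(i, 0) for i in range(500)]
def toy_risk(lam, x, y):
    return 1.0 if random.random() < lam else 0.0
valid, lookup = select_configurations(toy_risk, [0.02, 0.05, 0.3],
                                      ref, delta=0.05, alpha=0.2)
```

The aggressive configuration (failure rate 0.3) is rejected because even its lower-bounded risk exceeds the α = 0.2 tolerance, while the mild ones are certified.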

4. Conformal Unlearning Risk in Conformal Prediction Frameworks

Alternative but compatible definitions of CUR are given in (Shi et al., 31 Jan 2025) and (Alkhatib et al., 5 Aug 2025), where it is derived from set-coverage/anti-coverage properties in split conformal prediction:

  • Coverage (on $D$): $\frac{1}{|D|} \sum_{(x,y)\in D} \mathbf{1}[y\in C(x)]$
  • Set size: $\frac{1}{|D|} \sum_{(x,y)\in D} |C(x)|$
  • Conformal Ratio (CR): $\frac{\mathrm{Coverage}(D)}{\mathrm{SetSize}(D)}$
  • Efficiently Covered Frequency (ECF): fraction of retained points covered by a conformal set of size $\leq c$
  • Efficiently Uncovered Frequency (EuCF): fraction of forgotten points excluded from a conformal set of size $\leq d$

CUR is then given by

$$\mathrm{CUR}(c, d) = \max\{1 - \mathrm{ECF}_c,\; 1 - \mathrm{EuCF}_d\}$$

This metric summarizes the “worst-case” probability that forgetting or retention fails under specified set-size constraints (Alkhatib et al., 5 Aug 2025).
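This set-based definition translates directly into code; the prediction sets and labels below are an illustrative toy example, not data from the cited papers:

```python
def cur_metric(retain, forget, c: int, d: int) -> float:
    # retain / forget: lists of (conformal_set, true_label) pairs.
    # ECF_c: fraction of retained points whose conformal set covers the
    # true label and has size <= c.
    ecf = sum(1 for s, y in retain if y in s and len(s) <= c) / len(retain)
    # EuCF_d: fraction of forgotten points excluded from a set of size <= d.
    eucf = sum(1 for s, y in forget if y not in s and len(s) <= d) / len(forget)
    # Worst case of the two failure modes.
    return max(1.0 - ecf, 1.0 - eucf)

# Toy example: four retained and three forgotten points with sets C(x).
retain = [({0}, 0), ({0, 1}, 0), ({1}, 0), ({2}, 2)]   # third point uncovered
forget = [({1}, 0), (set(), 0), ({0}, 0)]              # third point still covered
```

With `c=2, d=1` this gives $\mathrm{ECF}_2 = 3/4$ and $\mathrm{EuCF}_1 = 2/3$, so $\mathrm{CUR}(2,1) = \max\{1/4, 1/3\} = 1/3$, with forgetting failure the binding term.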

5. Empirical Insights and Trade-Offs

Extensive empirical evaluation on LLMs and image models demonstrates:

  • Risk landscapes: $\widetilde R$ strongly anti-correlates with forget-set accuracy, while retain-set accuracy decays more slowly as risk increases. The CUR surface exposes monotonic trade-offs, visualized as heatmaps and curves that guide hyperparameter selection (Goh et al., 15 Dec 2025).
  • Configuration validity: Only unlearning configurations with $\hat\alpha_{\mathrm{unlearn}} \leq \alpha$ satisfy the risk constraint; this set shrinks as distributional shift (Hellinger radius $\rho$) grows, and as more aggressive unlearning is attempted.
  • Reference set and learning rate: Larger calibration set size makes risk evaluation stricter (higher $\widetilde R$), while increasing the learning rate hastens forgetting but also induces utility drift; both effects are detected quantitatively by rising CUR values.
  • Model and method dependence: No single unlearning method or architecture is dominant. For example, GA+Descent is generally preferable for LLaMA3.1-8B and AmberChat, while RedPajama-7B achieves minimal CUR under GA+KL. This necessitates model-adaptive unlearning and risk-driven method selection (Goh et al., 15 Dec 2025).

6. Theoretical Guarantees and Interpretability

The following statistical properties hold:

  • Finite-sample conformal bounds: With probability at least $1-\delta$ (over draws of the calibration set), the true risk of per-example failure does not exceed $\hat\alpha_{\mathrm{unlearn}}$ or the specified target. This is robust to model and calibration size variation (Goh et al., 15 Dec 2025, Alkhatib et al., 5 Aug 2025).
  • Family-wise control: Bonferroni correction ensures simultaneous risk control over all $K$ tried configurations with total failure rate $\leq \delta$.
  • Monotonicity: The unified risk $\widetilde R$ is non-decreasing in both forgetting deficiency and utility degradation; thus the practitioner uses a single "knob" to trade forgetting against utility loss.
  • No retraining required: Conformal Unlearning Risk admits computation and calibration directly on the unlearned model; there is no dependence on retrained-from-scratch baselines (Alkhatib et al., 5 Aug 2025).
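The monotonicity property can be spot-checked numerically with the softplus-based risk of Section 2 (the inputs below are arbitrary illustrative values):

```python
import math

def softplus(x: float) -> float:
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def r_tilde(s: float, r: float, alpha_hat: float = 0.5,
            w_f: float = 1.0, w_u: float = 1.0) -> float:
    # Unified risk: deficiency penalty softplus(α̂ - s) plus utility
    # penalty softplus(r), with default weights w_f = w_u = 1.
    return w_f * softplus(alpha_hat - s) + w_u * softplus(r)

# Risk falls as the forgetting-shift s grows (deficiency α̂ - s shrinks)...
risks_over_s = [r_tilde(s, 0.0) for s in (0.0, 0.5, 1.0, 2.0)]
# ...and rises with the retain distortion r.
risks_over_r = [r_tilde(1.0, r) for r in (0.0, 0.5, 1.0, 2.0)]
```

Since softplus is strictly increasing, both sweeps are strictly monotone, which is the single-knob behavior described above.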

7. Practical Implications and Regulatory Utility

CUR offers an actionable, auditable, and unified metric for post-unlearning certification:

  • Unified reporting: CUR enables reporting a single, calibrated leakage/failure rate aggregated over both retention and forgetting objectives.
  • Regulatory transparency: Auditors can be provided with CUR values at chosen coverage levels, directly linking statistical risk guarantees to privacy policy and compliance (Alkhatib et al., 5 Aug 2025).
  • Hyperparameter optimization: CUR is used to guide and halt unlearning procedures, accepting only those configurations that stay below risk tolerances.
  • Generalization across modalities: While originally devised for LLMs (Goh et al., 15 Dec 2025), CUR concepts are directly applicable to image models and broader machine unlearning contexts (Shi et al., 31 Jan 2025, Alkhatib et al., 5 Aug 2025).

In summary, Conformal Unlearning Risk (CUR) operationalizes a statistically principled approach to controlling, certifying, and auditing the risk of incomplete forgetting and loss of utility in machine unlearning frameworks. The conformal calibration and continuous risk modeling embedded in CUR provide tight, interpretable, and practical guarantees crucial for model deployment, risk-sensitive applications, and regulatory compliance (Goh et al., 15 Dec 2025, Shi et al., 31 Jan 2025, Alkhatib et al., 5 Aug 2025).
