
Optimal Erasure Samples (OES)

Updated 24 November 2025
  • Optimal Erasure Samples (OES) are strategies for selecting or modifying data samples to optimize performance and resilience under erasure across diverse domains.
  • OES methods employ algorithms such as SSSE in model unlearning and threshold scheduling in remote estimation, achieving near-optimal performance at reduced computational cost.
  • In frame theory and coding, OES techniques minimize reconstruction error and balance error/erasure trade-offs, while thermodynamic protocols optimize energy dissipation based on system dynamics.

Optimal Erasure Samples (OES) refer to the selection or modification of data samples, codes, or structures in order to optimize resilience or performance under erasure, whether for statistical inference, coding, machine unlearning, frame theory, or thermodynamic information erasure. OES arises in several research paradigms, including model unlearning, stochastic control of erasure-prone channels, algebraic coding with erasure options, and frame-theoretic optimal reconstruction.

1. OES in Model Unlearning and Machine Learning

In the context of machine unlearning, the OES problem is defined as follows: given a learned model with parameters $w^*$ trained on a dataset $D = \{(x_i, y_i)\}_{i=1}^n$, and a subset $S \subset D$ to be "erased," compute new parameters $w^*_{-S}$ as if $S$ were never included. The gold standard is full retraining on $D \setminus S$, but this is computationally impractical for large models or when $S$ is small.

Peste et al. introduce the SSSE (Single-Step Sample Erasure) algorithm, which approximates the optimal $w^*_{-S}$ via a Newton-type update:

$w_{\mathrm{new}} = w^{*} + \frac{\epsilon}{n-k}\widetilde{F}^{-1} g_e$

where $g_e = \sum_{(x, y)\in S} \nabla \ell(w^{*}; x, y)$ is the erasure gradient, $\widetilde{F}$ is a damped block-diagonal Fisher matrix, $\epsilon$ is a scale parameter, and $k = |S|$. This yields a single-step parameter update requiring only gradients on $S$ and a precomputed matrix inverse, making erasure orders of magnitude cheaper than retraining (Peste et al., 2021).
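A minimal NumPy sketch of this update, assuming a flattened parameter vector, dense arrays, and precomputed per-sample gradients on $S$; in the actual method $\widetilde{F}$ is block-diagonal with a precomputed inverse, whereas this illustration simply solves a dense linear system:

```python
import numpy as np

def ssse_update(w_star, grads_S, fisher_damped, n, eps=1.0):
    """One-shot SSSE-style erasure step (illustrative dense version).

    w_star        : trained parameters w*, flattened to shape (d,)
    grads_S       : per-sample loss gradients at w* for the erased set S,
                    shape (k, d) -- assumed precomputed
    fisher_damped : damped Fisher approximation F~, shape (d, d)
    n             : size of the full training set D
    eps           : scale parameter epsilon
    """
    k = grads_S.shape[0]
    g_e = grads_S.sum(axis=0)                   # erasure gradient g_e
    # w_new = w* + eps/(n - k) * F~^{-1} g_e; solve instead of inverting
    step = np.linalg.solve(fisher_damped, g_e)
    return w_star + (eps / (n - k)) * step
```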

Empirical results demonstrate that SSSE nearly matches the performance of retraining on several benchmarks such as CelebA, AwA2, and CIFAR10, typically incurring only a 1–2% accuracy gap. Accuracy depends on the convexity of the objective, the smoothness of the loss landscape, and the relative size of $k$ to $n$.

2. OES in Remote Estimation and Stochastic Control

In remote estimation, the OES problem involves sampling multiple Ornstein–Uhlenbeck (OU) processes over an erasure channel to minimize long-term average sum mean-square error (MSE), subject to a sampling-frequency budget and erasures. The processes are sampled and their measurements sent to a remote estimator, where sampled data can be erased independently with probability $\epsilon$.

For both Maximum-Age-First (MAF, with feedback) and Round-Robin (RR, without feedback) scheduling, the provably optimal policy is a stationary threshold rule: sample or wait according to whether the accumulated service time since the last successful transmission exceeds a threshold $\tau^*$.

Formally, the optimal threshold is given by

$\tau^* = \max\{\, G^{-1}(\beta^*),\; H^{-1}(W_{\min}) \,\}$

where $G(\cdot)$ captures the marginal MSE-reduction-to-wait trade-off, $H(\cdot)$ quantifies expected waiting, $\beta^*$ is the minimum achievable sum-MSE, and $W_{\min}$ encodes the sampling-rate constraint. Expressions for $G$, $H$, and $\beta^*$ are provided in closed form, involving the problem parameters $K$ (number of processes), $f_{\max}$ (sampling budget), $\epsilon$ (erasure probability), $\theta_k$, $\sigma_k$, and $\mu$ (service rate) (Banawan et al., 2022, Banawan et al., 2023).
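A hypothetical sketch of applying this rule, with the closed-form inverses $G^{-1}$ and $H^{-1}$ supplied as callables (their expressions are problem-specific and not reproduced here):

```python
def optimal_threshold(G_inv, H_inv, beta_star, W_min):
    """tau* = max{ G^{-1}(beta*), H^{-1}(W_min) }.

    G_inv, H_inv are hypothetical callables implementing the closed-form
    inverses from the papers; beta_star is the minimum achievable sum-MSE
    and W_min encodes the sampling-rate constraint.
    """
    return max(G_inv(beta_star), H_inv(W_min))

def should_sample(accumulated_service_time, tau_star):
    """Stationary threshold rule: sample only once the accumulated service
    time since the last successful transmission exceeds tau*."""
    return accumulated_service_time >= tau_star
```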

Key qualitative dependencies include:

  • $\tau^*$ increases with $\epsilon$ for MAF (feedback) and decreases with $\epsilon$ for RR (no feedback).
  • $\tau^*$ strictly increases with $K$.
  • The minimum sum-MSE $J^*$ is strictly increasing in $\epsilon$ and decreasing in $f_{\max}$.

3. OES in Frame Theory and Signal Reconstruction

In finite frame theory, OES refers to the design of frames and dual frames (or pairs) to minimize maximal reconstruction error after erasure of coefficients. The error operator is characterized under several functionals: Frobenius norm, spectral radius, and numerical radius.

For a given frame $F = \{f_i\}$ and dual $G = \{g_i\}$, with erasure pattern $I$, the error operator is

$E_I^{(F,G)} = -\sum_{i\in I} f_i \otimes g_i.$
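
These definitions are easy to evaluate numerically. The sketch below builds $E_I^{(F,G)}$ from frame and dual vectors stored as array rows and computes the three functionals; the numerical radius is estimated crudely by sampling unit vectors, an illustration rather than part of any referenced construction:

```python
import numpy as np

def erasure_error_operator(F, G, I):
    """E_I^{(F,G)} = -sum_{i in I} f_i (x) g_i, with frame vectors f_i and
    dual vectors g_i as rows of (m, d) arrays and I the erased index set."""
    return -sum(np.outer(F[i], G[i]) for i in I)

def error_measures(E, n_samples=2000, seed=0):
    """Frobenius norm, spectral radius, and a Monte Carlo estimate of the
    numerical radius sup_x |<Ex, x>| over unit vectors (real case)."""
    fro = np.linalg.norm(E, "fro")
    spec = np.abs(np.linalg.eigvals(E)).max()
    xs = np.random.default_rng(seed).normal(size=(n_samples, E.shape[0]))
    xs /= np.linalg.norm(xs, axis=1, keepdims=True)
    num = np.abs(np.einsum("ij,jk,ik->i", xs, E, xs)).max()
    return fro, spec, num
```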

The optimal choice of $(F, G)$ depends on the error measure:

  • Frobenius optimality: achieved by 1-uniform (for $r = 1$ erasure) and 2-uniform (for $r = 2$ erasures) dual pairs; for tight and equiangular frames the canonical dual yields optimal OES for all $r$.
  • Spectral/numerical-radius optimality: similarly characterized by $1$- and $2$-uniformity (Arati et al., 2024; Mondal, 2022).
  • PASOD frames/pairs: optimize convex combinations of spectral radius and operator norm under probabilistic erasure, yielding a convex, closed, compact set of solutions; uniqueness is characterized in terms of constant weight-norms or frame tightness (Mondal, 2022).

An explicit algorithm for constructing optimal duals involves solving a constrained convex problem over the dual parameters. For example, for one erasure and $p = 1/2$, the objective is

$\Phi(\{u_i\}) = \max_{i}\, q_i \left(\,|\langle f_i, S_F^{-1}f_i + u_i\rangle| + \|f_i\|\, \|S_F^{-1}f_i + u_i\| \right)$

subject to $\sum_i f_i \otimes u_i = 0$.
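
A sketch of evaluating this objective for candidate perturbations $\{u_i\}$ of the canonical dual vectors $S_F^{-1} f_i$, assuming real frame vectors as array rows and weights $q_i$; the linear constraint would be enforced by the convex solver of choice:

```python
import numpy as np

def phi(F, S_F_inv, U, q):
    """Phi({u_i}) = max_i q_i * ( |<f_i, S_F^{-1} f_i + u_i>|
                                  + ||f_i|| * ||S_F^{-1} f_i + u_i|| )."""
    vals = []
    for f_i, u_i, q_i in zip(F, U, q):
        g_i = S_F_inv @ f_i + u_i          # perturbed dual vector
        vals.append(q_i * (abs(f_i @ g_i)
                           + np.linalg.norm(f_i) * np.linalg.norm(g_i)))
    return max(vals)

def constraint_residual(F, U):
    """Frobenius norm of sum_i f_i (x) u_i; feasible points give zero."""
    return np.linalg.norm(sum(np.outer(f, u) for f, u in zip(F, U)), "fro")
```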

4. OES in Coding Theory: Error/Erasure Decoding

In algebraic coding (notably Reed–Solomon codes), OES addresses the optimal selection of erasure patterns (positions to erase) in multi-trial (GMD-style) decoding to minimize the residual codeword error. The Guruswami–Sudan (GS) list decoder and Bounded Minimum Distance (BMD) decoders allow errors and erasures with a trade-off parameter $\lambda$ (errors cost $\lambda$ times as much as erasures in the distance metric).

The optimal thresholds $\{\tau_k^*\}$ for $z$ decoder trials are

$\tau_k^* = \frac{E_0(R)}{s} \cdot \frac{2 (1/(\lambda-1))^{k-1} - \lambda}{2 (1/(\lambda-1))^{z-1} - \lambda}, \quad k = 1,\dots,z,$

where $E_0(R)$ is the Gallager exponent and $s$ the optimizing parameter from the large-deviations analysis. This schedule minimizes the residual error probability exponentially in $n$:

$P_e \approx \exp\left[ -2 E_0(R)\, \delta \cdot \frac{(1/(\lambda-1))^z - 1}{2 (1/(\lambda-1))^{z} - \lambda}\, n \right].$

The OES procedure thus efficiently balances computational cost (decoding trials) and reliability (Senger et al., 2011).
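Since the schedule is in closed form, it can be computed directly; a small sketch with $E_0(R)$, $s$, $\lambda$, and $z$ as inputs:

```python
def gmd_thresholds(E0, s, lam, z):
    """tau_k* = (E0/s) * (2 r^(k-1) - lam) / (2 r^(z-1) - lam), r = 1/(lam-1),
    for trials k = 1, ..., z (requires lam > 1)."""
    r = 1.0 / (lam - 1.0)
    denom = 2.0 * r ** (z - 1) - lam
    return [(E0 / s) * (2.0 * r ** (k - 1) - lam) / denom
            for k in range(1, z + 1)]
```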

5. OES in Thermodynamic Information Erasure

For the physical erasure of a classical bit within a finite time, the OES protocol minimizes average dissipated heat for a given process duration $\tau$ and desired final state. The minimal dissipation above Landauer's bound scales as the square of the Hellinger distance between the initial and final distributions, divided by $\tau$:

$\langle Q \rangle_{\mathrm{opt}} \approx k_B T \ln 2 + \frac{k_B T}{\tau} D_H^2,$

with $D_H$ the Hellinger distance. The exact OES protocol can be constructed by solving Hamiltonian equations for the occupation probabilities and inverting to obtain the time-dependent control parameters (barrier height and well tilt) (Zulkowski et al., 2013).
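A small numerical sketch of this bound for a two-state bit, using one common normalization of the Hellinger distance (the exact prefactor convention in the paper may differ):

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between discrete distributions p and q."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def min_dissipated_heat(p0, p1, tau, k_B=1.380649e-23, T=300.0):
    """<Q>_opt ~ k_B T ln 2 + (k_B T / tau) * D_H(p0, p1)^2, in joules."""
    return k_B * T * np.log(2.0) + (k_B * T / tau) * hellinger(p0, p1) ** 2

# Example: erase a fair bit into (0.99, 0.01) over tau = 1 s at T = 300 K.
print(min_dissipated_heat(np.array([0.5, 0.5]), np.array([0.99, 0.01]), tau=1.0))
```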

6. Uniqueness, Construction, and Theoretical Guarantees

For frame-theoretic OES:

  • The canonical dual is uniquely optimal under 1-uniformity (constant-weighted norm condition) (Arati et al., 2024).
  • For higher-order erasures, equiangular tight frames together with their canonical duals achieve simultaneous optimality under the Frobenius norm, spectral radius, and numerical radius.
  • The set of all optimal duals for a given frame and error criterion is convex, closed, and compact, and unitary-invariant (Mondal, 2022).

In model-unlearning OES, accuracy is theoretically guaranteed under convexity, small $k/n$, and smooth loss landscapes. For coding-theoretic OES, the thresholds are explicitly optimal under the aforementioned models.

7. Practical Implications and Limitations

OES solutions support demands for privacy, efficient model editing, robust coding, and minimal-dissipation operations. For ML unlearning, SSSE enables practical, near-optimal removal of data—crucial for compliance with data minimization and privacy policies. For remote estimation, OES policies yield tractable sensor scheduling rules robust to erasure. For coding, OES thresholds optimize error/erasure trade-offs given decoder and channel parameters.

Limitations arise when these assumptions (convexity, small $k$, regularity) break down; within their stated regimes, however, OES constructions provide robustness guarantees under erasure-constrained scenarios.

