An Information Theoretic Approach to Machine Unlearning (2402.01401v4)

Published 2 Feb 2024 in cs.LG, cs.AI, and stat.ML

Abstract: To comply with AI and data regulations, the need to forget private or copyrighted information from trained machine learning models is increasingly important. The key challenge in unlearning is forgetting the necessary data in a timely manner, while preserving model performance. In this work, we address the zero-shot unlearning scenario, whereby an unlearning algorithm must be able to remove data given only a trained model and the data to be forgotten. We explore unlearning from an information theoretic perspective, connecting the influence of a sample to the information gain a model receives by observing it. From this, we derive a simple but principled zero-shot unlearning method based on the geometry of the model. Our approach takes the form of minimising the gradient of a learned function with respect to a small neighbourhood around a target forget point. This induces a smoothing effect, causing forgetting by moving the boundary of the classifier. We explore the intuition behind why this approach can jointly unlearn forget samples while preserving general model performance through a series of low-dimensional experiments. We perform extensive empirical evaluation of our method over a range of contemporary benchmarks, verifying that our method is competitive with state-of-the-art performance under the strict constraints of zero-shot unlearning. Code for the project can be found at https://github.com/jwf40/Information-Theoretic-Unlearning

Overview of Zero-Shot Machine Unlearning Method

Introduction

Machine unlearning is quickly becoming a crucial area of research due to expanding regulations on data autonomy, exemplified by policies like GDPR, which grant individuals the right to request the deletion of their data from machine learning models. Conventional database deletion methods do not extend to trained models, which presents an open challenge. Existing machine unlearning strategies, however, are not equipped to handle the zero-shot (ZS) scenario, in which only the trained model and the data to be forgotten are available, with no access to the original training set.

Lipschitz Regularization for Unlearning

Leveraging Lipschitz continuity, the paper introduces a method for machine unlearning. The strategy minimizes the model's output sensitivity to perturbations of the inputs that are to be forgotten: by smoothing the learned function in a small neighbourhood around each forget point, the approach removes the influence of specific data points without compromising generalization performance on unseen data.
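
The core operation is straightforward to express in code. The PyTorch sketch below illustrates one plausible instantiation of this idea under stated assumptions: Gaussian input perturbations, an L2 ratio as the local Lipschitz estimate, and a single optimiser step per forget batch. The exact noise distribution, loss form, and hyperparameters used by the paper's JiT implementation may differ (see the linked repository).

```python
import torch

def jit_unlearning_step(model, x_forget, optimizer, n_perturb=10, sigma=0.1):
    """Smooth the model around a batch of forget samples by penalising a
    local Lipschitz estimate, in the spirit of the paper's JiT method.

    Hyperparameters (Gaussian noise, sigma, n_perturb, a single optimiser
    step) are illustrative assumptions, not the authors' exact settings.
    """
    model.train()
    out_ref = model(x_forget)                       # f(x) on the forget batch

    loss = 0.0
    for _ in range(n_perturb):
        eps = sigma * torch.randn_like(x_forget)    # small input perturbation
        out_pert = model(x_forget + eps)            # f(x + eps)
        # Per-sample ratio ||f(x+eps) - f(x)|| / ||eps||: a finite-difference
        # estimate of the local Lipschitz constant around each forget point.
        num = (out_pert - out_ref).flatten(1).norm(dim=1)
        den = eps.flatten(1).norm(dim=1) + 1e-12
        loss = loss + (num / den).mean()

    loss = loss / n_perturb                         # average over perturbations
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                # one "just in time" update
    return loss.item()
```

In use, such a step would be applied to each forget sample or mini-batch, which is all the zero-shot setting permits: no retain set or original training data is required.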

Empirical Evaluation

The paper conducts an extensive empirical evaluation of the method across several benchmarks and modern architectures, including Convolutional Neural Networks (CNNs) and Transformers. It demonstrates that the approach is competitive with state-of-the-art performance in zero-shot unlearning scenarios, and does so under strict constraints: no access to the original training set or a retain set.

Zero-Shot Unlearning Performance

The results show significant improvements in zero-shot unlearning, particularly compared to prior state-of-the-art methods, which require access to the training data and are constrained to full-class unlearning. The introduced method, JiT (Just in Time unlearning), extends to more realistic and challenging scenarios such as sub-class and random-subset unlearning. Moreover, it offers a pragmatic solution, adding minimal runtime and computational overhead to the unlearning process.

Concluding Remarks

The presented approach represents a significant advance in machine unlearning: it navigates the difficult terrain of zero-shot unlearning across various data modalities, benchmarks, and architectures. The research challenges the prevailing assumption that unlearning requires recourse to the training data, showing that under the right regularization framework, such as that imposed by Lipschitz continuity, one can unlearn selectively and at scale. It opens up several avenues for future work, including a deeper theoretical connection between Lipschitz continuity and unlearning, as well as possible extensions toward certified unlearning guarantees. While the paper does not claim certified unlearning, the empirical results suggest practical utility, especially when compliance with data deletion requests must be balanced against preserving model utility and avoiding the overhead of retraining.

Authors (6)
  1. Jack Foster (17 papers)
  2. Kyle Fogarty (6 papers)
  3. Stefan Schoepf (16 papers)
  4. Alexandra Brintrup (50 papers)
  5. Zack Dugue (1 paper)
  6. Cengiz Öztireli (12 papers)
Citations (3)