Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations (2003.02960v3)

Published 5 Mar 2020 in cs.LG, cs.CV, cs.IT, math.IT, and stat.ML

Abstract: We describe a procedure for removing dependency on a cohort of training data from a trained deep network that improves upon and generalizes previous methods to different readout functions and can be extended to ensure forgetting in the activations of the network. We introduce a new bound on how much information can be extracted per query about the forgotten cohort from a black-box network for which only the input-output behavior is observed. The proposed forgetting procedure has a deterministic part derived from the differential equations of a linearized version of the model, and a stochastic part that ensures information destruction by adding noise tailored to the geometry of the loss landscape. We exploit the connections between the activation and weight dynamics of a DNN inspired by Neural Tangent Kernels to compute the information in the activations.

Citations (170)

Summary

  • The paper introduces an NTK-inspired scrubbing framework that uses differential equations and noise injection to erase specific training data influence.
  • It presents a novel mutual information bound that quantifies leakage, showing significantly reduced vulnerabilities in black-box attack settings.
  • Empirical results on architectures like ResNet-18 using datasets such as CIFAR-10 confirm that the method matches or outperforms previous data removal techniques.

An Examination of Techniques for Effective Data Forgetting in Deep Networks

The paper by Golatkar, Achille, and Soatto introduces an approach to data removal in deep neural networks (DNNs), focusing on the problem of scrubbing a trained network of its dependency on specific training data, referred to as the cohort to forget. The work expands upon existing methods by proposing an information-theoretic framework that addresses both white-box attacks, where the attacker can inspect the model's parameters, and black-box attacks, where only input-output pairs are observed.

A cornerstone of the proposed approach is a new bound on how much information an attacker can extract about the forgotten data, particularly under black-box scenarios. The scrubbing procedure has a deterministic part, derived from the differential equations of a linearized version of the model, and a stochastic part that injects noise tailored to the geometry of the loss landscape. The method exploits the connection, inspired by Neural Tangent Kernels (NTK), between the weight and activation dynamics of the network to analyze and enforce forgetting in the final activations.
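
In schematic form (an illustrative rendering of this structure, not the paper's exact equations), the scrubbed weights combine a deterministic shift with curvature-aware noise:

$$
S(w) \;=\; h(w) + n, \qquad n \sim \mathcal{N}(0,\ \lambda \Sigma),
$$

where $h(w)$ is the deterministic update computed from the linearized (NTK) dynamics restricted to the retained data, and the covariance $\Sigma$ is shaped by the local curvature (Hessian or Fisher) of the loss on the retained data, so that noise is concentrated along directions that could otherwise still encode the forgotten cohort.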

Methodological Insights

The core methodological contribution is the NTK-inspired forgetting procedure, which yields insight into the weight dynamics of over-parameterized networks. In such models, information about the forgotten cohort can persist in the null space of the weights, i.e., in directions that leave the final activations essentially unchanged. The scrubbing process, formulated through the NTK linearization, systematically reduces this residual information more effectively than previous approaches.
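
A minimal sketch of the central quantity in this analysis, the empirical NTK Gram matrix built from per-example gradients of a trained network (illustrative only: `model` and `inputs` are placeholders, a scalar output is assumed, and the paper's actual procedure works with the linearized training dynamics rather than this raw matrix):

```python
import torch

def empirical_ntk(model, inputs):
    """Empirical NTK Gram matrix: K[i, j] = <grad_w f(x_i), grad_w f(x_j)>.

    Assumes `model` maps one example to a scalar output (e.g. a chosen logit);
    a full treatment keeps one gradient per output class.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    rows = []
    for x in inputs:                               # one example at a time
        out = model(x.unsqueeze(0)).squeeze()      # scalar output f(x)
        grads = torch.autograd.grad(out, params)   # d f(x) / d w
        rows.append(torch.cat([g.reshape(-1) for g in grads]))
    J = torch.stack(rows)                          # (n_examples, n_params) Jacobian
    return J @ J.T                                 # (n_examples, n_examples) Gram matrix
```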

The paper presents a refined approach to quantifying data-removal efficacy through a locally applicable bound on mutual information. This bound accounts for the stochasticity of the training algorithm and relates the position of the weights in parameter space to their behavior in the null space. Notably, the authors demonstrate that black-box attacks can leak far less information about the forgotten cohort than white-box attacks, especially when the number of queries is constrained.
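
Schematically (and not in the paper's exact form), the leakage over $n$ black-box queries is controlled query by query by how much the scrubbed model's output distribution differs from that of a model never trained on the forgotten cohort $\mathcal{D}_f$:

$$
I\big(\mathcal{D}_f;\, y_1, \dots, y_n\big) \;\lesssim\; \sum_{i=1}^{n} \mathbb{E}\, \mathrm{KL}\!\left( p\big(y_i \mid x_i, S(w)\big) \,\big\|\, p\big(y_i \mid x_i, S_0(w)\big) \right),
$$

where $S(w)$ denotes the scrubbed weights and $S_0(w)$ weights obtained by training on the retained data alone; with few queries, each term contributes only a small amount of extractable information.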

Practical Implementation

The authors pursue empirical validation using various readout functions on standard datasets such as CIFAR-10 and Lacuna-10, covering diverse network architectures like All-CNN and ResNet-18. Key performance metrics include conventional error measurements across different data subsets (retain vs. forget), bespoke membership inference attacks, and the time required for relearning the forget cohort. Across these metrics, the NTK-based method consistently matches or outperforms prior methodologies, highlighting its potential utility in real-world scenarios requiring rapid and definitive data scrubbing.
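
The sketch below illustrates the flavor of such readouts (not the authors' exact attack): classification error on a data subset, plus a crude black-box membership score that checks whether per-example losses on the forget cohort are distinguishable from losses on unseen test data.

```python
import numpy as np

def error_rate(preds, labels):
    """Fraction of misclassified examples in a subset (retain, forget, or test)."""
    return float(np.mean(np.asarray(preds) != np.asarray(labels)))

def membership_score(forget_losses, test_losses):
    """Crude black-box membership readout via a threshold sweep.

    If forgetting succeeded, losses on the forget cohort should be statistically
    indistinguishable from losses on held-out test data (score near 0.5);
    a score near 1.0 means the cohort is still detectable.
    """
    forget_losses = np.asarray(forget_losses)
    test_losses = np.asarray(test_losses)
    best = 0.5
    for t in np.concatenate([forget_losses, test_losses]):
        tpr = np.mean(forget_losses < t)   # members flagged as members
        fpr = np.mean(test_losses < t)     # non-members flagged as members
        best = max(best, 0.5 * (tpr + (1.0 - fpr)))
    return float(best)
```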

Theoretical Implications

The implications of this research are manifold. The NTK framework helps isolate the learning components specific to the forget cohort, emphasizing the role of null regions in parameter space. This characterization can sharpen the understanding of model-inversion limits, membership-inference risks, and scrubbing efficacy, potentially informing the design of robust privacy-preserving machine learning protocols.

Additionally, the work underscores the computational cost of the required NTK computations, pointing to a need for more efficient algorithms for large-scale implementations. Incremental calculation of the underlying quantities offers one promising avenue for future refinement.
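
As one plausible illustration of that direction (a sketch under the assumption that the Gram matrix is the bottleneck, not an algorithm stated in the paper), the Gram matrix can be accumulated over parameter blocks so that the full examples-by-parameters Jacobian never needs to be materialized at once:

```python
import numpy as np

def ntk_gram_blockwise(jacobian_blocks, n_examples):
    """Accumulate K = J @ J.T over parameter blocks.

    `jacobian_blocks` yields arrays of shape (n_examples, block_size), i.e. the
    Jacobian columns for one chunk of parameters at a time; since the blocks
    partition the columns, their contributions simply sum.
    """
    K = np.zeros((n_examples, n_examples))
    for J_block in jacobian_blocks:
        K += J_block @ J_block.T
    return K
```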

Future Directions

Looking forward, further work is warranted on generalizing NTK-based procedures beyond fine-tuning scenarios to broader training dynamics, and on strengthening defenses against adversarially crafted queries in the black-box setting. There is also practical interest in fully automated pipelines for data-deletion compliance, given evolving legal standards on data privacy and the right to be forgotten.

In conclusion, this paper is a significant addition to ongoing work on trustworthy AI, specifically on accountable model behavior after data-deletion requests. Grounded in both theoretical analysis and empirical validation, it lays a foundation for large-scale practical implementations that ensure forgotten data leaves no recoverable trace in deployed systems.