An Analysis of "ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models"
The paper "ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models" addresses the security and privacy concerns inherent in the deployment of machine learning as a service (MLaaS). The authors systematically relax previously stringent assumptions about adversaries in membership inference attacks, demonstrating that these attacks can be broadly applied with fewer dependencies. Furthermore, they propose effective defense mechanisms to mitigate these risks while maintaining high model utility.
Overview of Key Contributions
The paper makes several noteworthy contributions:
- Relaxation of Adversarial Assumptions: The authors critique the assumptions made by prior membership inference attacks, such as the necessity for multiple shadow models and detailed knowledge of the target model structure and training data distribution. They introduce three adversary scenarios with progressively relaxed assumptions, leading to more broadly applicable attack methodologies.
- Extensive Experimental Validation: The attacks proposed in the paper are validated using a comprehensive suite of eight datasets spanning various domains. This empirical approach underscores the generalizability and robustness of the attacks across different settings.
- Proposal of Novel Defense Mechanisms: To counteract membership inference attacks, the paper introduces two defense strategies, namely dropout and model stacking. These methods aim to reduce the overfitting of ML models, a key factor contributing to the success of membership inference attacks.
Detailed Examination of Adversarial Scenarios
Adversary 1: Simplified Shadow Model Design
The first adversary reduces the number of shadow models to one. Performing the attack with a single shadow model substantially lowers the attacker's cost in MLaaS settings. The shadow model is trained on data drawn from the same distribution as the target model's training data, but the attacker no longer needs to mimic the target model's exact architecture. Experimental results show that this simplified approach achieves a precision of 0.95 and a recall of 0.95 on the CIFAR-100 dataset, comparable to previous attacks that rely on multiple shadow and attack models.
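To make the pipeline concrete, below is a minimal sketch of a single-shadow-model membership inference attack, assuming scikit-learn models on synthetic data. The dataset, the random-forest shadow and target models, the logistic-regression attack model, and the use of full posterior vectors as attack features are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal single-shadow-model membership inference sketch (illustrative, not the paper's exact setup).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the target's data distribution (assumption).
X, y = make_classification(n_samples=4000, n_features=20, n_classes=4,
                           n_informative=10, random_state=0)
X_target, X_shadow, y_target, y_shadow = train_test_split(X, y, test_size=0.5, random_state=0)

# Target model: half of its split is training data ("members"), the other half is held out.
Xt_in, Xt_out, yt_in, yt_out = train_test_split(X_target, y_target, test_size=0.5, random_state=1)
target = RandomForestClassifier(random_state=0).fit(Xt_in, yt_in)

# One shadow model trained on data from the same distribution as the target's training data.
Xs_in, Xs_out, ys_in, ys_out = train_test_split(X_shadow, y_shadow, test_size=0.5, random_state=2)
shadow = RandomForestClassifier(random_state=0).fit(Xs_in, ys_in)

# Attack training set: shadow posteriors labeled 1 for members, 0 for non-members.
attack_X = np.vstack([shadow.predict_proba(Xs_in), shadow.predict_proba(Xs_out)])
attack_y = np.concatenate([np.ones(len(Xs_in)), np.zeros(len(Xs_out))])
attack = LogisticRegression(max_iter=1000).fit(attack_X, attack_y)

# Inference: query the target and classify its posteriors as member / non-member.
queries = np.vstack([target.predict_proba(Xt_in), target.predict_proba(Xt_out)])
truth = np.concatenate([np.ones(len(Xt_in)), np.zeros(len(Xt_out))])
print("attack accuracy:", attack.score(queries, truth))
```

The attack model only ever sees posterior vectors, never the raw inputs, which is what allows the shadow-model machinery to be decoupled from the target's internals.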
Adversary 2: Data Transferring Attack
The second adversary further relaxes the assumptions by not requiring data from the same distribution as the target model's training data. Instead, the shadow model is trained on an existing dataset from a different domain, and the attack model learned from its posteriors is transferred to the target. This data-transferring attack proves highly effective: for instance, a shadow model trained on the 20 Newsgroups dataset achieved a precision of 0.94 and a recall of 0.93 when attacking a target model trained on the CIFAR-100 dataset.
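What makes such a transfer possible is that the attack operates only on posterior vectors, which can be mapped to a fixed-length representation even when the shadow and target models predict different numbers of classes. The sketch below shows one such feature extraction, keeping only the largest posteriors sorted in descending order; the choice of the top three posteriors and the padding behavior are illustrative assumptions rather than the paper's exact setting.

```python
import numpy as np

def membership_features(posteriors, k=3):
    """Turn posterior vectors from models with any number of classes into
    fixed-length attack features: the k largest posteriors, sorted descending.
    Pads with zeros if the model has fewer than k classes."""
    posteriors = np.asarray(posteriors)
    top_k = -np.sort(-posteriors, axis=1)[:, :k]   # descending sort, keep top k
    if top_k.shape[1] < k:                         # pad, e.g. for binary models
        top_k = np.pad(top_k, ((0, 0), (0, k - top_k.shape[1])))
    return top_k

# Example: a 100-class shadow model and a 10-class target model both map to 3-D features,
# so an attack model trained on the shadow's outputs can be applied to the target's outputs.
shadow_posteriors = np.random.dirichlet(np.ones(100), size=5)
target_posteriors = np.random.dirichlet(np.ones(10), size=5)
print(membership_features(shadow_posteriors).shape, membership_features(target_posteriors).shape)
```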
Adversary 3: Model and Data Independence
The third adversary requires no shadow model training at all. The attack relies solely on the posterior probabilities output by the target model and uses statistical measures of those posteriors, such as the maximum posterior. This unsupervised attack performs strongly across datasets; on CIFAR-100, for example, it achieves an AUC well above random guessing, underscoring its effectiveness even without shadow models.
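A minimal sketch of this idea: score each queried point by the maximum posterior the target assigns to it, then measure how well that single statistic separates members from non-members, here with ROC AUC. The paper's threshold-selection procedure is omitted, and the toy Dirichlet-sampled posteriors are purely illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def max_posterior_scores(posteriors):
    """Membership score for the unsupervised attack: the target's maximum posterior.
    Training-set members tend to receive more confident (peaked) predictions."""
    return np.max(np.asarray(posteriors), axis=1)

# Toy posteriors: "members" get peaked predictions, "non-members" flatter ones (illustrative only).
rng = np.random.default_rng(0)
members = rng.dirichlet(np.full(10, 0.1), size=500)      # peaked distributions
non_members = rng.dirichlet(np.full(10, 1.0), size=500)  # flatter distributions
scores = np.concatenate([max_posterior_scores(members), max_posterior_scores(non_members)])
labels = np.concatenate([np.ones(500), np.zeros(500)])
print("AUC of the max-posterior statistic:", roc_auc_score(labels, scores))
```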
Defense Mechanisms
Dropout
The paper proposes dropout, a technique widely used to prevent overfitting in neural networks, as a defense against membership inference attacks. Empirical evaluations show dropout significantly reduces the performance of membership inference attacks. For instance, using a dropout ratio of 0.5 reduced the precision and recall of adversary 1 attacking a CNN trained on CIFAR-100 by over 30%.
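As a concrete illustration, the snippet below adds dropout to a small fully connected classifier in PyTorch. The architecture, layer sizes, and placement of the dropout layer are assumptions for the sake of a compact example; the paper applies dropout within its own CNN architectures rather than this toy network.

```python
import torch
import torch.nn as nn

class SmallClassifier(nn.Module):
    """Toy classifier with dropout as a membership inference defense (illustrative architecture)."""
    def __init__(self, in_dim=512, n_classes=100, dropout_ratio=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Dropout(p=dropout_ratio),   # randomly zeroes activations during training,
            nn.Linear(256, n_classes),     # reducing overfitting and the member/non-member gap
        )

    def forward(self, x):
        return self.net(x)

model = SmallClassifier()
model.train()                      # dropout active during training
logits = model(torch.randn(8, 512))
model.eval()                       # dropout disabled at inference time
posteriors = torch.softmax(model(torch.randn(8, 512)), dim=1)
print(posteriors.shape)            # torch.Size([8, 100])
```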
Model Stacking
For target models that are not neural networks, where dropout does not apply, the paper suggests model stacking: multiple ML models are arranged in a hierarchical structure, with first-layer models feeding a second-layer combiner, so that no single model is fit to the entire training set and overfitting is reduced. This technique likewise reduced attack performance notably while largely preserving the model's predictive accuracy. For instance, precision and recall for adversary 1 attacking a model trained on CIFAR-10 dropped by more than 30%.
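A minimal sketch of the idea, assuming scikit-learn's StackingClassifier: two heterogeneous first-layer models feed their posteriors to a logistic-regression meta-learner. Note that StackingClassifier derives the meta-learner's training data via internal cross-validation rather than the disjoint data splits described in the paper, and the specific base and meta models here are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data standing in for the target task (assumption).
X, y = make_classification(n_samples=3000, n_features=20, n_classes=3,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# First layer: two different model families; second layer: logistic regression on their posteriors.
stacked = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=0)),
        ("mlp", MLPClassifier(max_iter=500, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",   # the meta-learner sees posteriors, not hard labels
)
stacked.fit(X_train, y_train)
print("test accuracy:", stacked.score(X_test, y_test))
```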
Theoretical Implications and Future Directions
The research highlights the vulnerability of overfitted ML models to membership inference attacks and provides empirically validated methods to mitigate this risk. Future work could further optimize these defense mechanisms and investigate whether they also help against related threats such as model extraction and adversarial examples. The broad applicability of the relaxed adversary models also suggests a need for continuous evaluation of emerging ML models and ongoing adaptation of defense mechanisms.
By fundamentally challenging and expanding the assumptions underpinning membership inference attacks, this paper provides critical insights into the robustness of machine learning models and offers practical solutions to enhance their privacy and security.