Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks
This paper investigates the capability of masked language models (MLMs) to memorize and inadvertently reveal sensitive information from their training data. The focus is on the susceptibility of these models to membership inference attacks (MIAs), which aim to determine whether a specific data sample was part of the model's training set.
Core Contributions
The authors observe that prior attempts to quantify privacy risks in MLMs have been inconclusive. Previous studies relied primarily on the model's loss as the membership signal, which likely underestimates the vulnerability of these models. This work introduces a more powerful approach based on likelihood ratio hypothesis testing, in which an additional reference model calibrates the attack score to account for the intrinsic complexity of each data sample.
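As a minimal formalization (the notation here is assumed rather than copied from the paper), the test scores each sample x by comparing its likelihood under the target model's parameters θ with its likelihood under a reference model θ_ref, and flags it as a training member when the log-ratio exceeds a threshold t:

$$
\Lambda(x) = \log \frac{p(x;\theta)}{p(x;\theta_{\mathrm{ref}})}, \qquad \text{predict ``member'' if } \Lambda(x) > t.
$$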
Methodology
The proposed method involves the following:
- Likelihood Ratio Test: The attack compares the likelihood of the target sample under the MLM with its likelihood under a reference model trained on the same general distribution but without the sample. Taking the ratio cancels out variation caused by the intrinsic complexity of the sample.
- Energy-Based Models: Because MLMs do not define an explicit probability distribution over sequences, they are treated as energy-based models, under which the required likelihood scores can be computed. This formulation makes the likelihood ratio attack practical (a code sketch follows this list).
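To make the two points above concrete, here is a minimal sketch (not the paper's released code) of a likelihood-ratio membership score for a masked LM, using the token-wise pseudo-log-likelihood as a tractable stand-in for the energy-based likelihood. The model identifiers are placeholders, and the one-token-at-a-time masking loop is the simplest possible variant:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer


def pseudo_log_likelihood(model, tokenizer, text):
    """Sum of per-token log-probabilities, masking one token at a time."""
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    input_ids = enc["input_ids"][0]
    total = 0.0
    # Skip the special [CLS] and [SEP] tokens at the ends.
    for i in range(1, input_ids.size(0) - 1):
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[input_ids[i]].item()
    return total


# Placeholder identifiers: substitute the actual target and reference checkpoints.
target_name = "path/to/target-clinicalbert"
reference_name = "path/to/reference-pubmedbert"
target = AutoModelForMaskedLM.from_pretrained(target_name).eval()
target_tok = AutoTokenizer.from_pretrained(target_name)
reference = AutoModelForMaskedLM.from_pretrained(reference_name).eval()
reference_tok = AutoTokenizer.from_pretrained(reference_name)


def membership_score(text):
    # Likelihood ratio in log space: large values mean the sample is
    # unusually likely under the target relative to the reference,
    # which is evidence of training-set membership.
    return (pseudo_log_likelihood(target, target_tok, text)
            - pseudo_log_likelihood(reference, reference_tok, text))
```

A sample is then predicted to be a member when `membership_score(text)` exceeds a calibrated threshold.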
The experimental setup targets ClinicalBERT models trained on sensitive medical data (MIMIC-III), with additional evaluations on medical records from i2b2. The reference model is a domain-specific BERT variant trained on PubMed data, chosen to reflect the general training distribution without overlapping the target model's training set.
Results
The paper reports a substantial improvement in attack performance: on ClinicalBERT models trained on medical data, the area under the ROC curve (AUC) rises from 0.66 with the loss-based baseline to 0.90 with the proposed likelihood ratio test. In the low false positive regime, the likelihood ratio attack achieves a true positive rate up to 51 times higher than prior baselines. Contrary to earlier conclusions, these results indicate that MLMs are highly susceptible to MIAs when faced with a sufficiently strong attack.
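For context on how such numbers are typically obtained (a sketch under assumed inputs, not the paper's evaluation code), attack scores for known members and non-members can be summarized by the AUC and by the true positive rate achievable within a small false-positive-rate budget; the 1% budget below is an assumed setting:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve


def evaluate_attack(member_scores, nonmember_scores, fpr_budget=0.01):
    """Return attack AUC and the TPR achievable at a given FPR budget."""
    labels = np.concatenate([np.ones(len(member_scores)),
                             np.zeros(len(nonmember_scores))])
    scores = np.concatenate([np.asarray(member_scores),
                             np.asarray(nonmember_scores)])
    auc = roc_auc_score(labels, scores)
    fpr, tpr, _ = roc_curve(labels, scores)
    within_budget = fpr <= fpr_budget
    tpr_at_budget = float(tpr[within_budget].max()) if within_budget.any() else 0.0
    return auc, tpr_at_budget
```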
Implications and Future Directions
Highlighting the inherent privacy risks of MLMs, the paper underscores the need for robust privacy-preserving training mechanisms. Given the evidence that MLMs are susceptible to membership inference, future research should focus on integrating differential privacy techniques and robust anonymization strategies into the training pipeline. There is also a need for comprehensive auditing frameworks that continuously assess and mitigate privacy risks as language models grow in size and application.
By improving our understanding of the privacy vulnerabilities inherent in language models, this research lays the groundwork for safer and more reliable NLP systems, particularly in domains that handle sensitive personal information such as healthcare and finance.