Document-Level Membership Inference
- Document-level membership inference is the process of identifying if a complete document was used in training, leveraging observable model responses.
- Statistical dependencies among grouped documents can amplify attack success, undermining traditional differential privacy guarantees.
- Methodologies such as threshold attacks, shadow models, and aggregated feature approaches highlight risks in sensitive, correlated data environments.
Document-level membership inference is the task of determining, given black-box (or, in some cases, white-box) access to a machine learning model, whether a particular document was included in the model's training set. Accurate document-level inference is central to scientific, regulatory, and legal discussions around privacy, copyright, and proprietary data use, especially in the era of large-scale language models and generative systems. Document-level membership inference presents unique risks in domains where data exhibits strong dependencies and where model training is applied to sensitive or legally protected corpora; the nature and degree of privacy leakage are strongly affected by the distributional structure of the data and by the implementation details of both the attack and possible defenses.
1. Problem Formulation and Significance
Document-level membership inference generalizes the classical membership inference attack (MIA) paradigm, which focuses on identifying the membership of single records (e.g., an image or a table row), to aggregate, higher-structure data: entire documents, collections of documents, or all records pertaining to an entity. The attacker's goal is to reliably decide whether a document was part of the model's training set based only on observable responses or statistics of the trained model. In sensitive contexts (medical case files, legal briefs, proprietary reports), accurate membership inference can unmask participation of individuals, presence of confidential material, or improper use of copyrighted works (Humphries et al., 2020; Meeus et al., 2023; Puerto et al., 2024). Practically, document-level inference presents a much greater privacy risk than single-record MIAs: inclusion or exclusion of a full document can reveal user-, institutional-, or author-level secrets that would not be apparent from isolated samples.
An archetypal document-level MIA scenario proceeds as follows:
- Given black-box access to a trained model $M$, the adversary aims to decide whether a document $d$ (a complex record) was in $M$'s training set by exploiting any systematic differences in model behavior on $d$ versus similar but unseen documents.
- Attacker capabilities may include only observable outputs (e.g., class prediction or per-token loss), partial output statistics, or, in more privileged settings, model representations.
This problem setting is especially acute in natural language processing, recommender systems, visual document QA, and federated learning applications, where correlated samples, group-level dependencies, and context-level aggregation are prevalent.
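To make the evaluation of such an attack concrete, the following sketch (all names are illustrative; `score_fn` stands in for any real-valued membership statistic, such as a negative document loss) measures an attack's advantage as the gap between its true-positive and false-positive rates on documents with known membership status.

```python
import numpy as np

def evaluate_attack(score_fn, member_docs, nonmember_docs, threshold):
    """Score each document and measure the attack's advantage = TPR - FPR.

    score_fn maps a document to a real-valued membership score
    (higher = more likely to have been in the training set).
    """
    member_scores = np.array([score_fn(d) for d in member_docs])
    nonmember_scores = np.array([score_fn(d) for d in nonmember_docs])
    tpr = float(np.mean(member_scores >= threshold))     # members correctly flagged
    fpr = float(np.mean(nonmember_scores >= threshold))  # non-members wrongly flagged
    return {"TPR": tpr, "FPR": fpr, "advantage": tpr - fpr}

# Toy usage with synthetic scores standing in for a real model's statistics.
rng = np.random.default_rng(0)
members = rng.normal(1.0, 1.0, size=500)     # members tend to score higher
nonmembers = rng.normal(0.0, 1.0, size=500)
print(evaluate_attack(lambda s: s, members, nonmembers, threshold=0.5))
```

The advantage metric used here is the same TPR-minus-FPR quantity referenced in the empirical results below.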
2. Data Dependencies and Breakdowns of Privacy Assumptions
A central finding in document-level membership inference is that the classical assumption of independent and identically distributed (IID) samples—a core premise for most privacy analyses and for differential privacy (DP)—frequently fails at the document level. In realistic datasets, samples share dependencies: they may originate from the same institution, participant, author, or collection, or be selected according to grouping or stratification that introduces sub-population structure.
The impact of these dependencies is substantial:
- Amplification of Attack Success: When documents within a group or cluster are statistically dependent (such as all medical records from one hospital), inclusion of one member in the training set often entails inclusion of others, and their aggregate properties are amplified in the model behavior.
- Underestimation of Privacy Leakage: Theoretical DP guarantees predicated on IID data (e.g., a mechanism is $\varepsilon$-DP if its output distribution changes by at most a multiplicative factor of $e^{\varepsilon}$ when a single record is added or removed) do not capture group-level leakage. As shown in (Humphries et al., 2020), in non-IID regimes the effective privacy bound may degrade from $\varepsilon$ to $k\varepsilon$, where $k$ is the size of the correlated group, making the guarantee vacuous for large $k$.
Empirical evaluations in (Humphries et al., 2020) on real-world datasets with mixture dependencies, clustering, and attribute-based splits demonstrate that even off-the-shelf MIA strategies (e.g., threshold attack and shadow model attack) achieve near-perfect attack accuracy in the presence of dependencies — a scenario that is not captured by standard privacy guarantees.
| Scenario | Model Generalization | MIA Advantage (IID) | MIA Advantage (non-IID) |
|---|---|---|---|
| Clustering-based split | Good | Low | High |
| Attribute-biased split | Good | Low | High |
| Hospital/region split | Good | Low | High |
The conclusion is that document-level MIAs are significantly more effective in non-IID, dependency-rich data, and privacy analyses assuming independence severely underestimate risk.
3. Methodological Approaches in Document-Level Membership Inference
Document-level MIAs implement both classical and novel enhancements of the standard MIA methodology, typically following one of the strategies below.
a) Threshold/Per-Document Loss Attack:
The adversary computes a sample statistic on the document—typically average loss, prediction confidence, or a custom document likelihood metric—and compares it to a threshold. Documents with unusually low loss are deemed likely to be members, capitalizing on overfitting or increased confidence on the training set.
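A minimal sketch of a per-document loss attack against a causal language model, written with the Hugging Face Transformers API (the model name and threshold below are placeholders; in practice the threshold is calibrated on documents known to be outside the training set):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def document_loss(model, tokenizer, text, max_length=512):
    """Average per-token cross-entropy of the document under the model."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

def threshold_attack(model, tokenizer, document, threshold):
    """Flag the document as a training member if its loss is unusually low."""
    return document_loss(model, tokenizer, document) < threshold

# Illustrative usage (model name and threshold are placeholders):
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
# is_member = threshold_attack(model, tokenizer, "candidate document text ...", threshold=3.0)
```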
b) Shadow Model Attack:
An adversary trains one or more “shadow models” on auxiliary data, mimicking the architecture and training regime of the target model, but explicitly controlling the membership status of a candidate document (included vs. excluded). Outputs (loss, predicted probabilities) on the candidate document are then used to train a meta-classifier to decide membership for the target model.
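The sketch below illustrates this recipe with scikit-learn classifiers standing in for the target and shadow models; `aux_X`/`aux_y` denote auxiliary data assumed to resemble the target's training distribution, and all helper names are illustrative rather than a reference implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def attack_features(model, X):
    """Attack features for each record: the model's predicted class probabilities."""
    return model.predict_proba(X)

def build_attack_dataset(aux_X, aux_y, n_shadows=5, seed=0):
    """Train shadow models on random halves of auxiliary data, and label every
    record's output features by whether it was IN or OUT of that shadow's
    training set. Assumes every class appears in every random half."""
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    n = len(aux_X)
    for _ in range(n_shadows):
        idx = rng.permutation(n)
        in_idx, out_idx = idx[: n // 2], idx[n // 2:]
        shadow = RandomForestClassifier(n_estimators=50, random_state=seed)
        shadow.fit(aux_X[in_idx], aux_y[in_idx])
        feats.append(attack_features(shadow, aux_X[in_idx]))
        labels.append(np.ones(len(in_idx)))    # members of this shadow's training set
        feats.append(attack_features(shadow, aux_X[out_idx]))
        labels.append(np.zeros(len(out_idx)))  # non-members
    return np.vstack(feats), np.concatenate(labels)

# Meta-classifier: learns to separate member-like from non-member-like outputs.
# attack_X, attack_y = build_attack_dataset(aux_X, aux_y)
# meta = LogisticRegression(max_iter=1000).fit(attack_X, attack_y)
# For a candidate document's records, membership probability under the target model:
# p_member = meta.predict_proba(attack_features(target_model, candidate_X))[:, 1]
```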
c) Aggregated Feature Approaches:
Document-level attacks often require aggregation of signals over many local units. For example, segmenting a long document into sequences, obtaining per-sequence membership scores (e.g., cross-entropy, token probability), and feeding these into a meta-classifier or computing histograms of feature values (“AggFE” and “HistFE” features in (Meeus et al., 2023)).
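A schematic of this aggregation pipeline is sketched below; the AggFE/HistFE naming follows (Meeus et al., 2023), but the exact statistics, bin ranges, and the `score_sequence` helper are illustrative assumptions rather than that paper's exact configuration.

```python
import numpy as np

def agg_features(seq_scores):
    """AggFE-style features: summary statistics over per-sequence scores."""
    s = np.asarray(seq_scores, dtype=float)
    return np.array([s.min(), s.max(), s.mean(), s.std(),
                     *np.percentile(s, [5, 25, 50, 75, 95])])

def hist_features(seq_scores, n_bins=20, lo=0.0, hi=10.0):
    """HistFE-style features: normalized histogram of per-sequence scores."""
    counts, _ = np.histogram(np.clip(seq_scores, lo, hi), bins=n_bins, range=(lo, hi))
    return counts / max(len(seq_scores), 1)

def document_features(document, score_sequence, chunk_size=512):
    """Split a document into fixed-size chunks, score each chunk with the target
    model (score_sequence returns a scalar such as mean cross-entropy), and
    aggregate the scores into a single document-level feature vector."""
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    scores = [score_sequence(c) for c in chunks]
    return np.concatenate([agg_features(scores), hist_features(scores)])

# The per-document feature vectors, labeled member/non-member on auxiliary data,
# are then used to train a binary meta-classifier that predicts membership.
```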
d) Mixture Model Membership Experiments:
Document-level MIAs frequently build “mixture” distributions of data (e.g., different hospitals, regions, or attribute groups) and contrast model behaviors on these distinct components to amplify differences due to group membership.
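The contrast between an IID membership split and a dependency-preserving, group-level split can be made explicit as follows (the hospital grouping is a hypothetical example):

```python
import numpy as np

def iid_split(n_records, member_frac=0.5, seed=0):
    """IID membership assignment: every record is in or out independently."""
    rng = np.random.default_rng(seed)
    return rng.random(n_records) < member_frac

def group_split(group_ids, member_frac=0.5, seed=0):
    """Dependent membership assignment: whole groups (e.g., all records from one
    hospital) are either entirely inside or entirely outside the training set."""
    rng = np.random.default_rng(seed)
    groups = np.unique(group_ids)
    n_member_groups = int(member_frac * len(groups))
    member_groups = set(rng.choice(groups, size=n_member_groups, replace=False))
    return np.array([g in member_groups for g in group_ids])

# Example: 1,000 records from 10 hypothetical hospitals.
group_ids = np.repeat(np.arange(10), 100)
iid_members = iid_split(len(group_ids))        # membership independent across records
grouped_members = group_split(group_ids)       # membership shared within each hospital
```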
The effectiveness of these methodologies is highly sensitive to the granularity of aggregation and the degree of statistical dependency present.
4. Differential Privacy: Protection Mechanisms and Limitations
Differential Privacy (DP) is a foundational paradigm for mitigating membership inference. However, its guarantee is grounded in the independence of samples: a mechanism $\mathcal{M}$ is $\varepsilon$-DP if, for all neighboring datasets $D, D'$ differing in a single record and all output sets $S$,
$$\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S].$$
When dependencies exist among documents (for example, documents that share authorship or institution, or that are derivatives of a common source), removal or modification of one member affects the others through their correlated features, which compromises the effectiveness of DP. In the extreme cases detailed by (Humphries et al., 2020), the likelihood ratio between the output distributions when a group of $k$ dependent samples is modified can reach $e^{k\varepsilon}$, making DP guarantees void for even moderate $k$.
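A compact way to see where the $e^{k\varepsilon}$ factor comes from is the standard group-privacy chaining argument, sketched below; it applies the single-record guarantee along a path of neighboring datasets and is not specific to any one paper.
$$\Pr[\mathcal{M}(D_0) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D_1) \in S] \;\le\; e^{2\varepsilon}\,\Pr[\mathcal{M}(D_2) \in S] \;\le\; \cdots \;\le\; e^{k\varepsilon}\,\Pr[\mathcal{M}(D_k) \in S],$$
where $D_0, D_1, \ldots, D_k$ is a sequence of datasets in which consecutive datasets differ in a single record, and $D_0$ and $D_k$ differ in the whole group of $k$ dependent records.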
Empirical results confirm that, in non-IID, correlated document scenarios, models trained with standard DP (e.g., DP-SGD) are highly vulnerable to MIAs that exploit group dependencies, with attack accuracy nearing 1 even for low $\varepsilon$.
Consequently, DP-based defenses against document-level MIAs must account for correlation (“group privacy”) and employ more aggressive or group-aware noise addition, at a cost to model utility.
5. Empirical Results and Key Findings
Empirical assessments of document-level MIA performance conducted on real-world datasets (e.g., adult income, COMPAS, heart disease, hospital region splits, student school data) consistently demonstrate:
- Under IID settings (random membership and non-membership assignment), the attack advantage (the difference between TPR and FPR) is modest, and DP often provides bounded protection.
- Under dependent/clustered assignments, attack advantage rapidly increases, with accuracy approaching 1 even for sophisticated, well-generalized models.
- Attribute-based or source-based cluster assignments (e.g., all records from a single hospital) lead to models that unintentionally leak membership status for entire groups, especially when group-specific features are highly distinctive.
A representative derived bound on the adversary's advantage under IID and $\varepsilon$-DP assumptions is
$$\mathrm{Adv} \;=\; \mathrm{TPR} - \mathrm{FPR} \;\le\; e^{\varepsilon} - 1,$$
but this guarantee does not hold under group dependencies.
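For intuition, at $\varepsilon = 0.1$ this bound caps the advantage at $e^{0.1} - 1 \approx 0.105$, whereas at $\varepsilon = 1$ it already permits $e^{1} - 1 \approx 1.72$, more than the maximum attainable advantage of 1; under group dependencies, even the small-$\varepsilon$ cap is exceeded empirically.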
Figure 1 in (Humphries et al., 2020) demonstrates the discrepancy between practical attack performance and theoretical DP bounds in non-IID experiments.
6. Implications, Open Challenges, and Pathways to Defense
Document-level membership inference, especially in the presence of data dependencies, reveals inherent limitations in prevailing privacy-preserving strategies and carries substantial practical implications:
- Auditing and Compliance: Auditors and regulators can leverage document-level MIAs to assess unauthorized data inclusion, with higher detection power in group or attribute-dependent corpora.
- Group Privacy Risk: Document-centric data sources (such as institution-level corpora) elevate privacy risk not only for individual contributors but for whole collectives, undermining the “per record” protection model.
- Limits of Differential Privacy: Standard DP mechanisms that ignore dependencies are insufficient; robust protection requires explicit group sensitivity calibration, group-based noise addition, or data de-correlation.
- Defense Mechanisms: Proposals to defend against document-level MIAs include the development of group privacy–aware DP algorithms, decorrelation or diversification of training data (through source diversification or feature modification), and more realistic MIA risk assessments that account for non-IID structure.
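As a small illustration of the group-aware calibration mentioned above (a sketch under pure $\varepsilon$-DP only; $(\varepsilon,\delta)$-DP and DP-SGD accounting require more careful treatment), the per-record budget needed to protect whole correlated groups shrinks linearly in the group size:

```python
def per_record_epsilon(target_group_epsilon: float, group_size: int) -> float:
    """Group privacy under pure epsilon-DP: a mechanism that is eps-DP per record
    is (k * eps)-DP for groups of k correlated records, so guaranteeing a target
    budget at the group level requires dividing the per-record budget by k."""
    return target_group_epsilon / group_size

# Example: to give hospitals contributing up to 50 correlated records each an
# effective group-level budget of 1.0, the model must be trained at 0.02 per record.
print(per_record_epsilon(1.0, 50))  # 0.02
```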
Prospective research directions involve:
- Formalizing group-level privacy guarantees and mechanisms capable of scaling noise with correlation structure.
- Developing auditing tools that operate under non-IID, document-rich settings and provide reliable attack bounds for practitioners.
- Understanding the utility–privacy trade-off in applications where model performance is sensitive to document diversity and dependency-breaking interventions.
7. Conclusion
Document-level membership inference highlights the fragility of classical privacy guarantees in models trained on realistic, dependency-rich data. Findings from (Humphries et al., 2020) demonstrate that statistical dependencies (clustering, group effects, source biases) substantially elevate risk, allowing off-the-shelf MIAs to achieve an adversarial advantage that far exceeds both the theoretical bounds and the empirical attack performance observed in IID baselines. Differentially private algorithms lose effectiveness as a defense unless explicitly adapted for group correlations. Addressing this pervasive vulnerability requires novel strategies in both privacy accounting and model training, alongside refined auditing and data-handling procedures for document-centric, high-stakes machine learning applications.