Bits for Privacy: Evaluating Post-Training Quantization via Membership Inference

Published 17 Dec 2025 in cs.LG and cs.CR | (2512.15335v1)

Abstract: Deep neural networks are widely deployed with quantization techniques to reduce memory and computational costs by lowering the numerical precision of their parameters. While quantization alters model parameters and their outputs, existing privacy analyses primarily focus on full-precision models, leaving a gap in understanding how bit-width reduction can affect privacy leakage. We present the first systematic study of the privacy-utility relationship in post-training quantization (PTQ), a versatile family of methods that can be applied to pretrained models without further training. Using membership inference attacks as our evaluation framework, we analyze three popular PTQ algorithms-AdaRound, BRECQ, and OBC-across multiple precision levels (4-bit, 2-bit, and 1.58-bit) on CIFAR-10, CIFAR-100, and TinyImageNet datasets. Our findings consistently show that low-precision PTQs can reduce privacy leakage. In particular, lower-precision models demonstrate up to an order of magnitude reduction in membership inference vulnerability compared to their full-precision counterparts, albeit at the cost of decreased utility. Additional ablation studies on the 1.58-bit quantization level show that quantizing only the last layer at higher precision enables fine-grained control over the privacy-utility trade-off. These results offer actionable insights for practitioners to balance efficiency, utility, and privacy protection in real-world deployments.

Abstract PDF Upgrade to Chat

Summary

The paper shows that aggressive quantization, especially below 2 bits, reduces membership inference vulnerability by up to an order of magnitude compared to full precision.
It systematically evaluates leading PTQ methods (AdaRound, BRECQ, and OBC) on image datasets using state-of-the-art LiRA metrics to measure privacy leakage.
The study introduces decoupled quantization—quantizing only the last layer at higher precision—to recover accuracy while still mitigating privacy risks.

Evaluating the Privacy-Utility Trade-off of Post-Training Quantization via Membership Inference

Overview and Motivation

This paper provides a comprehensive, systematic investigation of the privacy implications of post-training quantization (PTQ) in deep neural networks, specifically analyzing how quantization at varying bit-widths influences the vulnerability of models to membership inference attacks (MIAs). The study targets leading PTQ algorithms—AdaRound, BRECQ, and OBC—evaluated across a spectrum of precisions (4-bit, 2-bit, and 1.58-bit) on canonical image classification datasets (CIFAR-10, CIFAR-100, and TinyImageNet). Privacy leakage is assessed with a rigorous attack framework, including both online and offline variants of the state-of-the-art Likelihood Ratio Attack (LiRA).

The central claim is that aggressive quantization, notably at sub-2-bit levels, offers up to an order-of-magnitude reduction in MIA vulnerability compared to full-precision networks, albeit at the expense of model accuracy. The work further investigates precision decoupling in PTQ to recover some of the lost utility while controlling privacy–utility trade-offs.

Post-Training Quantization and Membership Inference

Post-training quantization is critical for resource-constrained inference, enabling direct quantization of pretrained models by reducing parameter precision, independent of retraining. The paper investigates three representative PTQ methods:

AdaRound: Learns adaptive rounding decisions via layer-wise optimization to minimize output reconstruction error.
BRECQ: Incorporates block-wise reconstruction and uses AdaRound as a quantization primitive to mitigate inter-layer quantization artifacts.
OBC: Extends pruning-based methods to quantization by greedily minimizing quantization error through sequential parameter updates.

Membership inference attacks, and specifically LiRA, are leveraged to quantify privacy risks. LiRA employs shadow models to approximate the distribution of losses for member/non-member samples, providing a robust TPR@FPR metric for privacy evaluation.

Empirical Evaluation

Privacy and Utility Degradation with Bit-Width Reduction

Results consistently indicate that decreasing bit-width in PTQ yields substantial privacy gains, as quantized models resist LiRA attacks more effectively. This phenomenon is especially pronounced below 2 bits-per-weight. At 1.58 bits—a ternary quantization scheme inspired by recent advances in LLM quantization—[email protected]% FPR is suppressed by two orders of magnitude vis-à-vis full-precision baselines, although the concomitant utility loss is marked, especially on complex datasets such as TinyImageNet.

Figure 1: Privacy leakage of quantized models trained on CIFAR-100 under LiRA online attack; lower precision improves privacy protection, evaluated with [email protected]% FPR.

The privacy–utility trade-off is quantifiably linked to precision:

Figure 2: Test accuracy of models quantized with different PTQs at various bit-widths; accuracy significantly drops at 1.58-bit quantization.

For 4-bit quantization, most of the original accuracy is retained with negligible privacy benefit; only at 2-bit and lower do privacy improvements manifest sharply.

Figure 3: Privacy leakage of models quantized with different PTQs at various precision levels under LiRA online/offline attack; 1.58-bit consistently mitigates MIA, especially in more complex datasets.

Decoupled Quantization for Privacy-Utility Control

The paper proposes “decoupled quantization,” wherein only the last layer is quantized at higher precision (e.g., 8 bits) to recoup accuracy lost at 1.58-bit quantization, enabling fine-grained control over the privacy–utility axis. On TinyImageNet, this strategy recovers much of the original performance while maintaining substantial MIA resistance.

Figure 4: Decoupling the last layer to higher precision allows recovery of utility lost at 1.58 bits, at the cost of some privacy leakage increase.

Model-Agnostic Trends

The utility–privacy trend persists for architectures of varying depth and family, as demonstrated by applying OBC-based quantization to both ResNet18 and DenseNet121. Lower quantization tightens the privacy guarantee regardless of architectural complexity, but sensitivity to quantization can differ.

Figure 5: Utility and privacy trade-off is consistent across architectures quantized with OBC.

Variance and Robustness in Low-Bit PTQ

The analysis of 1.58-bit quantized models reveals that privacy leakage, as measured by [email protected]% FPR, can exhibit substantial variance, particularly for BRECQ. Some instantiations yield both higher accuracy and increased privacy leakage, underscoring the importance of repeated experimental runs for accurate assessment.

Figure 6: Distribution of utility and privacy for 1.58-bit quantized models; BRECQ achieves the best average but with considerable variance.

Implications and Future Directions

Practical Deployment

The findings are immediately relevant for edge AI deployment and regulated domains where privacy is paramount. System designers can employ extreme low-bit PTQ as a tunable knob, subject to utility constraints, without introducing the computational overhead and hyperparameter burden inherent to privacy-specific training approaches.

Theoretical and Methodological Extensions

This empirical demonstration motivates theoretical analysis of the mechanisms driving privacy enhancement through aggressive quantization—potentially via an information-theoretic or generalization perspective. Furthermore, combining quantization with established privacy defenses (e.g., DP-SGD or adversarial regularization) remains an open avenue for extending privacy protection.

The demonstrated generality across datasets and architectures suggests applicability beyond vision, potentially impacting privacy-aware deployment of quantized models in NLP or federated learning. Integrating quantization for activations and exploring joint training for quantization-aware privacy protection could further improve state-of-the-art techniques.

Conclusion

This work rigorously establishes that post-training quantization at sufficiently low bit-widths can offer substantial privacy leakage mitigation, as demonstrated by reduced success rates of state-of-the-art membership inference attacks, at the cost of reduced accuracy. Decoupled quantization strategies offer a promising vector to manage this trade-off for practical deployment scenarios. The results provide actionable insights for balancing efficiency, utility, and privacy in post-training quantized models, and lay the foundation for future research in quantization-driven privacy control.

Reference: "Bits for Privacy: Evaluating Post-Training Quantization via Membership Inference" (2512.15335).

Markdown

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Paper Prompts

Top Community Prompts

Explain it Like I'm 14

off on

Knowledge Gaps

off on

Practical Applications

off on

Glossary

off on

Conceptual Simplification

off on

Open Problems

We found no open problems mentioned in this paper.

Continue Learning

Authors (6)

Collections

YouTube

Show All Videos

Bits for Privacy: Evaluating Post-Training Quantization via Membership Inference

Summary

Evaluating the Privacy-Utility Trade-off of Post-Training Quantization via Membership Inference

Overview and Motivation

Post-Training Quantization and Membership Inference

Empirical Evaluation

Privacy and Utility Degradation with Bit-Width Reduction

Decoupled Quantization for Privacy-Utility Control

Model-Agnostic Trends

Variance and Robustness in Low-Bit PTQ

Implications and Future Directions

Practical Deployment

Theoretical and Methodological Extensions

Conclusion

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Authors (6)

Collections

YouTube

Don't miss out on important new AI/ML research

Sign up for free to explore the frontiers of research

Bits for Privacy: Evaluating Post-Training Quantization via Membership Inference

Summary

Evaluating the Privacy-Utility Trade-off of Post-Training Quantization via Membership Inference

Overview and Motivation

Post-Training Quantization and Membership Inference

Empirical Evaluation

Privacy and Utility Degradation with Bit-Width Reduction

Decoupled Quantization for Privacy-Utility Control

Model-Agnostic Trends

Variance and Robustness in Low-Bit PTQ

Implications and Future Directions

Practical Deployment

Theoretical and Methodological Extensions

Conclusion

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Related Papers

Authors (6)

Collections

YouTube

Don't miss out on important new AI/ML research

Sign up for free to explore the frontiers of research