When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks (2311.03865v3)
Abstract: Previous studies have developed fairness methods for biased models that exhibit discriminatory behavior towards specific subgroups. While these methods show promise in achieving fair predictions, recent research has identified their potential vulnerability to score-based membership inference attacks (MIAs), in which adversaries infer whether a particular data sample was used during training by analyzing the model's prediction scores. However, our investigation reveals that these score-based MIAs are ineffective when targeting fairness-enhanced models in binary classification: the attack models trained to launch the MIAs degrade into simplistic threshold models, resulting in lower attack performance. Meanwhile, we observe that fairness methods often degrade prediction performance on the majority subgroups of the training data. This raises the barrier to successful score-based attacks while widening the prediction gap between member and non-member data. Building on these insights, we propose an efficient MIA against fairness-enhanced models based on fairness discrepancy results (FD-MIA). It leverages the difference between the predictions of the original and the fairness-enhanced model and exploits the observed prediction gaps as attack clues. We also explore potential strategies for mitigating this privacy leakage. Extensive experiments validate our findings and demonstrate the efficacy of the proposed method.
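The abstract describes FD-MIA only at a high level, so the following is a minimal illustrative sketch of the core idea: the attacker queries both the original (biased) model and its fairness-enhanced counterpart and uses the discrepancy between their prediction scores as the attack feature. All names here (`f_orig`, `f_fair`, `attack_net`, the shadow-data setup) are hypothetical, and the paper's actual attack architecture and training protocol may differ.

```python
# Sketch of the fairness-discrepancy MIA idea: build attack features from
# the gap between the original model's and the fair model's scores.
import torch
import torch.nn as nn

def fd_features(f_orig: nn.Module, f_fair: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Attack features from the two models' prediction scores and their gap."""
    with torch.no_grad():
        p_orig = torch.softmax(f_orig(x), dim=1)  # scores of the original (biased) model
        p_fair = torch.softmax(f_fair(x), dim=1)  # scores of the fairness-enhanced model
    # Concatenate both score vectors and their difference; per the abstract,
    # the gap tends to be larger for member (training) samples.
    return torch.cat([p_orig, p_fair, p_orig - p_fair], dim=1)

# A simple binary attack model over the discrepancy features
# (binary classification -> two scores per model -> 6 features).
attack_net = nn.Sequential(
    nn.Linear(6, 32), nn.ReLU(),
    nn.Linear(32, 1),  # logit for "member" vs. "non-member"
)

def train_attack(shadow_loader, f_orig, f_fair, epochs=10):
    """shadow_loader yields (x, is_member) pairs built from shadow models
    trained on auxiliary data, as in standard shadow-model MIA setups."""
    opt = torch.optim.Adam(attack_net.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for x, is_member in shadow_loader:
            feats = fd_features(f_orig, f_fair, x)
            loss = loss_fn(attack_net(feats).squeeze(1), is_member.float())
            opt.zero_grad()
            loss.backward()
            opt.step()
```

Compared with a score-based MIA that sees only one model's outputs, the paired queries give the attacker a second signal, the fairness-induced shift in predictions, which is precisely the gap the abstract identifies as an attack clue.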
Authors: Huan Tian, Guangsheng Zhang, Bo Liu, Tianqing Zhu, Ming Ding, Wanlei Zhou