Privacy for Fairness: Information Obfuscation for Fair Representation Learning with Local Differential Privacy (2402.10473v1)
Abstract: As machine learning (ML) becomes more prevalent in human-centric applications, algorithmic fairness and privacy protection have received growing attention. While earlier research treated these as separate objectives, it is now recognized that they interact in complex ways. Prior work, however, has studied this interplay mostly through empirical investigation, with limited theoretical exploration. This study bridges that gap by introducing a theoretical framework that enables a comprehensive examination of the relationship between privacy and fairness. We develop and analyze an information bottleneck (IB) based information obfuscation method with local differential privacy (LDP) for fair representation learning. In contrast to many empirical studies of fairness in ML, we show that incorporating LDP randomizers during encoding can enhance the fairness of the learned representation. Our analysis demonstrates that the disclosure of sensitive information is constrained by the privacy budget of the LDP randomizer, which allows the optimization process within the IB framework to suppress sensitive information while preserving the desired utility through obfuscation. Building on this method, we further develop a variational representation encoding approach that simultaneously achieves fairness and LDP. The variational approach offers practical advantages: it is trained non-adversarially and requires no variational prior. Extensive experiments validate our theoretical results and demonstrate that the proposed approach achieves both LDP and fairness while preserving adequate utility.
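The abstract's central claim, that the LDP privacy budget caps how much sensitive information the released representation can disclose, can be made concrete with a standard information-theoretic bound for $\varepsilon$-LDP mechanisms (stated here in nats; this is a well-known general bound, not necessarily the exact form derived in the paper):

$$
I(S;\tilde{Z}) \;\le\; I(X;\tilde{Z}) \;=\; \mathbb{E}_{X}\big[D_{\mathrm{KL}}\big(P_{\tilde{Z}\mid X}\,\big\|\,P_{\tilde{Z}}\big)\big] \;\le\; \varepsilon,
$$

where $S$ is the sensitive attribute, $X$ the input, and $\tilde{Z}$ the output of the $\varepsilon$-LDP randomizer. The first inequality is the data-processing inequality over the Markov chain $S \to X \to \tilde{Z}$; the second follows because $\varepsilon$-LDP guarantees $P_{\tilde{Z}\mid X=x}(z) \le e^{\varepsilon}\, P_{\tilde{Z}\mid X=x'}(z)$ for all $x, x'$, and since $P_{\tilde{Z}}$ is a mixture of the conditionals, $P_{\tilde{Z}\mid X=x}(z)/P_{\tilde{Z}}(z) \le e^{\varepsilon}$.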
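To make the pipeline concrete, the following is a minimal PyTorch-style sketch of the general recipe the abstract describes: a deterministic encoder whose output is L1-clipped and perturbed by a Laplace randomizer, which is $\varepsilon$-LDP under the clipping assumption. This is an illustrative sketch, not the paper's variational method; all class and parameter names (`LDPEncoder`, `clip_c`, `eps`) are ours.

```python
import torch
import torch.nn as nn


class LDPEncoder(nn.Module):
    """Deterministic encoder followed by a Laplace randomizer.

    Rescaling z so that ||z||_1 <= C bounds the L1 distance between the
    representations of any two inputs by 2C (the diameter of the clipped
    set), so adding per-coordinate Laplace noise with scale 2C / eps
    makes the released representation eps-LDP under this assumption.
    """

    def __init__(self, in_dim: int, z_dim: int, clip_c: float, eps: float):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, z_dim)
        )
        self.clip_c = clip_c
        self.noise_scale = 2.0 * clip_c / eps  # L1 sensitivity / eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.net(x)
        # Rescale each row into the L1 ball of radius clip_c.
        l1 = z.abs().sum(dim=-1, keepdim=True).clamp(min=1e-12)
        z = z * torch.clamp(self.clip_c / l1, max=1.0)
        # Laplace randomizer: the released z_tilde is eps-LDP in x.
        noise = torch.distributions.Laplace(
            torch.zeros_like(z), self.noise_scale
        ).sample()
        return z + noise


# Illustrative usage: encode a batch of records with a tight budget.
enc = LDPEncoder(in_dim=32, z_dim=8, clip_c=1.0, eps=2.0)
z_tilde = enc(torch.randn(4, 32))  # eps-LDP representations, shape (4, 8)
```

A downstream task head would consume `z_tilde` as-is: by the post-processing property of differential privacy, no training step on `z_tilde` can weaken the $\varepsilon$-LDP guarantee, and by the bound above the sensitive-information leakage $I(S; \tilde{Z})$ that the IB objective must suppress is already capped by $\varepsilon$.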
Authors: Songjie Xie, Youlong Wu, Jiaxuan Li, Ming Ding, Khaled B. Letaief