
f-FERM: A Scalable Framework for Robust Fair Empirical Risk Minimization (2312.03259v2)

Published 6 Dec 2023 in cs.LG

Abstract: Training and deploying machine learning models that meet fairness criteria for protected groups are fundamental in modern artificial intelligence. While numerous constraints and regularization terms have been proposed in the literature to promote fairness in machine learning tasks, most of these methods are not amenable to stochastic optimization due to the complex and nonlinear structure of the constraints and regularizers. Here, the term "stochastic" refers to the ability of the algorithm to work with small mini-batches of data. Motivated by this limitation of the existing literature, this paper presents a unified stochastic optimization framework for fair empirical risk minimization based on f-divergence measures (f-FERM). The proposed stochastic algorithm enjoys theoretical convergence guarantees. In addition, our experiments demonstrate the superiority of the fairness-accuracy tradeoffs offered by f-FERM for almost all batch sizes (ranging from full batch down to a batch size of one). Moreover, we show that our framework can be extended to the case where there is a distribution shift from the training to the test data. Our extension is based on a distributionally robust optimization reformulation of the f-FERM objective under $L_p$ norms as uncertainty sets. Again, in this distributionally robust setting, f-FERM not only enjoys theoretical convergence guarantees but also outperforms other baselines in the literature on tasks involving distribution shifts. An efficient stochastic implementation of f-FERM is publicly available.
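To make the idea of an f-divergence fairness regularizer concrete, the sketch below builds a demographic-parity-style penalty: the KL divergence (one member of the f-divergence family) between each group's positive-prediction rate and the overall positive-prediction rate, added to the usual cross-entropy risk. This is an illustrative simplification, not the paper's exact variational min-max formulation; all function names and the choice of KL here are the author of this sketch's assumptions.

```python
import numpy as np

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p, q = np.clip(p, eps, 1 - eps), np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def fair_penalty(probs, groups):
    """Demographic-parity-style penalty (illustrative, not the paper's
    exact objective): sum over groups s of
        P(s) * KL( P(yhat=1 | s) || P(yhat=1) ),
    estimated from a mini-batch of predicted probabilities."""
    overall = probs.mean()
    pen = 0.0
    for g in np.unique(groups):
        mask = groups == g
        pen += mask.mean() * kl_bernoulli(probs[mask].mean(), overall)
    return pen

def fair_risk(probs, labels, groups, lam=1.0, eps=1e-12):
    """Regularized objective: cross-entropy risk + lam * fairness penalty."""
    p = np.clip(probs, eps, 1 - eps)
    ce = -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return ce + lam * fair_penalty(probs, groups)
```

When predictions are independent of the group attribute the penalty vanishes, and it grows as the group-conditional prediction rates diverge; the mini-batch estimate is what a stochastic optimizer like the one described in the abstract would consume at each step.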

Authors (3)
  1. Sina Baharlouei (8 papers)
  2. Shivam Patel (9 papers)
  3. Meisam Razaviyayn (76 papers)
Citations (3)
