
Predicting Fairness of ML Software Configurations (2404.19100v2)

Published 29 Apr 2024 in cs.SE, cs.AI, cs.CY, and cs.LG

Abstract: This paper investigates the relationships between the hyperparameters of machine learning algorithms and fairness. Data-driven solutions are increasingly used in critical socio-technical applications where ensuring fairness is important. Rather than explicitly encoding decision logic via control and data structures, ML developers provide input data, perform some pre-processing, choose ML algorithms, and tune hyperparameters (HPs) to infer a program that encodes the decision logic. Prior work reports that the selection of HPs can significantly influence fairness. However, tuning HPs to find an ideal trade-off between accuracy, precision, and fairness has remained an expensive and tedious task. Can we predict the fairness of an HP configuration for a given dataset? Are such predictions robust to distribution shifts? We focus on group fairness notions and investigate the HP space of 5 training algorithms. We first find that tree regressors and XGBoost significantly outperformed deep neural networks and support vector machines in accurately predicting the fairness of HPs. When predicting the fairness of ML hyperparameters under temporal distribution shift, the tree regressors outperformed the other algorithms with reasonable accuracy. However, the precision depends on the ML training algorithm, dataset, and protected attributes. For example, the tree regressor model was robust to a training-data shift from 2014 to 2018 on logistic regression and discriminant analysis HPs with sex as the protected attribute, but not for race or other training algorithms. Our method provides a sound framework to efficiently fine-tune ML training algorithms and understand the relationships between HPs and fairness.
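The core idea can be illustrated with a small sketch: sample hyperparameter configurations of a training algorithm, measure a group fairness metric for each, and fit a tree regressor that predicts fairness directly from the configuration. This is a minimal illustration on synthetic data, not the paper's implementation or benchmarks; the dataset, the HP ranges, and the choice of statistical parity difference as the fairness metric are all assumptions made here for the sketch.

```python
# Hypothetical sketch: predicting the fairness of hyperparameter
# configurations with a tree regressor. Data and HP ranges are
# synthetic and illustrative, not the paper's setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Toy tabular data with a binary protected attribute derived from column 0.
X = rng.normal(size=(2000, 5))
protected = (X[:, 0] > 0).astype(int)
y = (X[:, 1] + 0.5 * protected + rng.normal(scale=0.5, size=2000) > 0).astype(int)

def statistical_parity_diff(model, X, protected):
    """|P(yhat=1 | a=0) - P(yhat=1 | a=1)|: a simple group fairness metric."""
    pred = model.predict(X)
    return abs(pred[protected == 0].mean() - pred[protected == 1].mean())

# Sample HP configurations (here: C and tol of logistic regression)
# and measure the fairness each induced model achieves.
configs, fairness = [], []
for _ in range(50):
    C = 10 ** rng.uniform(-3, 2)
    tol = 10 ** rng.uniform(-5, -1)
    clf = LogisticRegression(C=C, tol=tol, max_iter=200).fit(X, y)
    configs.append([np.log10(C), np.log10(tol)])
    fairness.append(statistical_parity_diff(clf, X, protected))

# Fit a tree regressor mapping HP configurations to fairness scores,
# so new configurations can be screened without retraining the model.
reg = DecisionTreeRegressor(max_depth=4).fit(configs, fairness)
print(reg.predict([[0.0, -3.0]]))  # predicted parity gap for C=1, tol=1e-3
```

In the paper's setting, the same pattern would be applied per training algorithm, dataset, and protected attribute; the robustness question then becomes whether a regressor fitted on one time period still predicts fairness well on data from a later one.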

Authors (5)
  1. Salvador Robles Herrera (1 paper)
  2. Verya Monjezi (4 papers)
  3. Vladik Kreinovich (9 papers)
  4. Ashutosh Trivedi (76 papers)
  5. Saeid Tizpaz-Niari (22 papers)
Citations (1)

