Fair Multivariate Adaptive Regression Splines for Ensuring Equity and Transparency (2402.15561v1)
Abstract: Predictive analytics is widely used in various domains, including education, to inform decision-making and improve outcomes. However, many predictive models are proprietary and inaccessible for evaluation or modification by researchers and practitioners, limiting their accountability and ethical design. Moreover, predictive models are often opaque and incomprehensible to the officials who use them, reducing their trust and utility. Furthermore, predictive models may introduce or exacerbate bias and inequity, as they have done in many sectors of society. There is therefore a need for transparent, interpretable, and fair predictive models that can be easily adopted and adapted by different stakeholders. In this paper, we propose a fair predictive model based on multivariate adaptive regression splines (MARS) that incorporates fairness measures into the learning process. MARS is a non-parametric regression model that performs feature selection, handles non-linear relationships, generates interpretable decision rules, and derives optimal splitting criteria on the variables. Specifically, we integrate fairness into the knot-optimization algorithm and provide theoretical and empirical evidence of how it results in fair knot placement. We apply our fairMARS model to real-world data and demonstrate its effectiveness in terms of accuracy and equity. Our paper contributes to the advancement of responsible and ethical predictive analytics for social good.
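The abstract describes integrating a fairness measure into MARS's knot-optimization step. As a rough illustration of that idea (not the paper's actual algorithm), the sketch below scores each candidate knot for a hinge basis function by its least-squares fit plus a simple fairness penalty; the penalty used here, a weighted gap in mean residuals between two protected groups, and the names `hinge` and `fair_knot_search` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hinge(x, knot):
    """MARS hinge basis function: max(0, x - knot)."""
    return np.maximum(0.0, x - knot)

def fair_knot_search(x, y, group, candidates, lam=1.0):
    """Pick the knot whose hinge fit minimizes SSE plus a fairness
    penalty. The penalty (lam * absolute gap in mean residuals
    between group 0 and group 1) is an illustrative stand-in for
    the paper's fairness measure, not its exact criterion."""
    best_knot, best_score = None, np.inf
    for t in candidates:
        # Least-squares fit of y on [1, hinge(x, t)]
        h = hinge(x, t)
        X = np.column_stack([np.ones_like(h), h])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sse = float(resid @ resid)
        gap = abs(resid[group == 0].mean() - resid[group == 1].mean())
        score = sse + lam * gap
        if score < best_score:
            best_knot, best_score = t, score
    return best_knot, best_score
```

With `lam = 0`, this reduces to the standard accuracy-driven knot search of classical MARS; increasing `lam` trades fit quality for more balanced residuals across groups.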
- Parian Haghighat
- Denisa Gándara
- Lulu Kang
- Hadis Anahideh