Fair MP-BOOST: Fair and Interpretable Minipatch Boosting (2404.01521v1)
Abstract: Ensemble methods, particularly boosting, have established themselves as highly effective and widely embraced machine learning techniques for tabular data. In this paper, we aim to leverage the robust predictive power of traditional boosting methods while enhancing fairness and interpretability. To achieve this, we develop Fair MP-Boost, a stochastic boosting scheme that balances fairness and accuracy by adaptively learning features and observations during training. Specifically, Fair MP-Boost sequentially samples small subsets of observations and features, termed minipatches (MPs), according to adaptively learned feature and observation sampling probabilities. We devise these probabilities by combining loss functions or feature importance scores, addressing accuracy and fairness simultaneously. Hence, Fair MP-Boost prioritizes important and fair features along with challenging instances to select the most relevant minipatches for learning. The learned probability distributions also yield intrinsic interpretations of feature importance and observation importance in Fair MP-Boost. Through empirical evaluation on simulated and benchmark datasets, we showcase the interpretability, accuracy, and fairness of Fair MP-Boost.
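The abstract describes an adaptive minipatch-sampling loop: observation and feature sampling probabilities are updated each round from a combination of an accuracy loss and a fairness criterion. The following is a minimal, self-contained sketch of that idea under stated assumptions; the function name `fair_mp_boost_sketch`, the decision-stump base learner, the demographic-parity gap as the fairness term, and all parameter choices are illustrative assumptions, not the paper's actual algorithm.

```python
import random

def fair_mp_boost_sketch(X, y, s, n_rounds=20, patch_obs=8, patch_feat=2,
                         alpha=0.5, lr=0.1, seed=0):
    """Illustrative sketch of fairness-aware adaptive minipatch boosting.

    X: list of feature rows; y: binary labels in {0, 1};
    s: binary protected attribute in {0, 1}.
    alpha trades off the per-observation accuracy loss against a
    fairness penalty when updating the sampling probabilities.
    """
    rng = random.Random(seed)
    N, M = len(X), len(X[0])
    p_obs = [1.0 / N] * N    # observation sampling probabilities
    p_feat = [1.0 / M] * M   # feature sampling probabilities
    F = [0.0] * N            # ensemble scores
    learners = []

    for _ in range(n_rounds):
        # Sample a minipatch (with replacement, for simplicity).
        rows = rng.choices(range(N), weights=p_obs, k=patch_obs)
        cols = rng.choices(range(M), weights=p_feat, k=patch_feat)

        # Fit a one-feature decision stump on the minipatch.
        best = None
        for j in set(cols):
            thr = sum(X[i][j] for i in rows) / len(rows)
            pred = [1 if X[i][j] > thr else 0 for i in rows]
            err = sum(pred[k] != y[rows[k]] for k in range(len(rows)))
            if best is None or err < best[0]:
                best = (err, j, thr)
        _, j, thr = best
        learners.append((j, thr))

        # Update ensemble scores with a small learning rate.
        for i in range(N):
            F[i] += lr * (1.0 if X[i][j] > thr else -1.0)

        # Hinge-style accuracy loss per observation.
        losses = [max(0.0, 1.0 - (2 * y[i] - 1) * F[i]) for i in range(N)]

        # Fairness penalty: demographic-parity gap of current predictions.
        pred = [1 if F[i] > 0 else 0 for i in range(N)]
        g1 = [pred[i] for i in range(N) if s[i] == 1]
        g0 = [pred[i] for i in range(N) if s[i] == 0]
        gap = abs(sum(g1) / max(len(g1), 1) - sum(g0) / max(len(g0), 1))

        # Combine accuracy loss and fairness gap into new sampling weights.
        w = [alpha * losses[i] + (1 - alpha) * gap + 1e-3 for i in range(N)]
        tot = sum(w)
        p_obs = [wi / tot for wi in w]

        # Crude feature update: upweight the feature just selected.
        p_feat[j] += 0.1
        tot = sum(p_feat)
        p_feat = [pj / tot for pj in p_feat]

    return learners, p_obs, p_feat
```

The learned `p_obs` and `p_feat` distributions are what give the method its intrinsic interpretability: after training, high-probability features and observations are those the scheme found most important for joint accuracy and fairness.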