Enhancing Fairness and Performance in Machine Learning Models: A Multi-Task Learning Approach with Monte-Carlo Dropout and Pareto Optimality (2404.08230v2)
Abstract: Bias originates in both data and algorithmic design, and it is often exacerbated by traditional fairness methods that fail to address the subtle influence of protected attributes. This study introduces an approach to mitigating bias in machine learning by leveraging model uncertainty. Our approach combines a multi-task learning (MTL) framework with Monte Carlo (MC) Dropout to assess and mitigate uncertainty in predictions related to protected labels. MC Dropout quantifies prediction uncertainty, which is especially valuable in regions with vague decision boundaries, and thereby improves model fairness. Our methodology incorporates multi-objective learning via Pareto optimality to balance fairness and performance across various applications. We demonstrate the effectiveness and transferability of the approach on multiple datasets, and we use saliency maps to interpret how input features influence predictions, improving the explainability of machine learning models in practical applications.
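The sketch below illustrates the two mechanisms the abstract names: MC Dropout uncertainty estimation over a protected-label head in a multi-task network, and an input-gradient saliency map. It is a minimal PyTorch sketch, not the paper's implementation; the architecture, layer sizes, and the helpers `mc_dropout_predict` and `saliency_map` are illustrative assumptions.

```python
# Minimal sketch: MC Dropout uncertainty + input-gradient saliency in a
# multi-task model. Hypothetical architecture and helper names; NOT the
# paper's actual implementation.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Shared trunk with two heads: the main task and a protected label."""
    def __init__(self, in_dim: int, hidden: int = 64, p_drop: float = 0.3):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p_drop),
        )
        self.main_head = nn.Linear(hidden, 1)       # main prediction
        self.protected_head = nn.Linear(hidden, 1)  # protected attribute

    def forward(self, x):
        h = self.trunk(x)
        return torch.sigmoid(self.main_head(h)), torch.sigmoid(self.protected_head(h))

@torch.no_grad()
def mc_dropout_predict(model, x, num_passes: int = 50):
    """Run `num_passes` stochastic forward passes with dropout active and
    return the per-sample mean and standard deviation (the uncertainty)
    of the protected-label prediction."""
    model.train()  # keep dropout layers sampling at inference time
    preds = torch.stack([model(x)[1] for _ in range(num_passes)])
    return preds.mean(dim=0), preds.std(dim=0)

def saliency_map(model, x):
    """Input-gradient saliency for the main-task output: magnitude of
    d(output)/d(input) per feature (Simonyan et al. style)."""
    model.eval()
    x = x.clone().requires_grad_(True)
    main_out, _ = model(x)
    main_out.sum().backward()
    return x.grad.abs()

model = MultiTaskNet(in_dim=10)
x = torch.randn(4, 10)
mean, uncertainty = mc_dropout_predict(model, x)
print(mean.squeeze(), uncertainty.squeeze())
print(saliency_map(model, x))
```

Under this reading of the abstract, the per-sample uncertainty from the protected-label head could be used to weight or gate the fairness objective, and the fairness-performance Pareto front traced by sweeping the relative weight of the two losses; how the paper concretely combines the objectives is specified in its multi-objective formulation, not in this sketch.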
Authors: Khadija Zanna, Akane Sano