Fairness in Machine Learning: A Comprehensive Survey
The paper "Fairness in Machine Learning: A Survey" by Simon Caton from University College Dublin and Christian Haas from University of Nebraska at Omaha provides an extensive review of approaches and methodologies aimed at promoting fairness in ML. As ML systems become integral to decision-making processes that impact societal aspects such as legal decisions, recruitment, and resource allocation, ensuring these systems are free from bias and discrimination is crucial. This survey organizes the various techniques into a structured framework and highlights ongoing challenges and potential future directions in fairness research.
Overview of Fairness in ML
The paper categorizes methods for mitigating unfairness into three main strategies, according to the stage of the ML pipeline at which they intervene: pre-processing, in-processing, and post-processing. This widely accepted framework serves as the basis for discussing a broad array of techniques, both established and novel, primarily for binary classification but also extending to other ML settings such as regression, recommender systems, and NLP.
Methodological Components
- Pre-processing: Techniques such as sampling, transformation, relabelling, and blinding adjust the training data so that fairness is improved before model training begins. These methods alter the data rather than the learning algorithm itself (see the reweighing sketch after this list).
- In-processing: Approaches such as adversarial learning, regularization, and constrained optimization incorporate fairness objectives directly into model training. These methods require modifying the learning algorithm, which lets them enforce fairness at the model level (see the regularized logistic regression sketch below).
- Post-processing: Methods such as thresholding and calibration adjust a trained model's outputs to reduce disparities. These approaches are flexible and typically treat the model as a black box, requiring no access to its internals (see the group-thresholding sketch below).
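To make pre-processing concrete, here is a minimal sketch of reweighing, a data-level technique in the family the survey reviews (often attributed to Kamiran and Calders): each training instance receives a weight so that group membership and the label become statistically independent in the weighted data. The function name and the toy data below are illustrative assumptions, not from the paper.

```python
import numpy as np

def reweigh(groups, labels):
    """Weight each instance by P(group) * P(label) / P(group, label),
    so that group and label are independent in the reweighted data."""
    groups, labels = np.asarray(groups), np.asarray(labels)
    n = len(labels)
    weights = np.empty(n, dtype=float)
    for g in np.unique(groups):
        for y in np.unique(labels):
            cell = (groups == g) & (labels == y)
            # expected count under independence / observed count
            expected = (groups == g).sum() * (labels == y).sum() / n
            weights[cell] = expected / max(cell.sum(), 1)
    return weights

# Toy data: group 1 receives the positive label far more often.
rng = np.random.default_rng(0)
groups = rng.integers(0, 2, size=1000)
labels = (rng.random(1000) < np.where(groups == 1, 0.7, 0.3)).astype(int)
w = reweigh(groups, labels)
for g in (0, 1):
    m = groups == g
    print(g, np.average(labels[m], weights=w[m]))  # equalized positive rates
```

Any model that accepts per-instance weights (for example, via a `sample_weight` argument) can then be trained on the reweighted data without further changes.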
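For in-processing, a minimal sketch of fairness regularization under stated assumptions: ordinary logistic regression trained by gradient descent, plus a penalty on the squared gap between the two groups' mean predicted scores (a simple demographic-parity-style regularizer; the survey covers more elaborate regularized and constrained formulations). The penalty form, the hyperparameters, and the assumption of exactly two groups coded 0/1 are all illustrative.

```python
import numpy as np

def fair_logreg(X, y, groups, lam=1.0, lr=0.1, epochs=500):
    """Logistic regression with an added fairness penalty:
    loss = mean log loss
         + lam * (mean score of group 0 - mean score of group 1) ** 2."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    g0, g1 = groups == 0, groups == 1
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted scores
        gap = p[g0].mean() - p[g1].mean()        # parity gap on scores
        s = p * (1.0 - p)                         # d sigmoid / d logit
        grad = (p - y) / n                        # log-loss gradient (w.r.t. logits)
        # penalty gradient: opposite signs for the two groups
        grad += lam * 2.0 * gap * np.where(g0, s / g0.sum(), -s / g1.sum())
        w -= lr * (X.T @ grad)
        b -= lr * grad.sum()
    return w, b
```

Raising `lam` shrinks the score gap between groups at some cost in log loss, which is exactly the fairness/performance trade-off discussed below.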
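Finally, a post-processing sketch: group-specific thresholding applied to the scores of an already-trained model, which never has to be opened or retrained. Matching a single target positive rate per group is an illustrative simplification of the threshold-adjustment and calibration methods the survey reviews.

```python
import numpy as np

def equalize_positive_rates(scores, groups, target_rate=0.5):
    """Choose a separate decision threshold per group so that each
    group's positive-prediction rate is (approximately) `target_rate`."""
    decisions = np.zeros(len(scores), dtype=int)
    for g in np.unique(groups):
        m = groups == g
        # cut at the (1 - target_rate) quantile of this group's scores
        t = np.quantile(scores[m], 1.0 - target_rate)
        decisions[m] = (scores[m] >= t).astype(int)
    return decisions
```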
Extensions Beyond Binary Classification
The paper observes that fairness research is dominated by binary classification and calls for exploration beyond this scope. It highlights work on fair regression, fairness in recommender systems, and bias mitigation in unsupervised learning and NLP, emphasizing the need to address fairness across more complex and varied ML applications.
Challenges and Future Directions
The paper concludes with a discussion of four dilemmas that frame the ongoing challenges in fairness research:
- Fairness vs. Model Performance: Balancing fairness and accuracy remains a pervasive challenge, as optimizing for one often detracts from the other.
- Incompatibility of Fairness Definitions: The paper outlines the difficulty of reconciling different fairness metrics, several of which are mutually exclusive when group base rates differ, as well as the tension between individual and group fairness (illustrated in the sketch after this list).
- Context and Policy: Aligning fairness methodologies with contextual, sociocultural, and policy-related nuances is critical for practical implementation.
- Democratization of ML vs. Skills Gap: The increasing accessibility of ML tools raises concerns about the ability of less-experienced practitioners to effectively identify and mitigate biases in ML systems.
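To illustrate the incompatibility dilemma: when the groups' base rates differ, even a perfect classifier that satisfies equal opportunity (equal true-positive rates) must violate statistical parity. The helper names and toy numbers below are illustrative.

```python
import numpy as np

def parity_gap(y_pred, groups):
    """Statistical parity difference: P(Yhat=1 | g=0) - P(Yhat=1 | g=1)."""
    return y_pred[groups == 0].mean() - y_pred[groups == 1].mean()

def tpr_gap(y_true, y_pred, groups):
    """Equal opportunity difference: TPR(g=0) - TPR(g=1)."""
    tpr = lambda g: y_pred[(groups == g) & (y_true == 1)].mean()
    return tpr(0) - tpr(1)

# Toy setup: a perfect classifier, but group 1 has a higher base rate.
groups = np.array([0] * 100 + [1] * 100)
y_true = np.concatenate([np.repeat([0, 1], [80, 20]),   # base rate 0.2
                         np.repeat([0, 1], [40, 60])])  # base rate 0.6
y_pred = y_true.copy()  # equal TPRs (and FPRs) by construction

print(tpr_gap(y_true, y_pred, groups))   # 0.0  -> equal opportunity holds
print(parity_gap(y_pred, groups))        # -0.4 -> statistical parity fails
```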
Conclusion
This survey stands as a critical resource for researchers in ML fairness, offering a detailed mapping of current methods and challenges. The discussion of practical implications and future research directions emphasizes the need for advancing fairness mechanisms that are context-aware, legally compliant, and societally aligned. As the field evolves, balancing performance with ethical considerations will be paramount, necessitating ongoing dialogue between academics, policymakers, and industry practitioners.