A Survey on Bias and Fairness in Machine Learning
Overview
The paper "A Survey on Bias and Fairness in Machine Learning," authored by Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan of USC-ISI, provides a comprehensive review of existing biases in AI systems and proposes methods to ensure fairness in ML models. This survey explores how biases can emerge in data, algorithms, and user interactions, creating unfairness in AI outcomes. It introduces a taxonomy for defining fairness in ML and presents approaches to mitigate observed biases in different AI subdomains.
Real-World Implications of Bias
The paper highlights several high-profile instances in which biased AI systems produced discriminatory and unfair outcomes. For example, the COMPAS tool used for recidivism prediction in U.S. courts showed markedly higher false positive rates for African-American defendants than for Caucasian defendants. Biases have also been documented in facial recognition systems and in the targeting of online job advertisements, where ads for high-paying jobs were shown disproportionately to men.
Taxonomy of Bias Types
The authors organize bias around a feedback loop connecting data, algorithm, and user, which elucidates how bias injected at one point propagates through the others. Key types of bias include:
- Data to Algorithm: measurement bias, omitted variable bias, representation bias, aggregation bias, sampling bias, the longitudinal data fallacy, and linking bias.
- Algorithm to User: algorithmic bias and user interaction bias (further subdivided into presentation bias and ranking bias), as well as popularity bias, emergent bias, and evaluation bias.
- User to Data: biases that users feed back into the data they generate, including historical, social, behavioral, and temporal bias.
Fairness Definitions
Various definitions of fairness are discussed, such as:
- Equalized Odds: Ensuring true positive and false positive rates are equal across different groups.
- Equal Opportunity: Ensuring true positive rates are equal for protected and unprotected groups.
- Demographic Parity: The likelihood of positive outcomes should be independent of protected attributes.
- Fairness Through Awareness: Similar individuals should receive similar outcomes, under a task-specific similarity metric.
- Fairness Through Unawareness: Protected attributes are not explicitly used in the decision process.
- Counterfactual Fairness: Predictions should remain unchanged under hypothetical scenarios where the individual's demographic group changes.
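In the notation standard in this literature and adopted by the survey, let Ŷ be the predicted label, Y the true label, and A a binary protected attribute taking values 0 or 1. The three group-level criteria above can then be written as:

```latex
% Equalized odds: equal true positive AND false positive rates across groups
P(\hat{Y} = 1 \mid A = 0, Y = y) = P(\hat{Y} = 1 \mid A = 1, Y = y), \quad y \in \{0, 1\}

% Equal opportunity: equal true positive rates only
P(\hat{Y} = 1 \mid A = 0, Y = 1) = P(\hat{Y} = 1 \mid A = 1, Y = 1)

% Demographic parity: positive predictions independent of the protected attribute
P(\hat{Y} = 1 \mid A = 0) = P(\hat{Y} = 1 \mid A = 1)
```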
The paper also distinguishes among individual fairness, group fairness, and subgroup fairness, emphasizing that the appropriate definition is context-dependent.
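To make these definitions concrete, below is a minimal sketch (not code from the paper) of how the corresponding group-level gaps can be measured for a binary classifier. The function name and interface are assumptions for illustration, and each group is assumed to contain examples of both labels.

```python
import numpy as np

def fairness_gaps(y_true, y_pred, group):
    """Absolute between-group gaps for the three criteria above.

    y_true, y_pred : 0/1 arrays of true and predicted labels
    group          : 0/1 array holding the protected attribute
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = {}
    for g in (0, 1):
        m = group == g
        rates[g] = {
            "pos_rate": y_pred[m].mean(),              # P(Y_hat=1 | A=g)
            "tpr": y_pred[m & (y_true == 1)].mean(),   # P(Y_hat=1 | A=g, Y=1)
            "fpr": y_pred[m & (y_true == 0)].mean(),   # P(Y_hat=1 | A=g, Y=0)
        }
    tpr_gap = abs(rates[0]["tpr"] - rates[1]["tpr"])
    fpr_gap = abs(rates[0]["fpr"] - rates[1]["fpr"])
    return {
        "demographic_parity_gap": abs(rates[0]["pos_rate"] - rates[1]["pos_rate"]),
        "equal_opportunity_gap": tpr_gap,
        # equalized odds requires BOTH the TPR and FPR gaps to be zero
        "equalized_odds_gap": max(tpr_gap, fpr_gap),
    }
```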
Methods for Fair Machine Learning
Bias mitigation methods fall into three primary categories:
- Pre-Processing: Transforming data to remove biases before model training.
- In-Processing: Adjusting learning algorithms to incorporate fairness considerations during training.
- Post-Processing: Modifying the prediction outputs to ensure fairness without altering the underlying model or data.
Key techniques and their applications across various domains are reviewed, including fair classification, regression, principal component analysis (PCA), community detection, and NLP. Minimal sketches of one technique from each of the three categories above follow.
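As an example of the pre-processing family, reweighing (Kamiran and Calders) is a classic technique that assigns instance weights so that the protected attribute and the label appear statistically independent in the weighted training set. A minimal sketch, with an assumed interface:

```python
import numpy as np

def reweighing_weights(a, y):
    """Weights under which the protected attribute a and label y
    look statistically independent (Kamiran & Calders-style)."""
    a, y = np.asarray(a), np.asarray(y)
    w = np.ones(len(y), dtype=float)
    for av in np.unique(a):
        for yv in np.unique(y):
            cell = (a == av) & (y == yv)
            if cell.any():
                # expected joint probability under independence,
                # divided by the observed joint probability
                expected = (a == av).mean() * (y == yv).mean()
                w[cell] = expected / cell.mean()
    return w  # pass as sample weights to any standard classifier
```

Groups that are under-represented among positive labels receive weights above one, so a downstream classifier trained with these weights sees a balanced picture without any change to the features themselves.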
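For the in-processing family, a common pattern, sketched here illustratively rather than as any one surveyed method, is to add a fairness penalty to the training objective. The example below trains a logistic regression in PyTorch with a squared penalty on the gap in mean predicted scores between groups (a demographic-parity-style term); the function name and hyperparameters are assumptions.

```python
import torch

def train_fair_logreg(X, y, a, lam=1.0, epochs=500, lr=0.1):
    """Logistic regression with a demographic-parity-style penalty.

    X : (n, d) float tensor    y : (n,) float tensor of 0/1 labels
    a : (n,) tensor of 0/1 protected attributes (both groups present)
    lam : trade-off between task accuracy and the fairness penalty
    """
    w = torch.zeros(X.shape[1], requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.SGD([w, b], lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        p = torch.sigmoid(X @ w + b)                   # predicted P(Y=1 | x)
        task_loss = torch.nn.functional.binary_cross_entropy(p, y)
        gap = p[a == 0].mean() - p[a == 1].mean()      # between-group score gap
        (task_loss + lam * gap ** 2).backward()        # penalty shrinks the gap
        opt.step()
    return w.detach(), b.detach()
```

Raising lam trades accuracy for a smaller between-group gap, making the accuracy-fairness trade-off an explicit knob of the training procedure.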
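For the post-processing family, Hardt et al. derive group-specific decision thresholds from a trained model's scores so that the resulting decisions satisfy equal opportunity. The simplified grid search below captures the spirit of that idea rather than the exact procedure; it assumes scores in [0, 1] and positive examples in both groups.

```python
import numpy as np

def equal_opportunity_thresholds(scores, y_true, group, tol=0.02):
    """Per-group thresholds with (nearly) equal TPRs, chosen for accuracy."""
    def tpr(g, t):
        m = (group == g) & (y_true == 1)
        return (scores[m] >= t).mean()

    def accuracy(t0, t1):
        t = np.where(group == 0, t0, t1)               # per-sample threshold
        return ((scores >= t).astype(int) == y_true).mean()

    best, best_acc = (0.5, 0.5), -np.inf
    grid = np.linspace(0.0, 1.0, 101)
    for t0 in grid:
        for t1 in grid:
            # keep only pairs whose TPRs match, then prefer the most accurate
            if abs(tpr(0, t0) - tpr(1, t1)) < tol and accuracy(t0, t1) > best_acc:
                best, best_acc = (t0, t1), accuracy(t0, t1)
    return best  # (threshold for group 0, threshold for group 1)
```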
Domain-Specific Mitigation Strategies
The paper surveys domain-specific methods for bias mitigation, including:
- Variational Autoencoders (VAEs): Learning fair representations by treating protected attributes as nuisance variables.
- Adversarial Learning: Training models to maximize prediction accuracy while minimizing an adversary's ability to recover sensitive attributes from the learned representations (a minimal sketch follows this list).
- Fair NLP: Addressing biases in word embeddings, coreference resolution, language models, sentence encoders, machine translation, and named entity recognition (NER); an embedding-debiasing sketch also follows.
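Below is a minimal PyTorch sketch of the adversarial setup; the architecture, sizes, and names are assumptions for illustration, not the exact design of any surveyed paper. An encoder and task classifier are trained jointly while an adversary tries to recover the protected attribute from the learned representation, and the encoder is additionally rewarded for defeating the adversary.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(10, 16), nn.ReLU())  # x -> representation z
classifier = nn.Linear(16, 1)                          # z -> task logit
adversary = nn.Linear(16, 1)                           # z -> protected-attribute logit
bce = nn.BCEWithLogitsLoss()

opt_model = torch.optim.Adam(
    list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)

def train_step(x, y, a, lam=1.0):
    """One alternating update; y and a are (batch,) float tensors of 0/1."""
    # 1) adversary learns to predict a from a frozen copy of the representation
    z = encoder(x).detach()
    adv_loss = bce(adversary(z).squeeze(1), a)
    opt_adv.zero_grad()
    adv_loss.backward()
    opt_adv.step()

    # 2) encoder + classifier: minimize task loss, MAXIMIZE the adversary's loss
    z = encoder(x)
    task_loss = bce(classifier(z).squeeze(1), y)
    fool_loss = bce(adversary(z).squeeze(1), a)
    opt_model.zero_grad()
    (task_loss - lam * fool_loss).backward()
    opt_model.step()
```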
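For word embeddings specifically, the survey covers Bolukbasi et al.'s hard-debiasing method, whose core neutralize step removes a word vector's component along an identified bias direction. A minimal sketch of that step follows; in the original method the direction comes from a PCA over several gendered word pairs, not the single difference shown in the hypothetical usage comment.

```python
import numpy as np

def neutralize(v, bias_direction):
    """Remove the component of embedding v that lies along the bias direction."""
    g = bias_direction / np.linalg.norm(bias_direction)
    return v - (v @ g) * g  # subtract v's projection onto g

# Hypothetical usage with an embedding lookup `emb` (assumed data):
#   nurse_debiased = neutralize(emb["nurse"], emb["he"] - emb["she"])
```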
Future Research Directions
The paper identifies open challenges and potential research opportunities, such as:
- Developing a unified definition of fairness to streamline evaluation.
- Shifting the focus from equality to equity, ensuring resources are allocated based on individual or group needs.
- Designing methods to automatically detect unfairness in datasets or algorithms.
Conclusion
Overall, this survey provides a detailed examination of biases and fairness in machine learning, categorizing the sources of bias and introducing diverse definitions of fairness. It underscores the importance of context-sensitive applications of fairness principles and illustrates various methods to mitigate bias across multiple AI subdomains. This extensive review serves as a valuable resource for researchers looking to design fair AI systems and navigate the complexities of bias in machine learning.