Insights into Identifying and Correcting Label Bias in Machine Learning
This paper addresses the crucial topic of label bias in machine learning datasets, proposing a mathematical framework to detect and correct such biases. The authors model label bias as arising from a labeling agent who intends to label accurately but whose assignments are nonetheless systematically biased. The crux of the work is a re-weighting approach that trains classifiers on the biased data without altering the observed labels, while ensuring that the classifiers learn as if from unbiased labels. The method is evaluated on several datasets and shown to improve fairness in classification tasks.
The paper’s mathematical formulation rests on the assumption of an unknown, unbiased label function that the observation process corrupts with bias. With theoretical rigor, the authors derive a closed-form expression for the observed label function, obtained by assuming the biased labeler stays as close as possible to the true labels in KL-divergence while satisfying the bias constraints, and they correct for the bias by re-weighting data points. The re-weighting approach comes with guarantees that training on the weighted dataset mirrors, in expectation, training on the unbiased labels.
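Concretely, the formulation can be sketched as follows. This is a minimal reconstruction in my own notation, not the paper's verbatim statement: the constraint functions $c_k$, multipliers $\lambda_k$, and bounds $b_k$ stand in for the paper's definitions.

```latex
% Sketch (notation mine): the biased label function is modeled as the
% KL-projection of the true label function onto the bias-constraint set.
\[
  P_{\mathrm{bias}}
    = \operatorname*{arg\,min}_{\tilde{P}}
      D_{\mathrm{KL}}\!\left(\tilde{P} \,\middle\|\, P_{\mathrm{true}}\right)
  \quad \text{s.t.} \quad
  \mathbb{E}_{\tilde{P}}\!\left[c_k\right] \ge b_k,
  \qquad k = 1, \dots, K,
\]
% which yields a closed-form expression for the observed label function,
\[
  P_{\mathrm{bias}}(y \mid x) \;\propto\;
    P_{\mathrm{true}}(y \mid x)\,
    \exp\!\Big(-\textstyle\sum_{k=1}^{K} \lambda_k\, c_k(x, y)\Big),
\]
% so weighting each example by
\[
  w(x, y) \;\propto\; \exp\!\Big(\textstyle\sum_{k=1}^{K} \lambda_k\, c_k(x, y)\Big)
\]
% makes weighted training on the observed labels behave, in expectation,
% like training on the unbiased labels.
```

The multipliers $\lambda_k$ are not known in advance; they are estimated iteratively, which motivates the procedure sketched after the next paragraph.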
Across multiple fairness definitions, including demographic parity, equal opportunity, equalized odds, and disparate impact, the method proves practical and versatile. The authors show that it surpasses traditional methods such as post-processing and Lagrangian approaches by better reconciling fairness with accuracy. In particular, their approach avoids the complexity and instability of the Lagrangian method, which typically requires convex relaxations of non-convex fairness constraints.
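To make the iterative re-weighting concrete, here is a minimal sketch in Python of a demographic-parity version of the loop. This is my own illustration rather than the authors' code: `train_fn` is assumed to be any learner accepting per-example weights, and the update rule and learning rate are simplifications of whatever the paper prescribes.

```python
import numpy as np

def demographic_parity_violations(y_pred, groups):
    """Per-group violation: group positive rate minus overall positive rate."""
    overall = y_pred.mean()
    return {g: y_pred[groups == g].mean() - overall for g in np.unique(groups)}

def reweigh(lambdas, groups, y_obs):
    """Weights that up-weight positives in under-predicted groups and
    down-weight them in over-predicted ones; y_obs itself is untouched."""
    w = np.ones(len(y_obs))
    for g, lam in lambdas.items():
        mask = groups == g
        w[mask] = np.exp(lam * np.where(y_obs[mask] == 1, 1.0, -1.0))
    return w / w.mean()

def debias(train_fn, X, y_obs, groups, lr=1.0, n_iters=20):
    """Alternate between weighted training and multiplier updates."""
    lambdas = {g: 0.0 for g in np.unique(groups)}
    for _ in range(n_iters):
        w = reweigh(lambdas, groups, y_obs)
        model = train_fn(X, y_obs, w)
        for g, d in demographic_parity_violations(model.predict(X), groups).items():
            lambdas[g] -= lr * d  # over-predicted group -> smaller positive weight
    return model, lambdas

# Example usage with a weighted-ERM learner (assumption, not from the paper):
# from sklearn.linear_model import LogisticRegression
# train_fn = lambda X, y, w: LogisticRegression().fit(X, y, sample_weight=w)
# model, lambdas = debias(train_fn, X, y_obs, groups)
```

Note the property the paper formalizes is visible here: the observed labels `y_obs` are never edited; only the example weights change between iterations.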
The implications of this research are extensive: it advances the development of fairer machine learning models without modifying the observed data, which also addresses potential legal concerns about data alteration. Future work could extend this framework to multi-label settings or explore its applicability in other data domains.
In summary, this paper presents a sound, mathematically grounded, and empirically validated method for addressing fairness in machine learning by targeting label bias at its source, providing an effective means of improving model equity in diverse real-world applications.