Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
102 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Fairness Constraints in Semi-supervised Learning (2009.06190v1)

Published 14 Sep 2020 in cs.LG and stat.ML

Abstract: Fairness in machine learning has received considerable attention. However, most studies on fair learning focus on either supervised learning or unsupervised learning. Very few consider semi-supervised settings. Yet, in reality, most machine learning tasks rely on large datasets that contain both labeled and unlabeled data. One of key issues with fair learning is the balance between fairness and accuracy. Previous studies arguing that increasing the size of the training set can have a better trade-off. We believe that increasing the training set with unlabeled data may achieve the similar result. Hence, we develop a framework for fair semi-supervised learning, which is formulated as an optimization problem. This includes classifier loss to optimize accuracy, label propagation loss to optimize unlabled data prediction, and fairness constraints over labeled and unlabeled data to optimize the fairness level. The framework is conducted in logistic regression and support vector machines under the fairness metrics of disparate impact and disparate mistreatment. We theoretically analyze the source of discrimination in semi-supervised learning via bias, variance and noise decomposition. Extensive experiments show that our method is able to achieve fair semi-supervised learning, and reach a better trade-off between accuracy and fairness than fair supervised learning.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (6)
  1. Tao Zhang (481 papers)
  2. Tianqing Zhu (85 papers)
  3. Mengde Han (4 papers)
  4. Jing Li (621 papers)
  5. Wanlei Zhou (63 papers)
  6. Philip S. Yu (592 papers)
Citations (7)