Unbiased Estimations based on Binary Classifiers: A Maximum Likelihood Approach

Published 17 Feb 2021 in stat.ML and cs.LG | (2102.08659v1)

Abstract: Binary classifiers trained on data with a certain proportion of positive items introduce a bias when applied to data sets with a different proportion of positive items. Most solutions to this problem assume that some information about the latter distribution is known. However, this is not always the case, particularly when this proportion is itself the target variable. In this paper, a maximum likelihood estimator for the true proportion of positives in a data set is proposed and tested on synthetic and real-world data.
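
To make the bias concrete, here is a minimal Python sketch. It is not the paper's estimator, which the abstract does not specify. It assumes a simplified setting in which the classifier's true positive rate (`tpr`) and false positive rate (`fpr`) are known, for example from held-out validation data, and are the same on the target data set. Under that assumption, the number of predicted positives among `n` items is binomial with success probability `p * tpr + (1 - p) * fpr`, and the maximum likelihood estimate of the true proportion `p` has a closed form. The function names `naive_proportion` and `ml_proportion` are hypothetical and chosen only for illustration.

```python
def naive_proportion(predicted_positive: int, n: int) -> float:
    """Fraction of items the classifier labels positive (biased in general)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return predicted_positive / n


def ml_proportion(predicted_positive: int, n: int, tpr: float, fpr: float) -> float:
    """Maximum likelihood estimate of the true positive proportion p.

    Assumed model (an illustration, not taken from the paper):
        predicted_positive ~ Binomial(n, p * tpr + (1 - p) * fpr)
    with tpr and fpr known and tpr > fpr.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not tpr > fpr:
        raise ValueError("tpr must exceed fpr for p to be identifiable")

    # MLE of the predicted-positive rate q = p * tpr + (1 - p) * fpr.
    q_hat = predicted_positive / n

    # Invert the linear map from p to q, then clip to [0, 1]: the binomial
    # likelihood is unimodal in q, so the constrained maximum sits at the
    # nearest feasible endpoint when the unconstrained solution is outside.
    p_hat = (q_hat - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p_hat))


if __name__ == "__main__":
    # Hypothetical numbers: 270 of 1000 items are labelled positive by a
    # classifier with tpr = 0.9 and fpr = 0.2.
    print("naive:", naive_proportion(270, 1000))
    print("ML   :", ml_proportion(270, 1000, tpr=0.9, fpr=0.2))
```

With these hypothetical numbers the naive estimate is 0.27, while the corrected estimate is approximately 0.1, which is the true proportion that would produce 270 expected predicted positives under the assumed rates. The example shows why the raw count of predicted positives overstates a small true proportion when the false positive rate is not negligible.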