Assessment of Prediction Techniques: The Impact of Human Uncertainty (1702.07445v1)
Abstract: Many data mining approaches aim at modelling and predicting human behaviour. An important quantity of interest is the quality of model-based predictions, e.g. for identifying the competition winner with the best prediction performance. In real life, human beings make their decisions with considerable uncertainty. The assessment of this uncertainty and its implications for the statistically sound evaluation of predictive models are the main focus of this contribution. We identify relevant sources of uncertainty as well as the limits to measuring it accurately, propose an uncertainty-aware methodology for more reliable evaluations of data mining approaches, and discuss its implications for existing quality assessment strategies. Specifically, our approach switches from the common point paradigm to the more appropriate distribution paradigm. This is exemplified in the context of recommender systems and their established metrics of prediction quality. The discussion is substantiated by comprehensive experiments with real users, large-scale simulations, and an analysis of prior evaluation campaigns (including the Netflix Prize) in the light of human uncertainty.
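To make the point-vs-distribution contrast concrete, the following minimal Python sketch illustrates the idea on synthetic data. All quantities here are illustrative assumptions, not the paper's actual setup: latent user opinions are drawn uniformly, a recommender's predictions are the latent opinions plus noise, and human rating uncertainty is modelled as Gaussian with an assumed standard deviation. Under the point paradigm, RMSE is a single number; under the distribution paradigm, each submitted rating is a noisy draw around the latent opinion, so RMSE itself becomes a random variable.

```python
import numpy as np

rng = np.random.default_rng(42)

n_ratings = 1_000
# Assumed latent user opinions on a 1-5 scale (illustrative, not from the paper).
true_means = rng.uniform(1, 5, size=n_ratings)
# A hypothetical recommender's predictions: latent opinion plus model error.
predictions = true_means + rng.normal(0, 0.5, size=n_ratings)

# Point paradigm: each user reports exactly one rating (the latent opinion),
# yielding a single RMSE score.
point_rmse = np.sqrt(np.mean((predictions - true_means) ** 2))

# Distribution paradigm: each observed rating is a noisy draw around the
# latent opinion (human uncertainty, assumed Gaussian with sigma_human),
# so RMSE becomes a distribution over possible rating realisations.
sigma_human = 0.6   # assumed magnitude of human rating noise
n_trials = 10_000
rmse_samples = np.empty(n_trials)
for t in range(n_trials):
    observed = true_means + rng.normal(0, sigma_human, size=n_ratings)
    rmse_samples[t] = np.sqrt(np.mean((predictions - observed) ** 2))

print(f"point-paradigm RMSE:        {point_rmse:.4f}")
print(f"distribution-paradigm RMSE: mean={rmse_samples.mean():.4f}, "
      f"std={rmse_samples.std():.4f}")
```

The spread of `rmse_samples` suggests why a small point-RMSE gap between two competing models, as in leaderboard-style campaigns such as the Netflix Prize, may fall within the variation induced by human uncertainty alone and thus not constitute statistically sound evidence of superiority.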