User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction (1710.04881v2)

Published 13 Oct 2017 in cs.HC, cs.LG, and stat.ML

Abstract: In human-in-the-loop machine learning, the user provides information beyond that in the training data. Many algorithms and user interfaces have been designed to optimize and facilitate this human-machine interaction; however, fewer studies have addressed the potential defects the designs can cause. Effective interaction often requires exposing the user to the training data or its statistics. The design of the system is then critical, as this can lead to double use of data and overfitting, if the user reinforces noisy patterns in the data. We propose a user modelling methodology, by assuming simple rational behaviour, to correct the problem. We show, in a user study with 48 participants, that the method improves predictive performance in a sparse linear regression sentiment analysis task, where graded user knowledge on feature relevance is elicited. We believe that the key idea of inferring user knowledge with probabilistic user models has general applicability in guarding against overfitting and improving interactive machine learning.
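
The abstract only names the mechanism, so a minimal sketch may help make it concrete. The Python/NumPy toy below is our own illustration, not the authors' implementation: it assumes a logistic user model in which the probability of a "relevant" vote depends both on the user's latent knowledge g_j and on the statistic the interface displayed, and then inverts that model with Bayes' rule. All names (g_true, shown, a, b) and the specific functional form are hypothetical assumptions.

```python
# Minimal sketch of the paper's key idea: treat user feedback on feature
# relevance as an observation from a probabilistic user model, rather than
# as independent ground truth. Priors, the logistic observation model, and
# coefficients below are illustrative assumptions, not the authors' model.
import numpy as np

rng = np.random.default_rng(0)

# Toy sparse-regression data: only the first 3 of 20 features matter.
n, p = 30, 20
true_w = np.zeros(p)
true_w[:3] = 2.0
X = rng.normal(size=(n, p))
y = X @ true_w + rng.normal(scale=2.0, size=n)

# Statistic shown to the user in the interface: per-feature |correlation|.
shown = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# User model: the user votes "relevant" with a probability mixing genuine
# prior knowledge g_j (latent) and the statistic they were shown:
#   P(f_j = 1) = sigmoid(a * (g_j - 0.5) + b * (shown_j - mean(shown)))
a, b = 3.0, 4.0                       # assumed user-model coefficients
g_true = (true_w != 0).astype(float)  # user's actual knowledge (unknown to us)
feedback = rng.binomial(
    1, sigmoid(a * (g_true - 0.5) + b * (shown - shown.mean())))

# Naive use of feedback: take f_j at face value as an inclusion prior.
naive_prior = np.where(feedback == 1, 0.9, 0.1)

# User-model correction: invert P(f | g, shown) with Bayes' rule to get
# P(g = 1 | f, shown). Feedback already explained by the shown statistic
# carries little extra evidence, guarding against double use of data.
prior_g = 0.5
lik1 = sigmoid(+a * 0.5 + b * (shown - shown.mean()))  # P(f=1 | g=1, shown)
lik0 = sigmoid(-a * 0.5 + b * (shown - shown.mean()))  # P(f=1 | g=0, shown)
lik_f_given_g1 = np.where(feedback == 1, lik1, 1 - lik1)
lik_f_given_g0 = np.where(feedback == 1, lik0, 1 - lik0)
post_g = (lik_f_given_g1 * prior_g
          / (lik_f_given_g1 * prior_g + lik_f_given_g0 * (1 - prior_g)))

print("feature | shown stat | feedback | naive prior | P(g=1 | f, shown)")
for j in range(6):
    print(f"{j:7d} | {shown[j]:10.2f} | {feedback[j]:8d} "
          f"| {naive_prior[j]:11.2f} | {post_g[j]:17.2f}")
```

Because a "relevant" vote on a feature whose displayed statistic was already high is largely predicted by the data itself, the corrected posterior for that feature stays near the prior; that is the guard against reinforcing noisy patterns the abstract describes.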

Authors (4)
  1. Pedram Daee (6 papers)
  2. Tomi Peltola (11 papers)
  3. Aki Vehtari (99 papers)
  4. Samuel Kaski (164 papers)
Citations (19)
