Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
158 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Quantifying With Only Positive Training Data (2004.10356v2)

Published 22 Apr 2020 in cs.LG and stat.ML

Abstract: Quantification is the research field that studies methods for counting the number of data points that belong to each class in an unlabeled sample. Traditionally, researchers in this field assume the availability of labelled observations for all classes to induce a quantification model. However, we often face situations where the number of classes is large or even unknown, or we have reliable data for a single class. When inducing a multi-class quantifier is infeasible, we are often concerned with estimates for a specific class of interest. In this context, we have proposed a novel setting known as One-class Quantification (OCQ). In contrast, Positive and Unlabeled Learning (PUL), another branch of Machine Learning, has offered solutions to OCQ, despite quantification not being the focal point of PUL. This article closes the gap between PUL and OCQ and brings both areas together under a unified view. We compare our method, Passive Aggressive Threshold (PAT), against PUL methods and show that PAT generally is the fastest and most accurate algorithm. PAT induces quantification models that can be reused to quantify different samples of data. We additionally introduce Exhaustive TIcE (ExTIcE), an improved version of the PUL algorithm Tree Induction for c Estimation (TIcE). We show that ExTIcE quantifies more accurately than PAT and the other assessed algorithms in scenarios where several negative observations are identical to the positive ones.

Citations (1)

Summary

We haven't generated a summary for this paper yet.