Model positive and unlabeled data with a generalized additive density ratio model (2508.12446v1)
Abstract: We address learning from positive and unlabeled (PU) data, a common setting in which only some positives are labeled and the rest are mixed with negatives. Classical exponential tilting models guarantee identifiability by assuming a linear structure, but they can be badly misspecified when relationships are nonlinear. We propose a generalized additive density-ratio framework that retains identifiability while allowing smooth, feature-specific effects. The approach comes with a practical fitting algorithm and supporting theory that enables estimation and inference for the mixture proportion and other quantities of interest. In simulations and analyses of benchmark datasets, the proposed method matches the standard exponential tilting method when the linear model is correct and delivers clear gains when it is not. Overall, the framework strikes a useful balance between flexibility and interpretability for PU learning and provides principled tools for estimation, prediction, and uncertainty assessment.
Sponsored by Paperpile, the PDF & BibTeX manager trusted by top AI labs.
Get 30 days freePaper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.