- The paper introduces PLCP, which learns uncertainty-guided features to construct prediction sets with improved conditional validity.
- It uses alternating gradient descent with off-the-shelf models to learn a data-driven partition of the input space from the calibration data.
- Theoretical analysis and experiments show that PLCP effectively reduces the Mean Squared Conditional Error, outperforming traditional methods.
Conformal Prediction with Learned Features: Enhancing Conditional Validity through Partition Learning Conformal Prediction (PLCP)
Introduction
The paper introduces Partition Learning Conformal Prediction (PLCP), a framework designed to improve the conditional validity of prediction sets in conformal prediction by learning uncertainty-guided features from calibration data. Traditional conformal prediction methods typically guarantee only marginal coverage, which may not suffice when valid prediction sets are needed across all subgroups or under changing conditions. PLCP addresses these limitations by iteratively learning a partition of the input space from the calibration data that groups points with similar uncertainty, yielding prediction sets tailored to each group.
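To ground the marginal-coverage baseline that PLCP improves upon, here is a minimal sketch of standard split conformal prediction (not from the paper; the toy scores and variable names are illustrative): the calibrated threshold `qhat` guarantees coverage on average over the whole distribution, but says nothing about coverage within subgroups.

```python
import numpy as np

def split_conformal_threshold(scores_cal, alpha=0.1):
    """Split conformal: return the corrected (1 - alpha) empirical quantile
    of calibration nonconformity scores. Prediction sets
    {y : score(x, y) <= qhat} then achieve marginal coverage >= 1 - alpha."""
    n = len(scores_cal)
    # ceil((n + 1)(1 - alpha)) / n is the standard finite-sample correction
    level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores_cal, min(level, 1.0), method="higher")

rng = np.random.default_rng(0)
# Toy nonconformity scores, e.g. |y - mu(x)| for some fitted model mu
scores_cal = np.abs(rng.normal(0.0, 1.0, size=2000))
qhat = split_conformal_threshold(scores_cal, alpha=0.1)

# Marginal coverage on fresh scores from the same distribution
scores_test = np.abs(rng.normal(0.0, 1.0, size=5000))
coverage = np.mean(scores_test <= qhat)
```

The single global threshold is what PLCP replaces with group-specific thresholds learned from the data.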
Main Contributions
- Methodological: The PLCP framework departs from predefined uncertainty structures (such as fixed groups or known covariate shifts), instead learning them from the data and thereby adapting to how the conditional distribution varies across the input space.
- Theoretical Analysis: The paper provides a detailed theoretical analysis of PLCP, establishing its effectiveness through bounds on the Mean Squared Conditional Error (MSCE), indicating how well the coverage of the constructed prediction sets approximates the ideal conditional coverage.
- Experimental Validation: PLCP is extensively tested against state-of-the-art methods on four datasets, demonstrating superior coverage and shorter interval lengths, particularly compared with methods that rely on preset group structures or known covariate shifts.
Algorithmic Framework
PLCP operates by alternating between updating the partition and updating the prediction sets, minimizing an objective that measures the discrepancy between each group's threshold and the conditional quantiles of the points assigned to it. The partition is represented by a user-specified function class (e.g., linear models or neural networks) and is optimized efficiently with alternating gradient descent, allowing PLCP to leverage off-the-shelf machine learning models when constructing prediction sets.
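The alternating idea can be sketched as follows. This is a simplified stand-in, not the paper's implementation: it uses hard assignments and empirical quantiles over precomputed nonconformity scores, whereas PLCP learns the assignment as a function h(x) from a user-specified class via gradient descent; the function and variable names here are illustrative.

```python
import numpy as np

def pinball(s, q, alpha):
    """Pinball (quantile) loss of score s against threshold q at level 1 - alpha."""
    diff = s - q
    return np.maximum((1 - alpha) * diff, -alpha * diff)

def alternating_partition(scores, k=2, alpha=0.1, iters=20, seed=0):
    """Alternate between (a) per-group (1 - alpha) quantile thresholds and
    (b) reassigning each point to the group whose threshold incurs the
    least pinball loss on its score -- a Lloyd-style simplification of
    PLCP's gradient-based partition learning."""
    rng = np.random.default_rng(seed)
    assign = rng.integers(0, k, size=len(scores))

    def group_quantiles(assign):
        fallback = np.quantile(scores, 1 - alpha)  # guard against empty groups
        return np.array([np.quantile(scores[assign == j], 1 - alpha)
                         if np.any(assign == j) else fallback
                         for j in range(k)])

    for _ in range(iters):
        qs = group_quantiles(assign)            # fix partition, update thresholds
        losses = pinball(scores[:, None], qs[None, :], alpha)
        assign = np.argmin(losses, axis=1)      # fix thresholds, update partition
    return assign, group_quantiles(assign)

# Toy heteroskedastic scores drawn from two noise regimes
rng = np.random.default_rng(1)
scores = np.abs(np.concatenate([rng.normal(0, 0.5, 1000),
                                rng.normal(0, 2.0, 1000)]))
assign, qs = alternating_partition(scores, k=2)
coverage = np.mean(scores <= qs[assign])
```

Each group receives its own threshold, so points in the high-noise regime get wider sets than a single global quantile would give them.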
Theoretical Insights
The theoretical framework introduces and leverages the concept of MSCE, which quantifies the deviation of prediction sets from fully conditional coverage. The analysis reveals that:
- Infinite-Data Regime: The MSCE of PLCP's prediction sets decreases as the number of groups in the learned partition grows, indicating improved conditional coverage with finer partitions.
- Finite-Sample Guarantee: Characterizes the trade-off between the number of groups and the amount of calibration data, yielding practical guidance on choosing the number of groups to balance model complexity against coverage accuracy.
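Based on the description above, the MSCE of a set-valued predictor $\mathcal{C}$ at target level $1-\alpha$ can be written (up to notational differences from the paper) as

```latex
\mathrm{MSCE}(\mathcal{C}) \;=\;
\mathbb{E}_{X}\!\left[\Big(\Pr\big(Y \in \mathcal{C}(X)\,\big|\,X\big) - (1-\alpha)\Big)^{2}\right]
```

It vanishes exactly when the conditional coverage equals $1-\alpha$ almost surely, whereas marginal coverage alone only constrains the average $\mathbb{E}_{X}\big[\Pr(Y \in \mathcal{C}(X)\mid X)\big]$ and leaves the MSCE uncontrolled.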
Implications and Future Directions
The PLCP framework sets a compelling direction for future research in conformal prediction, particularly through its ability to learn and adapt to the inherent uncertainty structure of data without prior knowledge of group identities or covariate shifts. This adaptability makes it especially suitable for complex real-world scenarios where the data distribution or its underlying uncertainty is not well understood. Future work could explore deeper integration with other machine learning advances, further refinement of the theoretical underpinnings, and expansion into more varied application domains.
Conclusion
PLCP provides a robust methodological advancement in conformal prediction, notably enhancing the conditional validity of prediction sets by learning relevant features directly from data. Its strong empirical performance, backed by rigorous theoretical guarantees, marks it as a significant contribution to the field, with promising implications for both theoretical exploration and practical application in scenarios requiring reliable uncertainty quantification.