Conformal Prediction via Regression-as-Classification (2404.08168v1)

Published 12 Apr 2024 in cs.LG and stat.ML

Abstract: Conformal prediction (CP) for regression can be challenging, especially when the output distribution is heteroscedastic, multimodal, or skewed. Some of the issues can be addressed by estimating a distribution over the output, but in reality, such approaches can be sensitive to estimation error and yield unstable intervals. Here, we circumvent the challenges by converting regression to a classification problem and then use CP for classification to obtain CP sets for regression. To preserve the ordering of the continuous-output space, we design a new loss function and make necessary modifications to the CP classification techniques. Empirical results on many benchmarks show that this simple approach gives surprisingly good results on many practical problems.

References (33)
  1. Distributional conformal prediction. Proceedings of the National Academy of Sciences, 118(48):e2107794118, 2021.
  2. The Medical Expenditure Panel Survey: a national information resource to support healthcare cost research and inform policy and practice. Medical Care, pp. 44–50, 2009.
  3. Concepts and applications of conformal prediction in computational drug discovery. Artificial Intelligence in Drug Discovery, pp. 63–101, 2020.
  4. Soft labels for ordinal regression. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  5. Conformal Bayesian computation. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
  6. Deep ordinal regression network for monocular depth estimation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  7. Improving uncertainty quantification of deep classifiers via neighborhood conformal prediction: Novel algorithm and theoretical analysis. arXiv preprint arXiv:2303.10694, 2023.
  8. Conformalization of sparse generalized linear models. In International Conference on Machine Learning (ICML), 2023. URL https://proceedings.mlr.press/v202/guha23b.html.
  9. CD-split and HPD-split: Efficient conformal regions in high dimensions. Journal of Machine Learning Research, 23(87):1–32, 2022.
  10. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 2012.
  11. Evolutionary conformal prediction for breast cancer diagnosis. In 9th International Conference on Information Technology and Applications in Biomedicine, pp. 1–4. IEEE, 2009.
  12. Assessment of stroke risk based on morphological ultrasound image analysis with conformal prediction. In Artificial Intelligence Applications and Innovations (AIAI 2010), pp. 146–153. Springer, 2010.
  13. Distribution-free prediction bands for non-parametric regression. Journal of the Royal Statistical Society Series B: Statistical Methodology, 76(1):71–96, 2014.
  14. Distribution-free prediction sets. Journal of the American Statistical Association, 108(501):278–287, 2013.
  15. Distribution-free predictive inference for regression. Journal of the American Statistical Association, 113(523):1094–1111, 2018.
  16. Locally valid and discriminative prediction intervals for deep learning models. Advances in Neural Information Processing Systems, 34:8378–8391, 2021.
  17. Improving trustworthiness of AI disease severity rating in medical imaging with ordinal conformal prediction sets. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 545–554. Springer, 2022.
  18. Stable conformal prediction sets. In International Conference on Machine Learning (ICML), pp. 16462–16479. PMLR, 2022.
  19. Root-finding approaches for computing conformal prediction set. Machine Learning, 112(1):151–176, 2023.
  20. UCI Machine Learning Repository, 2023. URL https://archive.ics.uci.edu/datasets. Accessed September 2023.
  21. Inductive confidence machines for regression. In Machine Learning: ECML 2002, pp. 345–356. Springer, 2002.
  22. Conformalized quantile regression. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
  23. DEX: Deep expectation of apparent age from a single image. In IEEE International Conference on Computer Vision Workshops (ICCVW), 2015.
  24. Conditional density estimation with neural networks: Best practices and benchmarks. arXiv preprint arXiv:1903.00954, 2019.
  25. Conformal prediction using conditional histograms. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
  26. Regression as classification: Influence of task formulation on neural network features. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2023.
  27. Pixel recurrent neural networks. In International Conference on Machine Learning (ICML), 2016.
  28. Algorithmic Learning in a Random World. Springer, 2005.
  29. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1–2):1–305, 2008.
  30. Mitigating neural network overconfidence with logit normalization. In International Conference on Machine Learning (ICML), pp. 23631–23644. PMLR, 2022.
  31. Predicting conditional probability distributions: A connectionist approach. International Journal of Neural Systems, 6(2):109–118, 1995.
  32. Conformal risk control for ordinal classification. In Uncertainty in Artificial Intelligence (UAI), pp. 2346–2355. PMLR, 2023.
  33. Colorful image colorization. In European Conference on Computer Vision (ECCV), 2016.
Authors (5)
  1. Etash Guha (8 papers)
  2. Shlok Natarajan (2 papers)
  3. Thomas Möllenhoff (26 papers)
  4. Mohammad Emtiyaz Khan (56 papers)
  5. Eugene Ndiaye (22 papers)
Citations (9)

Summary

  • The paper converts regression into classification by discretizing the continuous output space into bins, so that classification-based conformal prediction can be applied.
  • It introduces a novel loss function that penalizes probability mass placed far from the true label, with entropy regularization to preserve ordinal information.
  • Empirical tests on synthetic and real datasets show that the method yields shorter prediction intervals while maintaining target coverage, even for complex output distributions.

Conformal Prediction via Regression-as-Classification

This paper presents a method for applying conformal prediction (CP) to regression tasks by transforming regression into a classification problem, an approach the authors abbreviate R2CCP ("Regression-as-Classification"). The motivation stems from the difficulty of CP in regression when the output distribution is heteroscedastic, multimodal, or skewed, settings in which traditional methods can produce unstable prediction intervals due to sensitivity to estimation errors.

Main Contributions

  1. Transformation of Regression to Classification: The core idea is to discretize the continuous output space into bins, treating each bin as a distinct class, thereby converting the regression problem into a classification task. This discretization enables the application of classification-based CP methods to regression problems.
  2. Modification of Loss Function: To address the loss of ordinal information inherent in the transformation, the authors propose a novel loss function. It penalizes the allocation of probability mass far from the true label while incorporating entropy regularization to maintain flexibility in the learned distribution (see the sketch after this list).
  3. Empirical Validation: The method is empirically validated on both synthetic and real datasets, demonstrating superior performance in terms of prediction interval length while maintaining the desired coverage levels. The results indicate that the approach is particularly effective in scenarios with non-trivial label noise and complex output distributions, such as heteroscedasticity and bimodality.
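
The following sketch illustrates how such a loss could be implemented. It is a minimal interpretation of the ideas above, not the authors' code: the bin layout, the distance power `q`, and the entropy weight `tau` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distance_entropy_loss(logits, y, bin_midpoints, q=2.0, tau=0.1):
    """Penalize probability mass placed far from the true label, with an
    entropy regularizer that keeps the learned distribution flexible.

    logits: (batch, K) scores over K bins; y: (batch,) continuous targets.
    """
    probs = F.softmax(logits, dim=-1)                                # (batch, K)
    # Distance of every bin midpoint from the true continuous label.
    dist = (bin_midpoints.unsqueeze(0) - y.unsqueeze(1)).abs() ** q  # (batch, K)
    # Expected distance under the predicted distribution.
    distance_term = (probs * dist).sum(dim=-1)
    # Entropy of the predicted distribution (higher = less peaked).
    entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1)
    return (distance_term - tau * entropy).mean()

# Discretization (contribution 1): K equally spaced bins over the label range.
# bin_midpoints = torch.linspace(y_train.min(), y_train.max(), steps=K)
```

Minimizing the expected distance respects the ordering of the output space, while the entropy term discourages the classifier from collapsing all probability mass into a single bin.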

Detailed Methodology

  • Discretization and Loss Regularization: The output space of the regression is discretized into bins, allowing the use of classification CP techniques. The proposed loss function incorporates a distance penalty and entropy regularization to handle the discrete nature of bins without losing the inherent order of the regression problem.
  • Training Framework: A neural network with a softmax output over the bins models the probability distribution over classes, allowing complex label distributions to be learned under uncertainty.
  • Comparison with Existing Methods: The authors compare their approach with several existing CP methods, such as Conformalized Quantile Regression (CQR) and Distributional Conformal Prediction (DCP). R2CCP consistently achieves shorter intervals without compromising coverage guarantees, particularly on datasets with intricate label distributions. A sketch of the conformal step appears after this list.
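
For concreteness, here is a hedged sketch of the split-conformal step on top of the trained classifier. It uses a simplified conformity score (the probability assigned to the true bin) rather than the paper's interpolated variant; the function name and the finite-sample correction shown are illustrative.

```python
import numpy as np

def conformal_prediction_sets(probs_cal, y_cal_bins, probs_test, alpha=0.1):
    """probs_*: (n, K) softmax outputs; y_cal_bins: (n,) true bin indices."""
    n = len(y_cal_bins)
    # Conformity score: probability the model assigns to the true bin.
    scores = probs_cal[np.arange(n), y_cal_bins]
    # Calibrated threshold: roughly the floor(alpha * (n + 1))-th smallest score.
    threshold = np.quantile(scores, np.floor(alpha * (n + 1)) / n)
    # Keep every bin at least as plausible as the threshold; the union of
    # the selected bins' sub-intervals forms the regression prediction set.
    return [np.flatnonzero(p >= threshold) for p in probs_test]
```

Because the selected bins need not be contiguous, the resulting prediction set can naturally capture multimodal outputs, which interval-only methods such as CQR cannot.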

Implications and Future Work

The proposed method not only simplifies the application of CP to regression problems by leveraging the robust algorithms established for classification but also provides a flexible framework that can be tailored for various application domains where prediction reliability is crucial. Potential future work could focus on further refining the loss function to enhance training efficiency, exploring alternative binning strategies, and extending the approach to handle multivariate outputs.

The paper makes an important contribution by bridging the gap between conformal techniques in classification and regression, offering a practical and robust solution for uncertainty quantification in predictive modeling. As the field of AI advances, the need for reliable and interpretable models in high-stakes applications will likely drive further research in this direction, potentially leading to enhanced models that can provide fine-grained uncertainty estimates across diverse application areas.
