Towards Human-AI Complementarity with Prediction Sets (2405.17544v2)
Abstract: Decision support systems based on prediction sets have proven to be effective at helping human experts solve classification tasks. Rather than providing single-label predictions, these systems provide sets of label predictions constructed using conformal prediction, namely prediction sets, and ask human experts to predict label values from these sets. In this paper, we first show that the prediction sets constructed using conformal prediction are, in general, suboptimal in terms of average accuracy. Then, we show that the problem of finding the optimal prediction sets under which the human experts achieve the highest average accuracy is NP-hard. More strongly, unless P = NP, we show that the problem is hard to approximate to any factor less than the size of the label set. However, we introduce a simple and efficient greedy algorithm that, for a large class of expert models and non-conformity scores, is guaranteed to find prediction sets that provably offer equal or greater performance than those constructed using conformal prediction. Further, using a simulation study with both synthetic and real expert predictions, we demonstrate that, in practice, our greedy algorithm finds near-optimal prediction sets offering greater performance than conformal prediction.
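The conformal prediction sets that the abstract uses as its baseline can be illustrated with a minimal split-conformal sketch. This is a sketch under stated assumptions, not the paper's greedy algorithm: the non-conformity score `1 - p(y | x)` is one common choice, and the names `conformal_threshold` and `prediction_set` are hypothetical and do not come from the paper.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Return the split-conformal threshold from a calibration set.

    cal_probs: list of per-sample probability lists from some classifier
    cal_labels: list of true label indices
    alpha: target miscoverage level (assumed score: 1 - prob of true label)
    """
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Rank of the finite-sample-corrected (1 - alpha) empirical quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points, only the full label set is valid.
    return scores[k - 1] if k <= n else float("inf")

def prediction_set(probs, qhat):
    """Keep every label whose non-conformity score is at most qhat."""
    return [y for y, p in enumerate(probs) if 1.0 - p <= qhat]

# Toy calibration data (made up for illustration only).
cal_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
cal_labels = [0, 1, 2, 1]
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
labels_shown_to_expert = prediction_set([0.5, 0.45, 0.05], qhat)
```

In a decision support system of the kind described above, the expert would then be asked to pick a label from `labels_shown_to_expert` rather than from the full label set.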
- Giovanni De Toni
- Nastaran Okati
- Suhas Thejaswi
- Eleni Straitouri
- Manuel Gomez-Rodriguez