Two-stage Conformal Risk Control with Application to Ranked Retrieval (2404.17769v2)
Abstract: Many practical machine learning systems, such as ranking and recommendation systems, consist of two concatenated stages: retrieval and ranking. These systems make it difficult to assess and manage the uncertainty inherent in their predictions. To address this, we extend the recently developed framework of conformal risk control, originally designed for single-stage problems, to the more complex two-stage setup. We first demonstrate that a straightforward application of conformal risk control, treating each stage independently, may fail to keep the risks at their pre-specified levels. We therefore propose an integrated approach that considers both stages simultaneously, devising algorithms that control the risk of each stage by jointly identifying thresholds for both stages. Our algorithm further optimizes a weighted combination of prediction set sizes across all feasible thresholds, resulting in more effective prediction sets. Finally, we apply the proposed method to the critical task of two-stage ranked retrieval. We validate the efficacy of our method through extensive experiments on two large-scale public datasets, MSLR-WEB and MS MARCO, commonly used for ranked retrieval tasks.
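The idea of jointly choosing thresholds for both stages can be illustrated with a minimal sketch. It is not the paper's algorithm and makes no claim to its guarantees. It assumes a hypothetical setup in which each stage keeps documents whose score is at least a threshold, the loss of each stage is the fraction of relevant documents missed (bounded by 1), and feasibility is checked with the standard conformal risk control adjustment (n * mean loss + B) / (n + 1) <= alpha. The function names, the miss-rate loss, and the grid search are all assumptions made for illustration.

```python
import numpy as np

def crc_adjusted_risk(losses, bound=1.0):
    """Empirical risk inflated as in conformal risk control: (n * mean + B) / (n + 1)."""
    n = losses.shape[0]
    return (n * losses.mean() + bound) / (n + 1)

def miss_rate(keep, relevant):
    """Per-query fraction of relevant documents not kept (0 if a query has none)."""
    n_rel = relevant.sum(axis=1)
    hit = (keep & relevant).sum(axis=1)
    return np.where(n_rel > 0, 1.0 - hit / np.maximum(n_rel, 1), 0.0)

def joint_thresholds(s1, s2, relevant, grid, alpha1, alpha2, weight=0.5):
    """Grid-search (t1, t2) so both adjusted risks meet their levels,
    minimizing a weighted sum of mean retrieved-set and final-set sizes.
    Hypothetical illustration; returns (cost, t1, t2) or None if infeasible."""
    best = None
    for t1 in grid:
        keep1 = s1 >= t1  # stage 1: retrieval set
        if crc_adjusted_risk(miss_rate(keep1, relevant)) > alpha1:
            continue
        for t2 in grid:
            keep2 = keep1 & (s2 >= t2)  # stage 2: final set, a subset of stage 1
            if crc_adjusted_risk(miss_rate(keep2, relevant)) > alpha2:
                continue
            cost = (weight * keep1.sum(axis=1).mean()
                    + (1.0 - weight) * keep2.sum(axis=1).mean())
            if best is None or cost < best[0]:
                best = (cost, float(t1), float(t2))
    return best
```

Here `s1` and `s2` are calibration score matrices of shape (queries, documents) and `relevant` is a boolean matrix of the same shape. Because the stage-2 set is nested inside the stage-1 set, a pair of thresholds calibrated one stage at a time can violate the stage-2 level, which is why the sketch checks both constraints for every pair.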