Confidence Calibration for Recommender Systems and Its Applications (2402.16325v1)

Published 26 Feb 2024 in cs.IR

Abstract: Despite the importance of having a measure of confidence in recommendation results, confidence estimation has been surprisingly overlooked in the literature compared to recommendation accuracy. In this dissertation, I propose a model calibration framework for recommender systems that estimates accurate confidence in recommendation results from the learned ranking scores. I then introduce two real-world applications of confidence in recommendations: (1) training a small student model by treating the confidence of a large teacher model as additional learning guidance, and (2) adjusting the number of presented items based on the expected user utility estimated with the calibrated probabilities.

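To make the second application concrete, here is a minimal sketch of the general idea, not the dissertation's actual method: raw ranking scores are mapped to probabilities with Platt scaling (used here purely as a stand-in for the paper's calibration framework), and the ranked list is then cut where cumulative expected user utility peaks. The function names (`fit_platt`, `truncate_by_expected_utility`), the gain/cost utility model, and the synthetic data are all illustrative assumptions.

```python
"""Illustrative sketch: calibrate ranking scores, then truncate a ranked
list by expected utility. Assumptions, not the dissertation's code."""
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_platt(scores, labels, lr=0.1, epochs=500):
    """Fit p(relevant | score) = sigmoid(a * score + b) on held-out
    (score, 0/1 relevance) pairs by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    for _ in range(epochs):
        p = sigmoid(a * scores + b)
        grad = p - labels                     # d(logloss)/d(logit)
        a -= lr * np.mean(grad * scores)
        b -= lr * np.mean(grad)
    return a, b

def truncate_by_expected_utility(sorted_scores, a, b, gain=1.0, cost=0.25):
    """Cut a descending ranked list where cumulative expected utility
    peaks; each shown item contributes gain * p(relevant) - cost.
    gain and cost are hypothetical utility parameters."""
    p = sigmoid(a * np.asarray(sorted_scores) + b)
    marginal = gain * p - cost                # expected utility per item
    cumulative = np.cumsum(marginal)
    k = int(np.argmax(cumulative)) + 1        # best cutoff (item count)
    return k if cumulative[k - 1] > 0 else 0

# Toy usage: calibrate on synthetic validation pairs, then pick a cutoff.
rng = np.random.default_rng(0)
val_scores = rng.normal(size=1000)
val_labels = (rng.random(1000) < sigmoid(2.0 * val_scores - 0.5)).astype(float)
a, b = fit_platt(val_scores, val_labels)
ranked = np.sort(rng.normal(loc=0.5, size=50))[::-1]  # descending scores
k = truncate_by_expected_utility(ranked, a, b)
print(f"Present top-{k} items")
```

Under this utility model an item is worth showing only while its expected gain exceeds its presentation cost, so better-calibrated probabilities translate directly into better cutoff decisions; this is why calibration, rather than ranking accuracy alone, matters for deciding how many items to present.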
