Which Tricks Are Important for Learning to Rank?

Published 4 Apr 2022 in cs.LG (arXiv:2204.01500v2)

Abstract: Nowadays, state-of-the-art learning-to-rank methods are based on gradient-boosted decision trees (GBDT). The most well-known algorithm is LambdaMART, which was proposed more than a decade ago. Recently, several other GBDT-based ranking algorithms were proposed. In this paper, we thoroughly analyze these methods in a unified setup. In particular, we address the following questions. Is direct optimization of a smoothed ranking loss preferable to optimizing a convex surrogate? How should surrogate ranking losses be constructed and smoothed? To address these questions, we compare LambdaMART with the YetiRank and StochasticRank methods and their modifications. We also propose a simple improvement of the YetiRank approach that allows for optimizing specific ranking loss functions. As a result, we gain insights into learning-to-rank techniques and obtain a new state-of-the-art algorithm.
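To make the comparison concrete: LambdaMART-style methods do not differentiate the ranking metric directly; instead they use "lambda" gradients, where each pairwise RankNet gradient is scaled by the change in NDCG obtained by swapping the two documents. The following is a minimal illustrative sketch of that idea for a single query, not the paper's exact formulation (function names and the gain/discount choices are conventional assumptions):

```python
import numpy as np

def dcg_gain(rel):
    # Standard exponential gain used in NDCG
    return 2.0 ** rel - 1.0

def lambda_gradients(scores, relevances):
    """Illustrative LambdaRank-style gradients for one query.

    For each pair (i, j) with rel_i > rel_j, the pairwise RankNet
    gradient is scaled by |delta NDCG|, the change in NDCG that
    swapping the two documents in the current ranking would cause.
    """
    n = len(scores)
    order = np.argsort(-scores)            # current ranking by score
    rank = np.empty(n, dtype=int)
    rank[order] = np.arange(n)             # rank[i] = position of doc i
    discount = 1.0 / np.log2(rank + 2.0)   # log position discounts

    # Ideal DCG for normalization
    ideal = np.sort(dcg_gain(relevances))[::-1]
    idcg = np.sum(ideal / np.log2(np.arange(n) + 2.0))

    lambdas = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if relevances[i] <= relevances[j]:
                continue
            # |delta NDCG| if documents i and j traded positions
            delta = abs((dcg_gain(relevances[i]) - dcg_gain(relevances[j]))
                        * (discount[i] - discount[j])) / idcg
            # RankNet pairwise sigmoid gradient, scaled by |delta NDCG|
            rho = 1.0 / (1.0 + np.exp(scores[i] - scores[j]))
            lambdas[i] += delta * rho
            lambdas[j] -= delta * rho
    return lambdas
```

In GBDT rankers these per-document lambdas play the role of (negative) gradients fitted by the next tree; the alternatives studied in the paper (YetiRank, StochasticRank) differ in how this surrogate is constructed or how the discrete metric is smoothed.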

