
Position Bias Estimation with Item Embedding for Sparse Dataset (2305.13931v3)

Published 10 May 2023 in cs.IR

Abstract: Estimating position bias is a well-known challenge in Learning to Rank (L2R). Click data in e-commerce applications, such as targeted advertisements and search engines, provides implicit but abundant feedback to improve personalized rankings. However, click data inherently includes various biases like position bias. Based on the position-based click model, Result Randomization and Regression Expectation-Maximization algorithm (REM) have been proposed to estimate position bias, but they require various paired observations of (item, position). In real-world scenarios of advertising, marketers frequently display advertisements in a fixed pre-determined order, which creates difficulties in estimation due to the limited availability of various pairs in the training data, resulting in a sparse dataset. We propose a variant of the REM that utilizes item embeddings to alleviate the sparsity of (item, position). Using a public dataset and internal carousel advertisement click dataset, we empirically show that item embedding with Latent Semantic Indexing (LSI) and Variational Auto-Encoder (VAE) improves the accuracy of position bias estimation and the estimated position bias enhances Learning to Rank performance. We also show that LSI is more effective as an embedding creation method for position bias estimation.
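The abstract builds on the position-based click model (PBM), where the probability of a click factors into an examination probability per position and a relevance per item, and on Regression EM for estimating those factors from click logs. The sketch below, with illustrative names throughout, shows plain EM under the PBM on synthetic logs; the paper's variant additionally regresses item relevance on LSI or VAE item embeddings to cope with sparse (item, position) pairs, which is omitted here.

```python
import random

# Position-based click model (PBM) sketch:
#   P(click | item i at position k) = theta[k] * gamma[i]
# theta[k]: examination probability of position k (the "position bias")
# gamma[i]: relevance of item i
# All parameter names and sizes are illustrative, not from the paper.

random.seed(0)
n_items, n_pos = 20, 4
true_theta = [1.0, 0.6, 0.4, 0.2]                      # bias to recover
true_gamma = [random.uniform(0.1, 0.9) for _ in range(n_items)]

# Simulate click logs as (item, position, click) triples.
logs = []
for _ in range(30000):
    i = random.randrange(n_items)
    k = random.randrange(n_pos)
    c = 1 if random.random() < true_theta[k] * true_gamma[i] else 0
    logs.append((i, k, c))

theta = [0.5] * n_pos        # EM initialisation
gamma = [0.5] * n_items

for _ in range(30):
    # E-step: posterior of "examined" (E) and "relevant" (R) per log row.
    e_num = [0.0] * n_pos;   e_den = [0] * n_pos
    r_num = [0.0] * n_items; r_den = [0] * n_items
    for i, k, c in logs:
        if c == 1:
            p_e, p_r = 1.0, 1.0                        # click => E=1 and R=1
        else:
            denom = 1.0 - theta[k] * gamma[i]          # P(no click)
            p_e = theta[k] * (1.0 - gamma[i]) / denom  # P(E=1 | C=0)
            p_r = (1.0 - theta[k]) * gamma[i] / denom  # P(R=1 | C=0)
        e_num[k] += p_e; e_den[k] += 1
        r_num[i] += p_r; r_den[i] += 1
    # M-step: parameters become posterior means over their impressions.
    theta = [e_num[k] / e_den[k] for k in range(n_pos)]
    gamma = [r_num[i] / r_den[i] for i in range(n_items)]

# theta is identifiable only up to a scale factor shared with gamma,
# so normalise by the first position before inspecting it.
est = [t / theta[0] for t in theta]
```

Note that the sparse-data problem the paper targets arises exactly here: each `gamma[i]` update averages only over that item's own impressions, so an item always shown at one fixed position gives EM little signal; regressing `gamma` on item embeddings lets similar items share that signal.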

