
Personalized Transformer-based Ranking for e-Commerce at Yandex (2310.03481v2)

Published 5 Oct 2023 in cs.IR

Abstract: Personalizing user experience with high-quality recommendations based on user activity is vital for e-commerce platforms. This is particularly important in scenarios where the user's intent is not explicit, such as on the homepage. Recently, personalized embedding-based systems have significantly improved the quality of recommendations and search in the e-commerce domain. However, most of these works focus on enhancing the retrieval stage. In this paper, we demonstrate that features produced by retrieval-focused deep learning models are sub-optimal for the ranking stage in e-commerce recommendations. To address this issue, we propose a two-stage training process that fine-tunes two-tower models to achieve optimal ranking performance. We provide a detailed description of our transformer-based two-tower model architecture, which is specifically designed for personalization in e-commerce. Additionally, we introduce a novel technique for debiasing context in offline models and report significant improvements in ranking performance when using web-search queries for e-commerce recommendations. Our model has been successfully deployed at Yandex, serves millions of users daily, and has delivered strong performance in online A/B testing.
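As a rough illustration of the two-tower idea mentioned in the abstract, the sketch below scores items by the similarity between a user embedding and item embeddings and then sorts by that score. It is not the paper's model: the transformer towers and the two-stage training are omitted, the embeddings are hand-written hypothetical values standing in for tower outputs, and the use of cosine similarity (rather than a raw dot product or another scoring function) is an assumption made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def score(user_emb, item_emb):
    """Cosine similarity between a user-tower and an item-tower output."""
    u = l2_normalize(user_emb)
    v = l2_normalize(item_emb)
    return sum(a * b for a, b in zip(u, v))


def rank_items(user_emb, item_embs):
    """Return item ids ordered from highest to lowest similarity score."""
    return sorted(
        item_embs,
        key=lambda item_id: score(user_emb, item_embs[item_id]),
        reverse=True,
    )


# Hypothetical tower outputs; in a real system these would come from
# trained user and item encoders.
user_embedding = [1.0, 0.0]
item_embeddings = {
    "item_a": [0.9, 0.1],
    "item_b": [0.0, 1.0],
    "item_c": [0.5, 0.5],
}

ranking = rank_items(user_embedding, item_embeddings)
print(ranking)
```

In a ranking-stage setting such as the one the abstract describes, a score like this would typically be one feature among several passed to a downstream ranker rather than the final ordering on its own.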

