Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
102 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

TWIN V2: Scaling Ultra-Long User Behavior Sequence Modeling for Enhanced CTR Prediction at Kuaishou (2407.16357v2)

Published 23 Jul 2024 in cs.IR and cs.AI

Abstract: The significance of modeling long-term user interests for CTR prediction tasks in large-scale recommendation systems is progressively gaining attention among researchers and practitioners. Existing work, such as SIM and TWIN, typically employs a two-stage approach to model long-term user behavior sequences for efficiency concerns. The first stage rapidly retrieves a subset of sequences related to the target item from a long sequence using a search-based mechanism namely the General Search Unit (GSU), while the second stage calculates the interest scores using the Exact Search Unit (ESU) on the retrieved results. Given the extensive length of user behavior sequences spanning the entire life cycle, potentially reaching up to 106 in scale, there is currently no effective solution for fully modeling such expansive user interests. To overcome this issue, we introduced TWIN-V2, an enhancement of TWIN, where a divide-and-conquer approach is applied to compress life-cycle behaviors and uncover more accurate and diverse user interests. Specifically, a hierarchical clustering method groups items with similar characteristics in life-cycle behaviors into a single cluster during the offline phase. By limiting the size of clusters, we can compress behavior sequences well beyond the magnitude of 105 to a length manageable for online inference in GSU retrieval. Cluster-aware target attention extracts comprehensive and multi-faceted long-term interests of users, thereby making the final recommendation results more accurate and diverse. Extensive offline experiments on a multi-billion-scale industrial dataset and online A/B tests have demonstrated the effectiveness of TWIN-V2. Under an efficient deployment framework, TWIN-V2 has been successfully deployed to the primary traffic that serves hundreds of millions of daily active users at Kuaishou.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (24)
  1. Sampling Is All You Need on Modeling Long-Term User Behaviors for CTR Prediction. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (Atlanta, GA, USA) (CIKM ’22). Association for Computing Machinery, New York, NY, USA, 2974–2983. https://doi.org/10.1145/3511808.3557082
  2. TWIN: TWo-Stage Interest Network for Lifelong User Behavior Modeling in CTR Prediction at Kuaishou. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’23). Association for Computing Machinery, New York, NY, USA, 3785–3794. https://doi.org/10.1145/3580305.3599922
  3. End-to-End User Behavior Retrieval in Click-Through RatePrediction Model. arXiv:cs.IR/2108.04468
  4. Behavior Sequence Transformer for E-commerce Recommendation in Alibaba. CoRR abs/1905.06874 (2019). arXiv:1905.06874 http://arxiv.org/abs/1905.06874
  5. Wide & Deep Learning for Recommender Systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (Boston, MA, USA) (DLRS 2016). Association for Computing Machinery, New York, NY, USA, 7–10. https://doi.org/10.1145/2988450.2988454
  6. Deep Neural Networks for YouTube Recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems (Boston, Massachusetts, USA) (RecSys ’16). Association for Computing Machinery, New York, NY, USA, 191–198. https://doi.org/10.1145/2959100.2959190
  7. Deep Session Interest Network for Click-through Rate Prediction. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (Macao, China) (IJCAI’19). AAAI Press, 2301–2307.
  8. DeepFM: A Factorization-Machine Based Neural Network for CTR Prediction. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (Melbourne, Australia) (IJCAI’17). AAAI Press, 1725–1731.
  9. Field-Aware Factorization Machines for CTR Prediction. In Proceedings of the 10th ACM Conference on Recommender Systems (Boston, Massachusetts, USA) (RecSys ’16). Association for Computing Machinery, New York, NY, USA, 43–50. https://doi.org/10.1145/2959100.2959134
  10. Deep Group Interest Modeling of Full Lifelong User Behaviors for CTR Prediction. arXiv:cs.IR/2311.10764
  11. Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (Ann Arbor, MI, USA) (SIGIR ’18). Association for Computing Machinery, New York, NY, USA, 1137–1140. https://doi.org/10.1145/3209978.3210104
  12. Practice on Long Sequential User Behavior Modeling for Click-Through Rate Prediction. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (Anchorage, AK, USA) (KDD ’19). Association for Computing Machinery, New York, NY, USA, 2671–2679. https://doi.org/10.1145/3292500.3330666
  13. Search-Based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (Virtual Event, Ireland) (CIKM ’20). Association for Computing Machinery, New York, NY, USA, 2685–2692. https://doi.org/10.1145/3340531.3412744
  14. Learning to Retrieve User Behaviors for Click-through Rate Estimation. ACM Trans. Inf. Syst. 41, 4, Article 98 (apr 2023), 31 pages. https://doi.org/10.1145/3579354
  15. User Behavior Retrieval for Click-Through Rate Prediction. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (Virtual Event, China) (SIGIR ’20). Association for Computing Machinery, New York, NY, USA, 2347–2356. https://doi.org/10.1145/3397271.3401440
  16. Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (Paris, France) (SIGIR’19). Association for Computing Machinery, New York, NY, USA, 565–574. https://doi.org/10.1145/3331184.3331230
  17. Steffen Rendle. 2010. Factorization Machines. In 2010 IEEE International Conference on Data Mining. 995–1000. https://doi.org/10.1109/ICDM.2010.127
  18. Predicting Clicks: Estimating the Click-through Rate for New Ads. In Proceedings of the 16th International Conference on World Wide Web (Banff, Alberta, Canada) (WWW ’07). Association for Computing Machinery, New York, NY, USA, 521–530. https://doi.org/10.1145/1242572.1242643
  19. AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (Beijing, China) (CIKM ’19). Association for Computing Machinery, New York, NY, USA, 1161–1170. https://doi.org/10.1145/3357384.3357925
  20. Deep & Cross Network for Ad Click Predictions. In Proceedings of the ADKDD’17 (Halifax, NS, Canada) (ADKDD’17). Association for Computing Machinery, New York, NY, USA, Article 12, 7 pages. https://doi.org/10.1145/3124749.3124754
  21. DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-Scale Learning to Rank Systems. In Proceedings of the Web Conference 2021 (Ljubljana, Slovenia) (WWW ’21). Association for Computing Machinery, New York, NY, USA, 1785–1797. https://doi.org/10.1145/3442381.3450078
  22. Clustering Based Behavior Sampling with Long Sequential Data for CTR Prediction. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (¡conf-loc¿, ¡city¿Madrid¡/city¿, ¡country¿Spain¡/country¿, ¡/conf-loc¿) (SIGIR ’22). Association for Computing Machinery, New York, NY, USA, 2195–2200. https://doi.org/10.1145/3477495.3531829
  23. Deep Interest Evolution Network for Click-Through Rate Prediction. Proceedings of the AAAI Conference on Artificial Intelligence 33, 01 (Jul. 2019), 5941–5948. https://doi.org/10.1609/aaai.v33i01.33015941
  24. Deep Interest Network for Click-Through Rate Prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (London, United Kingdom) (KDD ’18). Association for Computing Machinery, New York, NY, USA, 1059–1068. https://doi.org/10.1145/3219819.3219823
User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (15)
  1. Zihua Si (12 papers)
  2. Lin Guan (25 papers)
  3. Zhongxiang Sun (21 papers)
  4. Xiaoxue Zang (28 papers)
  5. Jing Lu (158 papers)
  6. Yiqun Hui (5 papers)
  7. Xingchao Cao (1 paper)
  8. Zeyu Yang (27 papers)
  9. Yichen Zheng (4 papers)
  10. Dewei Leng (6 papers)
  11. Kai Zheng (134 papers)
  12. Chenbin Zhang (11 papers)
  13. Yanan Niu (21 papers)
  14. Yang Song (299 papers)
  15. Kun Gai (125 papers)
Citations (3)
X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets