An Empirical Study of Training ID-Agnostic Multi-modal Sequential Recommenders (2403.17372v5)

Published 26 Mar 2024 in cs.IR

Abstract: Sequential Recommendation (SR) aims to predict future user-item interactions based on historical interactions. While many SR approaches concentrate on user IDs and item IDs, the way humans perceive the world through multi-modal signals, such as text and images, has inspired researchers to construct SR from multi-modal information without using IDs. However, the complexity of multi-modal learning manifests in diverse feature extractors, fusion methods, and pre-trained models. Consequently, designing a simple and universal Multi-Modal Sequential Recommendation (MMSR) framework remains a formidable challenge. We systematically summarize existing multi-modal SR methods and distill their essence into four core components: visual encoder, text encoder, multi-modal fusion module, and sequential architecture. Along these dimensions, we dissect the model designs and answer the following sub-questions: First, we explore how to construct MMSR from scratch, ensuring its performance is on par with or exceeds that of existing SR methods without complex techniques. Second, we examine whether MMSR can benefit from existing multi-modal pre-training paradigms. Third, we assess MMSR's capability in tackling common challenges such as cold start and domain transfer. Our experimental results across four real-world recommendation scenarios demonstrate the great potential of ID-agnostic multi-modal sequential recommendation. Our framework can be found at: https://github.com/MMSR23/MMSR
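To make the four-component decomposition concrete, below is a minimal PyTorch sketch of how a visual encoder, a text encoder, a multi-modal fusion module, and a sequential architecture could be wired together. Everything here is an illustrative assumption rather than the authors' implementation: the class name MMSRSketch, the concatenate-and-project fusion, the SASRec-style causal Transformer, and all dimensions are hypothetical choices, and the actual designs are documented in the linked repository.

```python
import torch
import torch.nn as nn


class MMSRSketch(nn.Module):
    """Hypothetical skeleton of the four MMSR components named in the
    abstract; not the authors' code, just one plausible wiring."""

    def __init__(self, img_dim=512, txt_dim=768, d_model=128,
                 n_heads=2, n_layers=2, max_len=50):
        super().__init__()
        # (1) Visual encoder: assumed to be a frozen backbone (e.g. a ViT)
        # producing img_dim features per item; here we only project them.
        self.visual_proj = nn.Linear(img_dim, d_model)
        # (2) Text encoder: assumed BERT-style features of size txt_dim.
        self.text_proj = nn.Linear(txt_dim, d_model)
        # (3) Multi-modal fusion: concatenate then project; one simple
        # choice among the fusion methods the paper compares.
        self.fusion = nn.Sequential(nn.Linear(2 * d_model, d_model), nn.GELU())
        # (4) Sequential architecture: a causal Transformer encoder over
        # the fused item representations (SASRec-style).
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, img_feats, txt_feats):
        # img_feats: (batch, seq_len, img_dim); txt_feats: (batch, seq_len, txt_dim)
        fused = self.fusion(torch.cat(
            [self.visual_proj(img_feats), self.text_proj(txt_feats)], dim=-1))
        seq_len = fused.size(1)
        positions = torch.arange(seq_len, device=fused.device)
        fused = fused + self.pos_emb(positions)
        # Causal mask so each position attends only to earlier interactions.
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                     device=fused.device), diagonal=1)
        hidden = self.encoder(fused, mask=mask)
        # The last hidden state would be scored against candidate item
        # embeddings to predict the next interaction.
        return hidden
```

Because item representations come only from modality features, such a model has no ID embedding table, which is what makes the cold-start and domain-transfer experiments in the paper possible: a new or cross-domain item is representable as soon as its image and text are available.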

Authors (7)
  1. Youhua Li (8 papers)
  2. Hanwen Du (11 papers)
  3. Yongxin Ni (15 papers)
  4. Yuanqi He (1 paper)
  5. Junchen Fu (14 papers)
  6. Xiangyan Liu (10 papers)
  7. Qi Guo (237 papers)