General Item Representation Learning for Cold-start Content Recommendations (2404.13808v1)
Abstract: Cold-start item recommendation is a long-standing challenge in recommendation systems. A common remedy is a content-based approach, but the rich information available in raw content of various forms has not been fully utilized. In this paper, we propose a domain- and data-agnostic item representation learning framework for cold-start recommendation that adopts a Transformer-based architecture and is thereby naturally equipped with multimodal alignment among the various features. The proposed model is end-to-end trainable without any classification labels, which are not only costly to collect but also suboptimal for learning representations intended for recommendation. Through extensive experiments on real-world movie and news recommendation benchmarks, we verify that our approach preserves fine-grained user taste better than state-of-the-art baselines and that it is applicable to multiple domains at large scale.