Multi-Margin Cosine Loss: Proposal and Application in Recommender Systems (2405.04614v3)
Abstract: Recommender systems guide users through vast amounts of information by suggesting items based on their predicted preferences. Collaborative filtering-based deep learning techniques have regained popularity due to their straightforward nature, relying only on user-item interactions. Typically, these systems consist of three main components: an interaction module, a loss function, and a negative sampling strategy. Initially, researchers focused on enhancing performance by developing complex interaction modules. However, there has been a recent shift toward refining loss functions and negative sampling strategies. This shift has led to an increased interest in contrastive learning, which pulls similar pairs closer while pushing dissimilar ones apart. Contrastive learning can, however, bring challenges such as high memory demands and under-utilization of some negative samples. The proposed Multi-Margin Cosine Loss (MMCL) addresses these challenges by introducing multiple margins and varying weights for negative samples. It efficiently utilizes not only the hardest negatives but also other non-trivial negatives, offering a simple yet effective loss function that outperforms more complex methods, especially when resources are limited. Experiments on two well-known datasets demonstrated that MMCL achieved up to a 20% performance improvement compared to a baseline loss function when fewer negative samples are used.
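The abstract describes MMCL only at a high level: several margins, each with its own weight, applied to negative samples. The following is a minimal sketch of that idea, not the paper's exact formulation. It assumes a cosine-similarity hinge on negatives, a `1 - cos` term for the positive item, and averaging over negatives. The function names, margin values, and weight values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_sim(user, items):
    """Cosine similarity between one user vector (d,) and item rows (n, d)."""
    user = user / np.linalg.norm(user)
    items = items / np.linalg.norm(items, axis=1, keepdims=True)
    return items @ user


def multi_margin_cosine_loss(user, pos_item, neg_items,
                             margins=(0.9, 0.6), weights=(1.0, 0.5)):
    """Illustrative multi-margin cosine loss for a single user.

    Assumed form (not taken from the paper):
        (1 - cos(u, i+)) + sum_k w_k * mean_j max(0, cos(u, j-) - m_k)
    A high margin penalizes only the hardest negatives; a lower margin
    with a smaller weight also draws signal from less hard negatives.
    """
    pos_sim = cosine_sim(user, pos_item[None, :])[0]
    neg_sim = cosine_sim(user, neg_items)

    pos_term = 1.0 - pos_sim
    neg_term = 0.0
    for margin, weight in zip(margins, weights):
        # Hinge: only negatives whose similarity exceeds this margin contribute.
        neg_term += weight * np.mean(np.maximum(0.0, neg_sim - margin))
    return pos_term + neg_term


if __name__ == "__main__":
    user = np.array([1.0, 0.0])
    pos = np.array([1.0, 0.0])
    negs = np.array([[1.0, 0.0],   # a very hard negative (similarity 1)
                     [0.0, 1.0]])  # a trivial negative (similarity 0)
    print(multi_margin_cosine_loss(user, pos, negs))
    print(multi_margin_cosine_loss(user, pos, negs,
                                   margins=(0.9,), weights=(1.0,)))
```

With a single margin the sketch reduces to a plain cosine contrastive loss. The additional, lower margin is what lets negatives that fall below the strictest threshold still contribute to the loss, which is the behaviour the abstract attributes to MMCL.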