
LGMRec: Local and Global Graph Learning for Multimodal Recommendation (2312.16400v2)

Published 27 Dec 2023 in cs.IR

Abstract: Multimodal recommendation has gradually become the infrastructure of online media platforms, enabling them to provide personalized service to users through joint modeling of users' historical behaviors (e.g., purchases, clicks) and items' various modalities (e.g., visual and textual). Most existing studies focus on utilizing modal features or modality-related graph structure to learn local user interests. Nevertheless, these approaches encounter two limitations: (1) shared updates of user ID embeddings couple collaborative and multimodal signals; (2) robust global user interests, which could alleviate the sparse-interaction problem faced by local interest modeling, remain unexplored. To address these issues, we propose a novel Local and Global Graph Learning-guided Multimodal Recommender (LGMRec), which jointly models local and global user interests. Specifically, we present a local graph embedding module that independently learns collaborative-related and modality-related embeddings of users and items from local topological relations. Moreover, a global hypergraph embedding module is designed to capture global user and item embeddings by modeling global dependency relations. The global embeddings acquired within the hypergraph embedding space can then be combined with the two decoupled local embeddings to improve the accuracy and robustness of recommendations. Extensive experiments conducted on three benchmark datasets demonstrate the superiority of LGMRec over various state-of-the-art recommendation baselines, showcasing its effectiveness in modeling both local and global user interests.
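
The three-part structure described in the abstract (two decoupled local embeddings plus one global hypergraph embedding, combined at the end) can be illustrated with a rough NumPy sketch. This is not the authors' implementation. The LightGCN-style propagation, the softmax hyperedge assignment, the additive fusion, the random untrained weights, and every name here (`lgmrec_sketch`, `alpha`, `n_hyperedges`, and so on) are assumptions chosen only to show how data might flow through such a model.

```python
import numpy as np

def normalize_bipartite(R):
    """Symmetrically normalize a user-item interaction matrix by node degree."""
    R = np.asarray(R, dtype=float)
    du = R.sum(axis=1, keepdims=True)
    di = R.sum(axis=0, keepdims=True)
    du[du == 0] = 1.0  # avoid division by zero for isolated users
    di[di == 0] = 1.0  # avoid division by zero for isolated items
    return R / np.sqrt(du) / np.sqrt(di)

def propagate(R_norm, user_emb, item_emb, n_layers=2):
    """Assumed LightGCN-style message passing; layer outputs are averaged."""
    users, items = [user_emb], [item_emb]
    for _ in range(n_layers):
        next_users = R_norm @ items[-1]
        next_items = R_norm.T @ users[-1]
        users.append(next_users)
        items.append(next_items)
    return np.mean(users, axis=0), np.mean(items, axis=0)

def softmax(x):
    """Row-wise softmax."""
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def l2_normalize(x):
    """Scale each row to unit length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)

def lgmrec_sketch(R, item_feat, dim=8, n_hyperedges=4, alpha=0.2, seed=0):
    """Hypothetical forward pass with random (untrained) parameters."""
    rng = np.random.default_rng(seed)
    R = np.asarray(R, dtype=float)
    item_feat = np.asarray(item_feat, dtype=float)
    n_users, n_items = R.shape
    R_norm = normalize_bipartite(R)

    # Local branch 1: collaborative embeddings from ID embeddings only.
    user_id = rng.normal(scale=0.1, size=(n_users, dim))
    item_id = rng.normal(scale=0.1, size=(n_items, dim))
    u_cf, i_cf = propagate(R_norm, user_id, item_id)

    # Local branch 2: modality embeddings. Users get their own modality
    # representation from interacted items, so ID embeddings are not shared.
    W = rng.normal(scale=0.1, size=(item_feat.shape[1], dim))
    item_mod = item_feat @ W
    user_mod = R_norm @ item_mod
    u_mm, i_mm = propagate(R_norm, user_mod, item_mod)

    # Global branch: soft membership of items and users in hyperedges,
    # then message passing through the hyperedges.
    V = rng.normal(scale=0.1, size=(item_feat.shape[1], n_hyperedges))
    item_h = softmax(item_feat @ V)   # (n_items, n_hyperedges)
    user_h = softmax(R @ item_h)      # (n_users, n_hyperedges)
    u_glob = user_h @ (item_h.T @ i_cf)
    i_glob = item_h @ (user_h.T @ u_cf)

    # Fusion (assumed additive): local embeddings plus scaled global ones.
    user_final = u_cf + u_mm + alpha * l2_normalize(u_glob)
    item_final = i_cf + i_mm + alpha * l2_normalize(i_glob)
    scores = user_final @ item_final.T  # inner-product preference scores
    return user_final, item_final, scores

# Toy usage: 2 users, 3 items, 5-dimensional item features.
R_toy = np.array([[1, 0, 1], [0, 1, 0]])
feat_toy = np.random.default_rng(1).normal(size=(3, 5))
user_vecs, item_vecs, score_matrix = lgmrec_sketch(R_toy, feat_toy)
```

In a trained model the random matrices above would be learned parameters, typically optimized with a ranking loss; the sketch only shows how the collaborative, modality, and hypergraph branches stay separate until the final fusion step.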

Authors (6)
  1. Zhiqiang Guo (10 papers)
  2. Jianjun Li (15 papers)
  3. Guohui Li (12 papers)
  4. Chaoyang Wang (52 papers)
  5. Si Shi (5 papers)
  6. Bin Ruan (1 paper)
