Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima (2402.11262v1)

Published 17 Feb 2024 in cs.IR and cs.LG

Abstract: Multimodal recommender systems utilize various types of information to model user preferences and item features, helping users discover items aligned with their interests. The integration of multimodal information mitigates the inherent challenges in recommender systems, e.g., the data sparsity problem and cold-start issues. However, it simultaneously magnifies certain risks from multimodal information inputs, such as information adjustment risk and inherent noise risk. These risks pose crucial challenges to the robustness of recommendation models. In this paper, we analyze multimodal recommender systems from the novel perspective of flat local minima and propose a concise yet effective gradient strategy called Mirror Gradient (MG). This strategy can implicitly enhance the model's robustness during the optimization process, mitigating instability risks arising from multimodal information inputs. We also provide strong theoretical evidence and conduct extensive empirical experiments to show the superiority of MG across various multimodal recommendation models and benchmarks. Furthermore, we find that the proposed MG can complement existing robust training methods and be easily extended to diverse advanced recommendation models, making it a promising new and fundamental paradigm for training multimodal recommender systems. The code is released at https://github.com/Qrange-group/Mirror-Gradient.
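The abstract positions Mirror Gradient within the flat-local-minima family of training strategies. As a rough illustration of that family (this is a generic sharpness-aware-style step on a toy loss, not the paper's actual MG algorithm; see the released code at the repository above for the real method), the sketch below perturbs the weights toward a nearby "sharper" point and then descends using the gradient computed there, which biases optimization toward flatter regions:

```python
import numpy as np

# Toy quadratic loss with a known minimum at w = [1, -2].
TARGET = np.array([1.0, -2.0])

def loss(w):
    return 0.5 * np.sum((w - TARGET) ** 2)

def grad(w):
    return w - TARGET

def sharpness_aware_step(w, lr=0.1, rho=0.05):
    """One illustrative flat-minima-seeking update (SAM-style sketch)."""
    g = grad(w)
    norm = np.linalg.norm(g) + 1e-12
    w_adv = w + rho * g / norm   # climb to a nearby worst-case point
    g_adv = grad(w_adv)          # gradient there drives the actual update
    return w - lr * g_adv

w = np.array([5.0, 5.0])
for _ in range(200):
    w = sharpness_aware_step(w)
# w ends up near [1, -2], up to a small rho-dependent offset
```

The two-gradient structure (one evaluation to perturb, one to update) is the common thread across sharpness-aware methods; MG's contribution, per the abstract, is a concise gradient strategy in this spirit tailored to the instability risks of multimodal inputs.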
