
Attention-guided Multi-step Fusion: A Hierarchical Fusion Network for Multimodal Recommendation (2304.11979v1)

Published 24 Apr 2023 in cs.IR and cs.MM

Abstract: The main idea of multimodal recommendation is to make rational use of an item's multimodal information to improve recommendation performance. Previous works directly integrate item multimodal features with item ID embeddings, ignoring the inherent semantic relations contained in the multimodal features. In this paper, we propose a novel and effective aTtention-guided Multi-step FUsion Network for multimodal recommendation, named TMFUN. Specifically, our model first constructs a modality feature graph and an item feature graph to model the latent item-item semantic structures. Then, we use an attention module to identify inherent connections between user-item interaction data and multimodal data, evaluate the impact of multimodal data on different interactions, and achieve early-step fusion of item features. Furthermore, our model optimizes the item representation through the attention-guided multi-step fusion strategy and contrastive learning to improve recommendation performance. Extensive experiments on three real-world datasets show that our model outperforms state-of-the-art models.
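The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names: an item-item graph built from modality features, and attention-weighted fusion of modality features with item ID embeddings. Everything here is an assumption made for illustration and is not the TMFUN architecture: the cosine-similarity kNN graph, the dot-product attention scores, the residual sum, and the hypothetical names `knn_graph` and `attention_fuse`.

```python
import numpy as np


def knn_graph(features, k):
    """Row-normalized kNN adjacency from cosine similarity (assumed construction)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / np.maximum(norms, 1e-12)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops from the neighbor search
    neighbors = np.argsort(-sim, axis=1)[:, :k]  # k most similar items per row
    adj = np.zeros(sim.shape)
    rows = np.arange(sim.shape[0])[:, None]
    adj[rows, neighbors] = 1.0
    return adj / k  # every row has exactly k neighbors, so rows sum to 1


def softmax(x, axis=-1):
    shifted = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention_fuse(id_emb, modality_embs):
    """Fuse modality features into ID embeddings with per-item attention.

    id_emb: (n_items, dim); modality_embs: (n_modalities, n_items, dim).
    Returns the fused (n_items, dim) embeddings and (n_items, n_modalities) weights.
    """
    dim = id_emb.shape[1]
    # Assumed scoring rule: scaled dot product between ID and modality embeddings.
    scores = np.einsum("nd,mnd->nm", id_emb, modality_embs) / np.sqrt(dim)
    weights = softmax(scores, axis=1)
    fused = id_emb + np.einsum("nm,mnd->nd", weights, modality_embs)
    return fused, weights


# Toy data: 6 items, 4-dimensional visual, textual, and ID embeddings.
rng = np.random.default_rng(0)
n_items, dim = 6, 4
visual = rng.normal(size=(n_items, dim))
textual = rng.normal(size=(n_items, dim))
id_emb = rng.normal(size=(n_items, dim))

# Smooth each modality over its own item-item graph, then fuse.
visual_adj = knn_graph(visual, k=2)
textual_adj = knn_graph(textual, k=2)
modality_embs = np.stack([visual_adj @ visual, textual_adj @ textual])
fused, weights = attention_fuse(id_emb, modality_embs)
```

In this sketch the attention weights decide, per item, how much each modality contributes before the result is added to the ID embedding. A trained model would learn projection matrices for the scores rather than use raw dot products.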

Authors (5)
  1. Yan Zhou (206 papers)
  2. Jie Guo (67 papers)
  3. Hao Sun (383 papers)
  4. Bin Song (19 papers)
  5. Fei Richard Yu (31 papers)
Citations (10)
