MMGRec: Multimodal Generative Recommendation with Transformer Model (2404.16555v1)

Published 25 Apr 2024 in cs.IR

Abstract: Multimodal recommendation aims to recommend user-preferred candidates based on the user's historically interacted items and the associated multimodal information. Previous studies commonly employ an embed-and-retrieve paradigm: they learn user and item representations in the same embedding space, then retrieve similar candidate items for a user via the embedding inner product. However, this paradigm suffers from inference cost, interaction modeling, and false-negative issues. To this end, we propose a new model, MMGRec, that introduces a generative paradigm into multimodal recommendation. Specifically, we first devise a hierarchical quantization method, Graph RQ-VAE, to assign a Rec-ID to each item from its multimodal and collaborative filtering (CF) information. Consisting of a tuple of semantically meaningful tokens, the Rec-ID serves as the unique identifier of each item. Afterward, we train a Transformer-based recommender to generate the Rec-IDs of user-preferred items based on historical interaction sequences. The paradigm qualifies as generative because the model predicts the tuple of tokens identifying the recommended item in an autoregressive manner. Moreover, a relation-aware self-attention mechanism is devised for the Transformer to handle non-sequential interaction sequences; it explores pairwise relations between elements to replace absolute positional encoding. Extensive experiments evaluate MMGRec's effectiveness compared with state-of-the-art methods.
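
To make the idea of a Rec-ID as a tuple of tokens more concrete, the following is a minimal sketch of plain residual quantization, the general technique that RQ-VAE-style methods build on. It is not the authors' Graph RQ-VAE: the codebooks here are fixed toy values rather than learned, no graph or multimodal encoder is involved, and nothing guarantees that two items receive different tuples. The names `nearest_code`, `residual_quantize`, `codebooks`, and `item_embedding` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def nearest_code(vec: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to vec (squared Euclidean distance)."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def residual_quantize(
    vec: Sequence[float], codebooks: Sequence[Sequence[Sequence[float]]]
) -> Tuple[int, ...]:
    """Quantize vec level by level; each level encodes what the previous levels left over."""
    residual: List[float] = list(vec)
    tokens: List[int] = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        tokens.append(idx)
        # Subtract the chosen code vector so the next level sees only the remaining error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(tokens)


# Hypothetical toy codebooks: two levels, two 2-D code vectors each (assumed, not from the paper).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],      # coarse level
    [[0.0, 0.0], [0.25, -0.25]],   # finer level, applied to the residual
]

item_embedding = [1.2, 0.8]  # stand-in for an item representation
rec_id = residual_quantize(item_embedding, codebooks)
print(rec_id)  # (1, 1)
```

In this sketch the first token picks the coarse code nearest to the item vector and the second token refines the leftover residual, so items with similar vectors tend to share leading tokens. A generative recommender of the kind the abstract describes would then predict such a tuple one token at a time.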

Authors

Han Liu
Yinwei Wei
Xuemeng Song
Weili Guan
Yuan-Fang Li
Liqiang Nie
