An Image Dataset for Benchmarking Recommender Systems with Raw Pixels (2309.06789v2)

Published 13 Sep 2023 in cs.IR

Abstract: Recommender systems (RS) have achieved significant success by leveraging explicit identification (ID) features. However, the full potential of content features, especially the pure image pixel features, remains relatively unexplored. The limited availability of large, diverse, and content-driven image recommendation datasets has hindered the use of raw images as item representations. In this regard, we present PixelRec, a massive image-centric recommendation dataset that includes approximately 200 million user-image interactions, 30 million users, and 400,000 high-quality cover images. By providing direct access to raw image pixels, PixelRec enables recommendation models to learn item representation directly from them. To demonstrate its utility, we begin by presenting the results of several classical pure ID-based baseline models, termed IDNet, trained on PixelRec. Then, to show the effectiveness of the dataset's image features, we substitute the itemID embeddings (from IDNet) with a powerful vision encoder that represents items using their raw image pixels. This new model is dubbed PixelNet. Our findings indicate that even in standard, non-cold start recommendation settings where IDNet is recognized as highly effective, PixelNet can already perform equally well or even better than IDNet. Moreover, PixelNet has several other notable advantages over IDNet, such as being more effective in cold-start and cross-domain recommendation scenarios. These results underscore the importance of visual features in PixelRec. We believe that PixelRec can serve as a critical resource and testing ground for research on recommendation models that emphasize image pixel content. The dataset, code, and leaderboard will be available at https://github.com/westlake-repl/PixelRec.

Authors (6)

Yu Cheng
Yunzhu Pan
Jiaqi Zhang
Yongxin Ni
Aixin Sun
Fajie Yuan

Summary

The paper introduces PixelRec, a novel dataset with 200M user-image interactions designed for image-based recommendation research.
It benchmarks pixel-based models (PixelNet), which replace itemID embeddings with a vision encoder, and finds them on par with or better than traditional ID-based baselines (IDNet).
PixelNet is also reported to be more effective in cold-start and cross-domain scenarios, which supports content-aware recommendation strategies.

Summary of the Paper: "An Image Dataset for Benchmarking Recommender Systems with Raw Pixels"

The paper "An Image Dataset for Benchmarking Recommender Systems with Raw Pixels" presents a new dataset, PixelRec, aimed at advancing research on image content-based recommender systems (RS). The dataset is large, comprising approximately 200 million user-image interactions, 30 million users, and 400,000 high-quality cover images. The primary objective is to explore the potential of raw pixel features for representing items, as opposed to traditional explicit identification (ID) features such as userIDs and itemIDs.
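To put the quoted scale in perspective, the short calculation below derives per-user and per-item averages and the density of the interaction matrix. It uses only the rounded figures from the abstract, so the results are rough estimates and not statistics reported by the authors; the variable names are illustrative.

```python
# Approximate figures quoted in the abstract (rounded, not exact counts).
interactions = 200_000_000
users = 30_000_000
items = 400_000

# Average number of interactions per user and per item.
avg_per_user = interactions / users
avg_per_item = interactions / items

# Fraction of the user-item matrix that is observed.
density = interactions / (users * items)

print(f"interactions per user: {avg_per_user:.1f}")
print(f"interactions per item: {avg_per_item:.0f}")
print(f"matrix density: {density:.2e}")
```

Under these rounded figures, the average user has roughly seven interactions and the average image about five hundred, with well under 0.01% of the user-item matrix observed.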

Key Contributions

Introduction of PixelRec: The paper introduces PixelRec, a large-scale image-centric dataset designed to facilitate research on image content-based recommendations by providing direct access to raw image pixels.
Benchmarking Methodology: It establishes benchmark results on PixelRec using several recommendation architectures. Classical ID-based models (IDNet) are compared with pixel-based models (PixelNet), in which the itemID embeddings are replaced by a powerful vision encoder that learns item representations directly from raw pixels.
Comparison of Different Models: The paper compares IDNet and PixelNet and highlights the settings in which visual features improve recommendation accuracy. The results suggest that PixelNet can match or exceed ID-based models even in non-cold-start settings, where ID-based models are regarded as highly effective.
Cross-Domain and Cold-Start Scenarios: The paper reports that PixelNet is more effective than IDNet in cross-domain and cold-start recommendation scenarios, since item representations derived from pixels do not depend on an item's ID or interaction history.
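The difference between the two item encoders described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the class names, the mean-pooled user representation, and the single random linear map standing in for a real vision encoder are all hypothetical, and the weights are untrained, so the scores themselves carry no meaning.

```python
import random

DIM = 4          # embedding size (illustrative)
NUM_PIXELS = 12  # length of a flattened toy "image" (illustrative)


class IDItemEncoder:
    """IDNet-style item tower: one stored vector per known itemID."""

    def __init__(self, item_ids, rng):
        self.table = {i: [rng.gauss(0, 1) for _ in range(DIM)] for i in item_ids}

    def encode(self, item_id, pixels=None):
        # Raises KeyError for items that were never seen during training.
        return self.table[item_id]


class PixelItemEncoder:
    """PixelNet-style item tower: the vector is computed from the pixels.

    A single linear map stands in for a real vision encoder (assumption).
    """

    def __init__(self, rng):
        self.weights = [[rng.gauss(0, 1) for _ in range(NUM_PIXELS)]
                        for _ in range(DIM)]

    def encode(self, item_id, pixels):
        # The itemID is ignored; only the image content is used.
        return [sum(w * p for w, p in zip(row, pixels)) for row in self.weights]


def score(encoder, history, candidate, images):
    """Dot product of the mean history vector and the candidate vector."""
    vecs = [encoder.encode(i, images[i]) for i in history]
    user = [sum(col) / len(vecs) for col in zip(*vecs)]
    cand = encoder.encode(candidate, images[candidate])
    return sum(u * c for u, c in zip(user, cand))


rng = random.Random(0)
images = {i: [rng.random() for _ in range(NUM_PIXELS)]
          for i in ["a", "b", "c", "new"]}
id_net = IDItemEncoder(["a", "b", "c"], rng)  # "new" has no ID embedding
pixel_net = PixelItemEncoder(rng)

warm_id = score(id_net, ["a", "b"], "c", images)
warm_pixel = score(pixel_net, ["a", "b"], "c", images)
cold_pixel = score(pixel_net, ["a", "b"], "new", images)

try:
    score(id_net, ["a", "b"], "new", images)
    id_handles_cold = True
except KeyError:
    id_handles_cold = False

print("ID encoder can score the unseen item:", id_handles_cold)
print("Pixel encoder score for the unseen item:", cold_pixel)
```

The only change between the two models is the item encoder; the scoring function is shared. The pixel-based encoder can produce a vector for the unseen item "new" because it needs only the image, while the ID lookup fails, which is the structural reason a pixel-based model can handle cold-start items.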

Strong Numerical Results

The paper compares IDNet and PixelNet across several settings and shows that PixelNet can outperform traditional ID-based approaches in some of them. The headline result is that PixelNet performs equally well or better than IDNet in standard, non-cold-start recommendation, and that it is more effective in cold-start and cross-domain scenarios. These results suggest that training models on raw visual features can sidestep some limitations of ID-based methods, such as popularity bias and barriers to transfer learning.

Implications and Future Developments

The implications of this research are both practical and theoretical. From a practical standpoint, PixelRec can serve as a critical testbed for developing and benchmarking models that emphasize image content in recommendations. Theoretically, the research indicates a shift towards leveraging rich modality features directly from raw data, challenging the conventional dominance of ID-based models.

Future developments in AI could see an expanding focus on modality-based recommender systems that harness content features across domains. The integration of cutting-edge vision encoders and training strategies will likely improve recommendation systems' adaptability and accuracy in more complex and diverse environments.

This paper paves the way for more nuanced and content-aware recommender systems by providing a large-scale dataset with direct access to raw image pixels, thereby encouraging the community to explore beyond traditional ID-centric approaches.
