An Evaluation of LLMs in Near Cold-Start Recommendation Environments
Recent years have seen remarkable progress in conversational recommender systems that use natural language as the medium for expressing user preferences. The paper by Sanner et al., titled "Large Language Models are Competitive Near Cold-start Recommenders for Language- and Item-based Preferences," investigates how well LLMs compete with conventional item-based collaborative filtering in scenarios where user data is sparse.
Study Overview
The paper describes a structured experiment designed to test the efficacy of LLMs in recommendation tasks where user preference data is either purely language-based or purely item-based. To this end, the authors introduce a new dataset capturing user-written language-based preference descriptions alongside item ratings. The central question is whether LLMs can deliver competitive recommendations from language-based preferences alone, especially in cold-start scenarios.
Methodology
The dataset was collected in a two-phase protocol. First, users wrote natural language descriptions of their movie preferences and dispreferences and also rated a set of specific movies. These inputs were then used to generate recommendations, both via the LLM prompting strategies described below and via traditional item-based collaborative filtering, allowing the two sources of preference data to be compared directly. This setup provided a comprehensive view of how language-centric versus item-centric data can be leveraged for recommendation.
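For concreteness, a single record in such a dataset might be modeled as in the sketch below; the field names and types are illustrative assumptions, not the paper's published schema.

```python
from dataclasses import dataclass, field

@dataclass
class UserPreferenceRecord:
    """One participant's elicited preferences (hypothetical schema)."""
    user_id: str
    liked_description: str      # free-text account of what the user enjoys
    disliked_description: str   # free-text account of what the user avoids
    item_ratings: dict[str, int] = field(default_factory=dict)  # movie title -> rating
```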
The authors explored three LLM prompting strategies: Completion, Zero-shot, and Few-shot. Each was tested with language-only, item-only, and combined preference formats, enabling a robust comparison against traditional recommendation baselines such as EASE, WR-MF, and BM25-Fusion.
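To make the three strategies concrete, the sketch below assembles each prompt style from a user's free-text preference description. The template wording and function names are assumptions for illustration; they are not the authors' verbatim prompts.

```python
def completion_prompt(description: str) -> str:
    # Completion: let the model simply continue the user's own preference text.
    return f"{description}\nBased on this, some movies I would love are:"

def zero_shot_prompt(description: str) -> str:
    # Zero-shot: an explicit instruction with no worked examples.
    return (
        "A user described their movie tastes as follows:\n"
        f"{description}\n"
        "Recommend ten movies this user would enjoy, one title per line."
    )

def few_shot_prompt(description: str,
                    examples: list[tuple[str, list[str]]]) -> str:
    # Few-shot: prepend worked (description -> recommendations) examples.
    shots = "\n\n".join(
        f"Preferences: {d}\nRecommendations: " + "; ".join(recs)
        for d, recs in examples
    )
    return f"{shots}\n\nPreferences: {description}\nRecommendations:"
```

Among the traditional baselines, EASE (Steck, 2019) is notable for its compact closed form: it learns an item-item weight matrix as the solution of a ridge regression with a zero-diagonal constraint. A minimal NumPy sketch, assuming a binary user-item interaction matrix X and an illustrative regularization strength:

```python
import numpy as np

def ease_fit(X: np.ndarray, lam: float = 500.0) -> np.ndarray:
    """Closed-form EASE: item-item weight matrix with a zero diagonal."""
    G = X.T @ X + lam * np.eye(X.shape[1])  # regularized item Gram matrix
    P = np.linalg.inv(G)
    B = -P / np.diag(P)       # element (i, j) becomes -P[i, j] / P[j, j]
    np.fill_diagonal(B, 0.0)  # an item may not explain itself
    return B

# Score all items for all users, then rank each user's unseen items:
# scores = X @ ease_fit(X)
```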
Results
The results are noteworthy: they show that LLM-based recommendations can parallel traditional item-based methods even when relying solely on natural language input. In particular, LLMs using language-based Few-shot prompts were competitive, especially on the unbiased item subset, which represents previously unseen recommendations, a critical criterion for practical systems. Language-based preferences were also far faster for users to provide than item-based ones, signaling a promising direction for streamlined, effective conversational recommender systems.
Theoretical Implications and Future Directions
The paper posits that language-based preferences can enhance the explainability and scrutability of recommendations, an area of growing interest in responsible AI design. These findings also underscore the versatility of LLMs in zero-shot and few-shot settings, showcasing their ability to adapt across domains with minimal task-specific training.
Looking ahead, these insights encourage deeper integration of LLMs into personalized recommendation engines. In particular, effort can be focused on refining prompt engineering to broaden the range of recommendation categories and user contexts served.
Conclusion
Sanner et al.'s investigation makes a compelling case for the efficacy of LLMs as near cold-start recommenders, supporting their use in settings that demand transparency and minimal initial user data. Through careful methodology and robust analysis, the paper opens avenues for new discourse in recommender system design, pointing toward an era in which language-based user profiles become central to personalized content delivery.