Collaboration of Large Language Models and Small Recommendation Models for Device-Cloud Recommendation (2501.05647v2)

Published 10 Jan 2025 in cs.IR, cs.AI, cs.CL, and cs.DC

Abstract: LLMs for Recommendation (LLM4Rec) is a promising research direction that has demonstrated exceptional performance in this field. However, its inability to capture real-time user preferences greatly limits the practical application of LLM4Rec because (i) LLMs are costly to train and infer frequently, and (ii) LLMs struggle to access real-time data (their large number of parameters poses an obstacle to deployment on devices). Fortunately, small recommendation models (SRMs) can effectively supplement these shortcomings of LLM4Rec by consuming minimal resources for frequent training and inference, and by conveniently accessing real-time data on devices. In light of this, we designed the Device-Cloud LLM-SRM Collaborative Recommendation Framework (LSC4Rec) under a device-cloud collaboration setting. LSC4Rec aims to integrate the advantages of both LLMs and SRMs, as well as the benefits of cloud and edge computing, achieving a complementary synergy. We enhance the practicability of LSC4Rec by designing three strategies: collaborative training, collaborative inference, and intelligent request. During training, the LLM generates candidate lists to enhance the ranking ability of the SRM in collaborative scenarios and enables the SRM to update adaptively to capture real-time user interests. During inference, the LLM and SRM are deployed on the cloud and on the device, respectively. The LLM generates candidate lists and initial ranking results based on user behavior, the SRM reranks the items in the candidate list, and the final results integrate both the LLM's and the SRM's scores. The device determines whether a new candidate list is needed by comparing the consistency of the LLM's and SRM's sorted lists. Our comprehensive and extensive experimental analysis validates the effectiveness of each strategy in LSC4Rec.

Summary

  • The paper presents LSC4Rec, a framework that synergizes LLMs with SRMs to address real-time recommendation challenges in device-cloud environments.
  • It details collaborative training, inference, and decision mechanisms that improve recommendation accuracy, achieving performance gains up to 16.18% on benchmark datasets.
  • The study offers actionable insights for deploying efficient hybrid recommendation systems that balance cloud resource use with dynamic user preference updates.

An Overview of Collaborative Recommendation Framework Leveraging LLMs and SRMs

The paper "Collaboration of LLMs and Small Recommendation Models for Device-Cloud Recommendation" presents a novel framework—the Device-Cloud LLM-SRM Collaborative Recommendation Framework (LSC4Rec)—that strategically combines LLMs and Small Recommendation Models (SRMs) to overcome real-time recommendation challenges in a device-cloud architecture. The authors analyze the complementary strengths of the two model classes and propose a collaboration mechanism that improves both recommendation efficacy and efficiency.

Context and Motivation

With the rapid evolution of LLMs in natural language processing, their application in recommendation systems (LLM4Rec) has shown significant performance potential. However, inherent limitations hinder LLM4Rec's ability to incorporate real-time user preferences, such as the high computational cost of frequent training and inference and the deployment constraints due to their substantial parameter size. Conversely, SRMs offer computational advantages by enabling frequent updates and consuming minimal resources, making them suitable for real-time data processing on devices. Thus, integrating LLMs with SRMs emerges as an effective strategy to harness the strengths of both models within a collaborative device-cloud recommendation system.

Methodology

The LSC4Rec framework is constructed to provide a cohesive synergy between LLMs and SRMs, capitalizing on three core strategies: collaborative training, collaborative inference, and intelligent request.

  1. Collaborative Training:
    • The framework begins with independent pre-training of the LLM and SRM on historical datasets. A subsequent cooperative training phase updates the SRM using candidate lists generated by the LLM, improving its ability to rank LLM-selected candidates. Adaptive re-training of the SRM on the device then captures real-time user interests without requiring frequent model updates on the cloud.
  2. Collaborative Inference:
    • In this strategy, the cloud-deployed LLM generates candidate item lists and initial rankings, while the on-device SRM reranks these candidates in real time. The final recommendation integrates the scores of both models, leveraging their complementary strengths.
  3. Intelligent Request:
    • This collaborative-decision strategy determines whether to request a fresh candidate list from the cloud by measuring the inconsistency between the LLM's initial ranking and the SRM's reranking. The goal is to manage cloud-resource consumption by requesting updates only when they are likely to be beneficial.
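The inference and request strategies above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's exact formulation: the fusion weight `alpha`, min-max normalization, and top-k overlap with a fixed `threshold` are all assumptions standing in for whatever scoring and consistency measures LSC4Rec actually uses.

```python
# Sketch of LSC4Rec-style collaborative inference and intelligent request.
# alpha, min-max normalization, top-k overlap, and threshold are
# illustrative assumptions, not the paper's exact formulation.

def fuse_scores(llm_scores, srm_scores, alpha=0.5):
    """Blend cloud-side LLM scores with on-device SRM scores for the
    same candidate items (higher fused score = ranked higher)."""
    def minmax(scores):
        lo, hi = min(scores.values()), max(scores.values())
        rng = (hi - lo) or 1.0  # avoid division by zero on ties
        return {item: (s - lo) / rng for item, s in scores.items()}
    l, s = minmax(llm_scores), minmax(srm_scores)
    return {item: alpha * l[item] + (1 - alpha) * s[item] for item in l}

def should_request_new_candidates(llm_ranking, srm_ranking, k=5, threshold=0.6):
    """Request a fresh candidate list from the cloud when the two
    models' top-k rankings agree on too few items."""
    overlap = len(set(llm_ranking[:k]) & set(srm_ranking[:k])) / k
    return overlap < threshold
```

In practice, the device would serve the fused ranking immediately and fire a cloud request asynchronously whenever `should_request_new_candidates` returns `True`, so that stale candidate lists are refreshed without blocking the user.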

Experimental Results

Comprehensive experiments on the Amazon Beauty, Amazon Toys, and Yelp datasets validate LSC4Rec, revealing significant performance gains over standalone LLMs or SRMs. Notably, LSC4Rec achieved average performance increases of 16.18%, 10.62%, and 9.38% on these datasets, respectively, underscoring the framework's ability to mitigate the lack of real-time data access that commonly limits LLMs.

Practical and Theoretical Implications

The findings from LSC4Rec carry both practical and theoretical implications:

  • Practical: The framework highlights an innovative approach to integrating device-based computing with cloud resources, optimizing latency and computational overhead. It opens pathways for deploying personalized recommendations that efficiently leverage LLM capabilities without overwhelming cloud resources or necessitating constant updates.
  • Theoretical: The paper provides insights into the collaborative dynamics between large-scale and lightweight models, encouraging further exploration into hybrid recommendation strategies that balance adaptability and computational efficiency.

Future Directions

The paper suggests potential directions for extending LSC4Rec, such as optimizing the balance between candidate refresh rates and resource consumption, exploring other model architectures for SRMs, and adapting the framework to diverse cloud environments. These enhancements could further refine the framework's application across various recommendation domains, potentially enriching methodologies in AI-driven user interaction analytics.

In essence, the LSC4Rec framework positions itself as a pragmatic advancement in the recommendation landscape, setting a precedent for augmenting classic recommendation approaches with contemporary LLM techniques.
