
M6-Rec: Generative Pretrained Language Models are Open-Ended Recommender Systems (2205.08084v2)

Published 17 May 2022 in cs.IR

Abstract: Industrial recommender systems have been growing increasingly complex, may involve \emph{diverse domains} such as e-commerce products and user-generated contents, and can comprise \emph{a myriad of tasks} such as retrieval, ranking, explanation generation, and even AI-assisted content production. The mainstream approach so far is to develop individual algorithms for each domain and each task. In this paper, we explore the possibility of developing a unified foundation model to support \emph{open-ended domains and tasks} in an industrial recommender system, which may reduce the demand on downstream settings' data and can minimize the carbon footprint by avoiding training a separate model from scratch for every task. Deriving a unified foundation is challenging due to (i) the potentially unlimited set of downstream domains and tasks, and (ii) the real-world systems' emphasis on computational efficiency. We thus build our foundation upon M6, an existing large-scale industrial pretrained LLM similar to GPT-3 and T5, and leverage M6's pretrained ability for sample-efficient downstream adaptation, by representing user behavior data as plain texts and converting the tasks to either language understanding or generation. To deal with a tight hardware budget, we propose an improved version of prompt tuning that outperforms fine-tuning with negligible 1\% task-specific parameters, and employ techniques such as late interaction, early exiting, parameter sharing, and pruning to further reduce the inference time and the model size. We demonstrate the foundation model's versatility on a wide range of tasks such as retrieval, ranking, zero-shot recommendation, explanation generation, personalized content creation, and conversational recommendation, and manage to deploy it on both cloud servers and mobile devices.

Citations (162)

Summary

  • The paper presents a unified M6-Rec model that transforms recommendation tasks into language processing problems using plain-text user behavior representations.
  • The paper innovates efficient tuning techniques, including option and adapter tuning, to achieve competitive performance with minimal task-specific parameters.
  • The paper demonstrates that M6-Rec can be deployed on both cloud and mobile devices with reduced inference time for real-time, open-ended recommendation tasks.

Overview of M6-Rec Foundation Model in Recommender Systems

In the field of industrial recommender systems, where complexity and diversity are prevalent, the paper "M6-Rec: Generative Pretrained Language Models are Open-Ended Recommender Systems" by Cui et al. presents notable advancements. The paper advocates a unified foundation model to address the myriad tasks within these systems, ranging from content retrieval and ranking to explanation generation and personalized content creation. The central proposal is M6-Rec, a generative pretrained language model capable of operating across open-ended domains without extensive downstream data or carbon-intensive per-task training.
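The core idea is that user behavior logs become ordinary text that a pretrained language model can consume. A minimal sketch of such a serialization follows; the field names and prompt template are illustrative assumptions, not the paper's exact format.

```python
# Hypothetical sketch: serializing a user's profile and click history into a
# plain-text prompt, so recommendation becomes a language understanding task.
# The template below is illustrative, not the paper's actual prompt format.

def behaviors_to_prompt(user_profile: dict, clicks: list, candidate: str) -> str:
    """Render a user's profile and click history as a natural-language prompt."""
    profile = f"A {user_profile['age']}-year-old {user_profile['gender']} user"
    history = ", ".join(f"{c['title']} ({c['category']})" for c in clicks)
    return (
        f"{profile} recently clicked: {history}. "
        f"Predict whether the user will click: {candidate}."
    )

prompt = behaviors_to_prompt(
    {"age": 30, "gender": "female"},
    [{"title": "wireless earbuds", "category": "electronics"},
     {"title": "yoga mat", "category": "sports"}],
    candidate="running shoes",
)
```

The resulting string can be fed to the language model for either scoring (understanding) or continuation (generation), which is what lets one backbone serve many tasks.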

Key Contributions

  1. Unified Model Architecture: The authors introduce M6-Rec, building on M6, a large-scale industrial pretrained language model. Support for diverse tasks is enabled by representing user behavior as plain text and then framing each task as either language understanding or language generation. This approach provides versatility, particularly in zero-shot recommendation scenarios.
  2. Efficient Tuning Techniques: Facing tight computational constraints, the paper proposes option tuning, a derivative of prompt tuning that outperforms fine-tuning while training only about 1% task-specific parameters. In empirical tests, option tuning combined with adapter tuning achieved competitive, and sometimes superior, results compared to traditional fine-tuning.
  3. Enhancements for Deployment: M6-Rec is tailored for deployment in diverse environments, including both cloud servers and mobile devices. Late interaction and multi-segment processing reduce inference time markedly, making it feasible to integrate M6-Rec into real-time recommendation pipelines.
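The "1% task-specific parameters" claim is easy to see with back-of-the-envelope arithmetic: the backbone is frozen and only a handful of virtual-token embeddings are trained. The sketch below uses a rough, illustrative transformer parameter count, not M6's actual architecture.

```python
# Schematic sketch (not the authors' code) of why prompt/option tuning is cheap:
# the backbone language model is frozen; only prompt embeddings are trainable.
# The parameter-count formula and sizes below are rough illustrations.

def tunable_fraction(vocab_size: int, hidden: int, n_layers: int,
                     n_prompt_tokens: int) -> float:
    """Fraction of parameters that are trainable under prompt-style tuning."""
    # Crude transformer estimate: token embeddings + ~12*hidden^2 per layer.
    backbone = vocab_size * hidden + n_layers * 12 * hidden * hidden  # frozen
    prompt = n_prompt_tokens * hidden                                 # trainable
    return prompt / (backbone + prompt)

frac = tunable_fraction(vocab_size=50_000, hidden=1024,
                        n_layers=24, n_prompt_tokens=100)
# Well under 1% of total parameters are task-specific at this scale.
```

Even with generous prompt lengths, the trainable share stays far below the ~1% budget the paper reports, which is what makes hosting many tasks on one frozen backbone practical.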

Results and Implications

The experiments show that M6-Rec competes effectively with conventional methods on click-through rate (CTR) prediction, retrieval, personalized query generation, and explanation generation. Its zero-shot capabilities are particularly notable, offering industrial applications potential cost and resource savings. The paper further demonstrates that M6-Rec can be scaled down substantially via distillation and model compression to run on resource-constrained edge devices without significant loss of efficacy.
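Part of the serving-time efficiency comes from late interaction, mentioned under Key Contributions: user text and item text are encoded separately, item representations are precomputed offline and cached, and only a cheap combination runs online. A toy sketch follows; the `encode` function is a deterministic stand-in for the pretrained encoder, not the paper's model.

```python
# Hypothetical sketch of late interaction for efficient serving.
# Item vectors are computed once offline; at request time only the user text
# is encoded, then combined with cached item vectors via a cheap dot product.

import hashlib

def encode(text: str, dim: int = 8) -> list:
    """Toy deterministic embedding standing in for a transformer encoder."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

# Offline: cache item representations once.
item_cache = {item: encode(item)
              for item in ["running shoes", "yoga mat", "earbuds"]}

def score(user_text: str, item: str) -> float:
    """Online: encode only the user text; reuse the cached item vector."""
    u = encode(user_text)
    v = item_cache[item]
    return sum(a * b for a, b in zip(u, v))

ranked = sorted(item_cache,
                key=lambda it: score("user clicked sports gear", it),
                reverse=True)
```

Because the expensive encoder runs once per item offline, online latency scales with a single user-side encoding plus dot products, which is what makes real-time and on-device deployment feasible.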

The paper provides empirical evidence supporting M6-Rec's robustness and flexibility, and highlights opportunities for future research in extending the model to multimodal settings, which could broaden its applicability to data types such as images and video.

Future Directions

The authors acknowledge that while M6-Rec's current applications show promise, broader exploration of multimodal applications could unlock the full potential of foundation models. Future research might investigate integrating image and text modalities seamlessly within M6-Rec's framework, addressing the inherent challenges and exploring the benefits such integration might confer on next-generation recommender systems.

In conclusion, M6-Rec offers a compelling approach to unifying various aspects of industrial recommender systems under a single adaptable model, setting the stage for future innovations across open-domain and multi-task recommendations. The technical advancements presented in this paper, including efficient tuning and deployment strategies, open avenues for more sustainable and versatile recommendation systems in commercial and research domains alike.