
360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation (2501.16450v3)

Published 27 Jan 2025 in cs.IR and cs.AI

Abstract: Ranking and recommendation systems are the foundation for numerous online experiences, ranging from search results to personalized content delivery. These systems have evolved into complex, multilayered architectures that leverage vast datasets and often incorporate thousands of predictive models. The maintenance and enhancement of these models is a labor-intensive process that requires extensive feature engineering. This approach not only exacerbates technical debt but also hampers innovation in extending these systems to emerging problem domains. In this report, we present our research to address these challenges by utilizing a large foundation model with a textual interface for ranking and recommendation tasks. We illustrate several key advantages of our approach: (1) a single model can manage multiple predictive tasks involved in ranking and recommendation, (2) decoder models with textual interface, due to their comprehension and reasoning capabilities, can generalize to new recommendation surfaces and out-of-domain problems, and (3) by employing natural language interfaces for task definitions and verbalizing member behaviors and their social connections, we eliminate the need for feature engineering and the maintenance of complex directed acyclic graphs of model dependencies. We introduce our research pre-production model, 360Brew V1.0, a 150B parameter, decoder-only model that has been trained and fine-tuned on LinkedIn's data and tasks. This model is capable of solving over 30 predictive tasks across various segments of the LinkedIn platform, achieving performance levels comparable to or exceeding those of current production systems based on offline metrics, without task-specific fine-tuning. Notably, each of these tasks is conventionally addressed by dedicated models that have been developed and maintained over multiple years by teams of a similar or larger size than our own.

Summary

  • The paper introduces a unified decoder-only foundation model that efficiently handles over 30 predictive tasks for personalized ranking and recommendation.
  • It leverages natural language interfaces to replace traditional feature engineering, enabling robust zero-shot transfer and simplified scalability.
  • Experimental results show competitive precision and recall on both in-domain and out-of-domain tasks, while the unified design reduces technical debt in recommendation systems.

A Decoder-only Foundation Model for Personalized Ranking and Recommendation

The paper "A Decoder-only Foundation Model for Personalized Ranking and Recommendation" by the Foundation AI Technologies (FAIT) team at LinkedIn presents a novel approach to addressing the complexities inherent in ranking and recommendation systems. The paper introduces a 150B parameter decoder-only model, specifically designed to efficiently manage multiple predictive tasks within LinkedIn's ecosystem. This model, referred to as version 1.0, highlights key shifts from traditional ID-based methodologies, focusing instead on leveraging LLMs to enhance generalizability and ease of iteration through centralized prompt engineering.

Ranking and recommendation systems are pivotal to the user experience on many online platforms, including LinkedIn. Traditionally, these systems require extensive feature engineering and face significant hurdles in adapting to domain shifts, such as cold-start problems. The research seeks to alleviate these issues with a decoder-only model fine-tuned on LinkedIn's first-party data. This model handles over 30 distinct predictive tasks, matching or exceeding the performance of current production systems without task-specific tuning. Conventional systems comprise numerous dedicated models, each requiring substantial development and maintenance resources, a challenge this paper's approach ambitiously addresses.

Key Contributions

  • Unified Model for Multiple Tasks: The proposed model can handle a variety of predictive tasks across different segments of LinkedIn's services. It streamlines the process by replacing multitudes of specialized models with a single foundation model, reducing technical debt associated with maintenance.
  • Utilization of Textual Interfaces: By using natural language interfaces for task definitions and verbalizing member behaviors as text, the model replaces traditional feature engineering. Its comprehension and reasoning capabilities allow zero-shot transfer to new domains with minimal prompting adjustments; a prompt-construction sketch follows this list.
  • Performance and Scalability: Focusing on recommendation and re-ranking tasks, the paper reports strong precision and recall, especially on new and out-of-domain tasks, indicating robust zero-shot learning capabilities.
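To make the verbalization idea concrete, the sketch below shows how a task definition, a member's interaction history, and a candidate item might be rendered into a single prompt for a decoder-only model. All field names, the `Interaction` structure, and the template wording are illustrative assumptions; the paper does not publish its exact prompt format.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    # Hypothetical fields; the paper does not publish its actual schema.
    item_title: str
    action: str       # e.g. "applied", "clicked", "dismissed"
    timestamp: str

def verbalize_member(profile: str, history: list[Interaction]) -> str:
    """Render a member's profile and past behavior as plain text."""
    lines = [f"Member profile: {profile}", "Recent activity:"]
    for it in history:
        lines.append(f"- {it.timestamp}: {it.action} '{it.item_title}'")
    return "\n".join(lines)

def build_prompt(task_definition: str, member_text: str, candidate: str) -> str:
    """Concatenate task definition, member context, and candidate item
    into one prompt for a decoder-only model."""
    return (
        f"{task_definition}\n\n"
        f"{member_text}\n\n"
        f"Candidate item: {candidate}\n"
        "Question: Will the member engage with this item? Answer Yes or No."
    )

history = [Interaction("Senior ML Engineer at Acme", "applied", "2025-01-10")]
prompt = build_prompt(
    "Predict whether the member will apply to the job below.",
    verbalize_member("Data scientist, 5 years of experience", history),
    "Machine Learning Engineer at Example Corp",
)
print(prompt)
```

The design point is that everything a conventional system would encode as engineered features arrives as plain text, so supporting a new task means writing a new task definition rather than building and maintaining a new feature pipeline.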

Experimental Setup and Results

The model demonstrates strong results on both in-domain and out-of-domain tasks, categorized as T1 and T2 in the paper. For in-domain (T1) tasks, training uses data from earlier interactions, so evaluation probes the model's robustness to the distribution shifts common in recommendation contexts. Out-of-domain (T2) tasks test the model on entirely new domains, where it shows competitive or superior performance relative to established systems.
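Because every task is posed as text, a single scoring routine can serve all of them. A common pattern, sketched below with a Hugging Face causal LM, is to compare the next-token probabilities of "Yes" and "No" after a prompt like the one constructed above; the model name and the binary Yes/No framing are assumptions for illustration, not the paper's actual scoring procedure.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any instruction-tuned causal LM works for the pattern; this name is a placeholder.
model_name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

def engagement_score(prompt: str) -> float:
    """Return P("Yes") / (P("Yes") + P("No")) for the token following the prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token logits
    yes_id = tokenizer.encode(" Yes", add_special_tokens=False)[0]
    no_id = tokenizer.encode(" No", add_special_tokens=False)[0]
    probs = torch.softmax(logits[[yes_id, no_id]], dim=-1)
    return probs[0].item()
```

A scalar score of this kind can then be thresholded for classification metrics or used directly to order candidates in a ranking task.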

The research emphasizes scalability along both model and data axes, showing that performance improves with additional training on diverse datasets. It also reports gains in user cold-start scenarios, where few historical interactions are available, indicating that the model handles new member profiles more effectively than baseline systems.
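One way to reproduce the cold-start analysis described above is to bucket evaluation members by interaction-history length and compute a ranking metric per bucket. The sketch below uses AUC; the bucket boundaries and the data layout are assumptions for illustration, not taken from the paper.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_by_history_length(history_lens, labels, scores,
                          buckets=((0, 5), (5, 20), (20, 10**9))):
    """Compute AUC separately for cold, warm, and heavy users.
    Bucket boundaries are illustrative, not from the paper."""
    history_lens = np.asarray(history_lens)
    labels, scores = np.asarray(labels), np.asarray(scores)
    results = {}
    for lo, hi in buckets:
        mask = (history_lens >= lo) & (history_lens < hi)
        # AUC is only defined when both classes appear in the bucket.
        if mask.sum() and len(set(labels[mask])) == 2:
            results[f"[{lo},{hi})"] = roc_auc_score(labels[mask], scores[mask])
    return results
```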

Future Outlook

This foundation model represents a significant step towards simplifying the architecture of recommendation systems while increasing their adaptability and reducing the need for exhaustive manual feature engineering. The potential for continued data and model scaling suggests future iterations could further enhance performance through increased parameter efficiency and broader data integration.

Practical Implications: For platform developers and data scientists, the implications are substantial. The model's flexibility and centralized structure enable more agile development cycles, streamlined maintenance, and a more effective response to rapidly changing user data and preferences.

Theoretical Implications: From a theoretical standpoint, this paper contributes to the ongoing discourse on the applicability of LLMs beyond their traditional NLP roles, underscoring their utility in various non-textual and interaction-based applications such as recommendation systems.

In conclusion, the proposed decoder-only foundation model for LinkedIn's personalized ranking and recommendation tasks represents a decisive shift towards more scalable and efficient AI-driven systems, with implications that extend beyond LinkedIn to similar platforms seeking more versatile and maintainable recommendation architectures.