
TransAct: Transformer-based Realtime User Action Model for Recommendation at Pinterest (2306.00248v1)

Published 31 May 2023 in cs.IR and cs.AI

Abstract: Sequential models that encode user activity for next action prediction have become a popular design choice for building web-scale personalized recommendation systems. Traditional methods of sequential recommendation either utilize end-to-end learning on realtime user actions, or learn user representations separately in an offline batch-generated manner. This paper (1) presents Pinterest's ranking architecture for Homefeed, our personalized recommendation product and the largest engagement surface; (2) proposes TransAct, a sequential model that extracts users' short-term preferences from their realtime activities; (3) describes our hybrid approach to ranking, which combines end-to-end sequential modeling via TransAct with batch-generated user embeddings. The hybrid approach allows us to combine the advantages of responsiveness from learning directly on realtime user activity with the cost-effectiveness of batch user representations learned over a longer time period. We describe the results of ablation studies, the challenges we faced during productionization, and the outcome of an online A/B experiment, which validates the effectiveness of our hybrid ranking model. We further demonstrate the effectiveness of TransAct on other surfaces such as contextual recommendations and search. Our model has been deployed to production in Homefeed, Related Pins, Notifications, and Search at Pinterest.

Summary

  • The paper introduces a hybrid model that combines realtime user actions with batch embeddings to capture both short- and long-term user preferences.
  • The paper employs a transformer architecture with GPU serving optimizations like CUDA kernel fusion to achieve efficient, web-scale recommendation deployment.
  • The paper validates TransAct through ablation studies and A/B tests, demonstrating notable improvements in user engagement and recommendation relevance.

TransAct: Leveraging Realtime User Actions for Enhanced Recommendations at Pinterest

Introduction

TransAct is a transformer-based model that Pinterest added to its recommendation system to make use of realtime user actions. It encodes a user's most recent actions to capture the immediate preferences those actions reveal, and the ranking model pairs this end-to-end sequential modeling with batch-generated user embeddings. The resulting hybrid keeps recommendations responsive to recent activity while remaining practical for web-scale personalized recommendation.

The TransAct Model

TransAct processes a user's sequence of actions in realtime and extracts the user's short-term preferences from it, using a transformer-based architecture to model the user's recent interactions. The ranking model combines this realtime signal with batch-generated user representations, so it covers both immediate and long-term interests. The hybrid draws on the responsiveness of learning directly from realtime activity and the cost-effectiveness of batch representations learned over a longer period, with each offsetting the other's limitation.
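To make the idea of a transformer reading a user's recent actions more concrete, here is a minimal sketch of scaled dot-product self-attention over a short action sequence. It is not Pinterest's implementation: the 4-dimensional embeddings, the function names, and the use of identity projections for queries, keys, and values are all assumptions chosen to keep the example small.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(actions):
    """Scaled dot-product attention weights between every pair of actions."""
    scale = math.sqrt(len(actions[0]))
    weights = []
    for query in actions:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in actions]
        weights.append(softmax(scores))
    return weights

def self_attention(actions):
    """Mix each action embedding with the others according to the weights.

    Assumption: queries, keys, and values are the raw embeddings. A real
    transformer layer learns separate projection matrices for each.
    """
    weights = attention_weights(actions)
    dim = len(actions[0])
    return [
        [sum(w * value[d] for w, value in zip(row, actions)) for d in range(dim)]
        for row in weights
    ]

# Hypothetical embeddings for three recent actions, oldest first.
recent_actions = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
encoded = self_attention(recent_actions)
```

Each output row is a weighted blend of all actions in the sequence, so similar actions (the first two here) reinforce each other while a dissimilar one contributes less.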

Realtime and Batch User Representation

The design combines realtime user action sequences with pre-computed batch user embeddings. A transformer encodes the sequence of a user's recent actions to extract immediate interests, which lets recommendations adjust to the latest interactions. In parallel, batch user representations produced by a separate model such as PinnerFormer summarize long-term preferences from extensive historical data and anchor the personalization.
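One simple way to picture the combination is as feature concatenation: a pooled summary of the encoded realtime actions is placed next to the batch embedding and handed to the ranking model. The sketch below assumes exactly that. The max pooling step, the function names, and the toy vectors are illustrative choices, not details confirmed by the summary above.

```python
def max_pool(vectors):
    """Element-wise maximum over a list of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]

def hybrid_user_features(encoded_actions, batch_embedding):
    """Concatenate a short-term summary with a long-term batch embedding.

    encoded_actions: per-action vectors from the realtime sequence encoder.
    batch_embedding: a pre-computed user vector (for example, produced offline).
    """
    short_term = max_pool(encoded_actions)
    return short_term + list(batch_embedding)

# Hypothetical values: three encoded actions and one batch user embedding.
encoded_actions = [[0.2, 0.7], [0.9, 0.1], [0.4, 0.3]]
batch_embedding = [0.05, -0.3, 0.6]
features = hybrid_user_features(encoded_actions, batch_embedding)
```

The realtime half of the vector changes with every new action, while the batch half only changes when the offline job refreshes it, which is what gives the hybrid both responsiveness and low serving cost.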

Optimization for Web-scale Deployment

Serving a transformer-based model in production requires low latency and high efficiency. Pinterest addressed the added computational cost by adopting GPU serving together with optimizations such as CUDA kernel fusion and efficient memory management. With these measures the model could be deployed for realtime recommendation without degrading serving performance or user experience.

Empirical Validation and Insights

TransAct was evaluated through ablation studies, offline experiments, and online A/B testing. It showed improvements across several metrics, including user engagement and recommendation relevance. The gains were notable for non-core users, who are harder to serve accurately because they interact only sporadically.

Challenges and Solutions in Production

Productionizing TransAct surfaced two challenges: retraining frequency and recommendation diversity. Engagement was observed to decay over time, which points to the need for frequent model updates as user interaction patterns change. An initial drop in recommendation diversity was countered by applying random time window masking during training, which diversified user feeds without sacrificing relevance.
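Random time window masking can be sketched as follows: for each training example, sample a window length, then hide every action that falls inside that window immediately before the request. This is a sketch under stated assumptions. The 24-hour upper bound, the timestamps, and the function name are illustrative and are not taken from the summary above.

```python
import random

def random_time_window_mask(action_times, request_time, max_window_seconds, rng):
    """Sample a window and flag which actions stay visible during training.

    Returns (window, keep), where keep[i] is False for any action that
    happened within `window` seconds before `request_time`.
    """
    window = rng.uniform(0, max_window_seconds)
    cutoff = request_time - window
    keep = [t <= cutoff for t in action_times]
    return window, keep

# Hypothetical timestamps in seconds, oldest first.
action_times = [10_000, 60_000, 95_000, 99_500]
request_time = 100_000
max_window_seconds = 24 * 3600  # assumed upper bound for the sampled window

rng = random.Random(0)  # seeded only so the sketch is repeatable
window, keep = random_time_window_mask(
    action_times, request_time, max_window_seconds, rng
)
visible_actions = [t for t, k in zip(action_times, keep) if k]
```

Because the most recent actions are sometimes hidden, the model cannot rely only on what the user did seconds ago, which is one plausible reason the technique helps feed diversity.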

Future Directions and Implications

The deployment of TransAct improves Pinterest's recommendation stack and offers a reference point for other personalized recommendation systems. It shows the practical benefit of combining realtime and batch user modeling, and it suggests further work on handling web-scale recommendation efficiently. The lessons from its implementation can inform more responsive recommender systems on other platforms.

Conclusion

TransAct addresses two demands that large recommendation platforms face together: scale and up-to-the-minute personalization. Its hybrid model incorporates realtime user actions into the recommendation process while relying on batch embeddings for long-term preferences, improving the responsiveness and relevance of personalized content delivery.