Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks (2403.05185v1)

Published 8 Mar 2024 in cs.IR and cs.LG

Abstract: In the ever-evolving digital audio landscape, Spotify, well-known for its music and talk content, has recently introduced audiobooks to its vast user base. While promising, this move presents significant challenges for personalized recommendations. Unlike music and podcasts, audiobooks, initially available for a fee, cannot be easily skimmed before purchase, posing higher stakes for the relevance of recommendations. Furthermore, introducing a new content type into an existing platform confronts extreme data sparsity, as most users are unfamiliar with this new content type. Lastly, recommending content to millions of users requires the model to react fast and be scalable. To address these challenges, we leverage podcast and music user preferences and introduce 2T-HGNN, a scalable recommendation system comprising Heterogeneous Graph Neural Networks (HGNNs) and a Two Tower (2T) model. This novel approach uncovers nuanced item relationships while ensuring low latency and complexity. We decouple users from the HGNN graph and propose an innovative multi-link neighbor sampler. These choices, together with the 2T component, significantly reduce the complexity of the HGNN model. Empirical evaluations involving millions of users show significant improvement in the quality of personalized recommendations, resulting in a +46% increase in new audiobooks start rate and a +23% boost in streaming rates. Intriguingly, our model's impact extends beyond audiobooks, benefiting established products like podcasts.

Authors

Marco De Nadai
Francesco Fabbri
Paul Gigioli
Alice Wang
Ang Li
Fabrizio Silvestri
Laura Kim
Shawn Lin
Vladan Radosavljevic
Sandeep Ghael
David Nyhan
Hugues Bouchard
Mounia Lalmas-Roelleke
Andreas Damianou

Summary

Unveiling Spotify's Approach to Personalized Audiobook Recommendations with Graph Neural Networks

Introduction

Spotify, long known for its music and podcast catalog, has recently added audiobooks to its platform. The addition is promising, but it makes personalized recommendation harder. To address this, the authors combine a Heterogeneous Graph Neural Network (HGNN) with a Two-Tower (2T) model and report a significant improvement in audiobook recommendation quality. This post covers the architecture, findings, and implications of that system.

Challenges in Audiobook Recommendations

The introduction of audiobooks to an established platform like Spotify entails unique challenges:

Data Sparsity: The novelty of audiobooks on Spotify means limited user interaction data, making it harder to generate relevant recommendations.
High Stakes for Recommendations: Given that audiobooks initially required purchase before listening, there's a greater emphasis on the accuracy and relevance of recommendations.
Need for Scalability: The system must efficiently cater to Spotify’s vast user base without compromising on latency.

Solution Overview: 2T-HGNN

To navigate these challenges, the authors introduce 2T-HGNN (Two-Tower Heterogeneous Graph Neural Network). The model uses existing user preferences for podcasts and music to improve audiobook recommendations. It has two components:

A Heterogeneous Graph Neural Network (HGNN) that learns nuanced item relationships from user interactions with music and podcasts.
A Two Tower (2T) model that generates user and item embeddings, keeping recommendations scalable with low latency and reduced complexity.
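
To make the graph side of this split concrete, here is a minimal sketch. It assumes a toy co-listening graph in which two items are linked whenever one user has consumed both, so users are used only to derive edges and never appear as nodes. The edge rule, the function names (`build_co_listening_edges`, `mean_aggregate`), the 0.5 mixing weight, and all data are illustrative assumptions, and a single untrained mean-aggregation step stands in for a learned HGNN layer.

```python
from collections import defaultdict
from itertools import combinations


def build_co_listening_edges(histories):
    """Connect two items whenever one user consumed both (assumed edge rule).

    Users are used only to derive edges; they never become graph nodes.
    """
    neighbors = defaultdict(set)
    for items in histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def mean_aggregate(features, neighbors, self_weight=0.5):
    """One untrained message-passing step: mix each item with its neighbor mean."""
    updated = {}
    for item, vec in features.items():
        nbrs = sorted(neighbors.get(item, ()))
        if not nbrs:
            # Isolated items keep their own features.
            updated[item] = list(vec)
            continue
        mean = [sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(len(vec))]
        updated[item] = [self_weight * v + (1 - self_weight) * m for v, m in zip(vec, mean)]
    return updated


# Hypothetical listening histories; item ids are prefixed by content type.
histories = {
    "user_1": ["podcast:history", "audiobook:rome"],
    "user_2": ["podcast:history", "podcast:science"],
    "user_3": ["audiobook:rome", "podcast:history"],
}
# Hypothetical 2-d content features per item.
features = {
    "podcast:history": [1.0, 0.0],
    "podcast:science": [0.0, 1.0],
    "audiobook:rome": [0.0, 0.0],
}

neighbors = build_co_listening_edges(histories)
embeddings = mean_aggregate(features, neighbors)
print(embeddings["audiobook:rome"])
```

In this toy setup the audiobook starts with an uninformative feature vector and picks up signal from the podcast it is co-listened with, which is the intuition behind using established content types to help a sparse new one.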

The crux of the system is an architecture that separates content relationships from user-item interactions. Users are decoupled from the HGNN graph, which, together with a multi-link neighbor sampler and the 2T component, significantly reduces the complexity of the HGNN model.
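
The serving side of this separation can be sketched as follows. Item vectors are assumed to be precomputed offline (for instance by a graph model), and a hypothetical user tower simply averages the vectors of items the user has already consumed; real towers would be learned networks with richer features. The names `user_tower` and `rank_items` and every vector below are illustrative assumptions, not Spotify's implementation.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)


def user_tower(history, item_vectors):
    """Hypothetical user tower: normalized mean of the consumed items' vectors."""
    vecs = [item_vectors[i] for i in history if i in item_vectors]
    if not vecs:
        raise ValueError("user has no known items")
    dim = len(vecs[0])
    return normalize([sum(v[d] for v in vecs) / len(vecs) for d in range(dim)])


def rank_items(user_vec, candidates, item_vectors):
    """Score candidates by dot product with the user vector, best first."""
    scored = [
        (sum(u * x for u, x in zip(user_vec, normalize(item_vectors[c]))), c)
        for c in candidates
    ]
    return [c for _, c in sorted(scored, reverse=True)]


# Assumed precomputed item vectors (e.g. produced offline by a graph model).
item_vectors = {
    "podcast:history": [0.9, 0.1],
    "track:baroque": [0.8, 0.2],
    "audiobook:rome": [0.7, 0.3],
    "audiobook:cooking": [0.1, 0.9],
}

# The user has only music and podcast history, no audiobook interactions.
user_vec = user_tower(["podcast:history", "track:baroque"], item_vectors)
ranking = rank_items(user_vec, ["audiobook:cooking", "audiobook:rome"], item_vectors)
print(ranking)
```

Because the user vector and the item vectors are computed independently, scoring at request time reduces to dot products against stored item vectors, which is what keeps a two-tower design cheap to serve.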

Empirical Evaluations

In empirical evaluations involving millions of users, the system produced a +46% increase in the new audiobook start rate and a +23% boost in streaming rates. These figures support the effectiveness of the 2T-HGNN model and highlight the potential of cross-content-type interactions (e.g., between podcasts and audiobooks) for recommendations. The abstract also notes that the model's impact extends beyond audiobooks, benefiting established products such as podcasts.
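
Percentages such as +46% are most naturally read as relative lifts: the treatment metric divided by the control metric, minus one. The sketch below shows that arithmetic with made-up counts chosen only so the result is easy to check. The underlying counts are not given in this summary, and `relative_lift` is a hypothetical helper.

```python
def relative_lift(control_rate, treatment_rate):
    """Relative change of treatment over control, e.g. 0.46 means +46%."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return treatment_rate / control_rate - 1.0


# Made-up A/B counts, NOT Spotify's data: users who started a new audiobook.
control_starts, control_users = 500, 100_000
treatment_starts, treatment_users = 730, 100_000

control_rate = control_starts / control_users
treatment_rate = treatment_starts / treatment_users
lift = relative_lift(control_rate, treatment_rate)
print(f"{lift:+.0%}")
```

A relative lift says nothing about the absolute rates behind it: in this invented example the start rate moves from 0.5% to 0.73% of users.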

Implications and Future Directions

This research has implications for digital content recommendation more broadly. The success of the 2T-HGNN model underscores the significance of:

Cross-Content Learning: Utilizing user preferences across different content types can significantly improve recommendation systems.
Scalable AI Models: Models must be not only accurate but also scalable and efficient in handling vast amounts of data and users.
Graph Neural Networks: The potential of GNNs, especially HGNNs, in unraveling complex item relationships and enhancing recommendation quality.

Moreover, the architectural novelty of decoupling content relationships from user-item interactions offers a blueprint for future recommendation systems across various digital platforms.

Concluding Thoughts

The 2T-HGNN model is a notable step toward personalized recommendations for newly introduced content types. By leveraging the connections between different types of content and addressing the specific challenges of audiobook recommendations, Spotify shows how an established platform can personalize a sparse new catalog at scale. The insights from this research may inform future recommendation systems across other digital services.