- The paper reframes cold-start recommendation as a semi-cold problem by directly using artist-level collaborative signals to improve prediction accuracy.
- ACARec employs a two-stage attention mechanism that fuses content and collaborative embeddings, achieving up to 34.8% higher Discovery NDCG@20 than baselines (on Yambda-50m).
- Model ablation confirms that integrating artist catalog context with GRU-based fusion is critical for robust, scalable performance across diverse datasets.
Leveraging Artist Catalogs for Cold-Start Music Recommendation: An Expert Review
Problem Context and Motivation
The item cold-start challenge in music recommendation systems (MRSs) is the task of predicting user preferences for newly released tracks that have no interaction data, the signal on which collaborative filtering (CF) models depend. Traditionally, content-based approaches have used audio, textual, or metadata features to generate embeddings for cold tracks and map them into the latent CF space [van2013deep]. However, most prior work has treated artist information only as an auxiliary feature and has not directly exploited the artist-track hierarchy present in most catalogs. Notably, the vast majority of new tracks come from artists already seen in the training data. This situation is better described as "semi-cold": artist-level collaborative signals are available even when track-level interactions are not.
Artist-Aware Framing and Model Architecture
This paper reframes cold-start track recommendation as a "semi-cold" problem by leveraging rich collaborative information at the artist level. The authors introduce ACARec (Artist Catalog Attention Recommender), an attention-based neural architecture that synthesizes collaborative embeddings for cold tracks by attending over the artist’s existing catalog. The intuition is that listener response to an artist’s prior work carries significant predictive signal for that artist’s future releases. ACARec is designed to answer a central question: within an artist’s catalog, which tracks provide the most informative context for modeling the expected collaborative behavior of a new release?
The model takes as input a cold track’s audio embedding together with the content and collaborative embeddings of all of the artist’s previous (hot) tracks. A two-stage attention mechanism then produces a content-conditioned summary of the artist’s collaborative signatures: self-attention over the catalog first builds context-aware representations, and cross-attention conditioned on the target track’s audio embedding then pools them. This summary is fused via a Gated Recurrent Unit (GRU) with the mean embedding of the artist’s catalog in the collaborative space, which lets the prediction deviate adaptively from the artist prototype based on content (see Ablation Analysis).
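The data flow described above can be sketched in plain Python. This is an illustration under strong assumptions, not the authors' implementation: it uses a single attention head with no learned projections, treats audio embeddings as queries and keys and collaborative embeddings as values, and replaces the learned GRU gate with a fixed scalar `gate`. All function and parameter names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return pooled, weights


def synthesize_cold_embedding(cold_audio, catalog_audio, catalog_cf, gate=0.5):
    """Return (fused collaborative embedding, cross-attention weights)."""
    # Stage 1 (assumed form): self-attention over the catalog. Each hot track
    # attends to every catalog track by audio similarity and pools their
    # collaborative embeddings into a context-aware representation.
    context_cf = [attend(q, catalog_audio, catalog_cf)[0] for q in catalog_audio]

    # Stage 2 (assumed form): cross-attention. The cold track's audio embedding
    # is the query; the output is a content-conditioned summary of the catalog.
    summary, weights = attend(cold_audio, catalog_audio, context_cf)

    # Artist prototype: plain mean of the catalog in collaborative space.
    n, dim = len(catalog_cf), len(catalog_cf[0])
    artist_mean = [sum(v[d] for v in catalog_cf) / n for d in range(dim)]

    # Fusion: a fixed scalar gate stands in for the learned GRU gate.
    fused = [(1.0 - gate) * m + gate * s for m, s in zip(artist_mean, summary)]
    return fused, weights
```

With `gate=0` the output collapses to the artist mean, and with `gate=1` it is the content-conditioned summary alone. A learned gate would choose a value between these per track (and, in a real GRU, per dimension).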
Empirical Evaluation and Results
The experimental design focuses on five key research questions (RQ1–RQ5), assessing gains from artist-aware modeling, comparative performance, predictive behavior regarding popularity, the impacts of different catalog sampling strategies, and the contributions of ACARec components via ablation.
Artist Signal Amplification
The authors show that augmenting content-based cold-start methods with the artist mean embedding yields large accuracy improvements—often doubling or tripling Recall and NDCG over audio-only baselines—across datasets and splits (e.g., M4A-Onion, Yambda-50m). The artist mean itself is a highly effective predictor, and simple weighting with track popularity or audio similarity further boosts performance.
Figure 1: Incorporating ArtistMean into traditional cold-start baselines consistently increases accuracy across settings and datasets.
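The artist mean baseline and its weighted variants are simple enough to sketch directly. The snippet below is a hedged illustration: the names `artist_mean` and `score`, the toy vectors, and the use of a dot product for user-item scoring are assumptions, since the review does not specify the base CF model's scoring function.

```python
def artist_mean(catalog_cf, weights=None):
    """Weighted mean of an artist's hot-track collaborative embeddings.

    `weights` could be play counts (popularity weighting) or audio
    similarities to the cold track; None gives the plain mean.
    """
    if not catalog_cf:
        raise ValueError("no hot tracks: a fully cold artist has no prototype")
    if weights is None:
        weights = [1.0] * len(catalog_cf)
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(catalog_cf[0])
    return [sum(w * v[d] for w, v in zip(weights, catalog_cf)) / total
            for d in range(dim)]


def score(user_embedding, item_embedding):
    """Dot-product relevance score (assumed scoring function)."""
    return sum(u * i for u, i in zip(user_embedding, item_embedding))


# Toy catalog of three hot tracks in a 2-dimensional collaborative space.
catalog_cf = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
play_counts = [10, 30, 60]  # illustrative popularity weights

uniform = artist_mean(catalog_cf)
popularity_weighted = artist_mean(catalog_cf, weights=play_counts)
```

The cold track simply inherits the resulting vector as its collaborative embedding, so any downstream ranking code for hot tracks can be reused unchanged.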
ACARec outperforms all tested baselines, including DeepMusic and generative adversarial ranking methods augmented with the artist mean. The improvements are particularly pronounced in the artist discovery scenario (i.e., users encountering a new artist), with ACARec achieving 34.8% higher Discovery NDCG@20 on Yambda-50m and statistically significant gains in other splits.
Figure 2: Discovery Recall@20 and ACARec's improvement over baselines, stratified by user artist count quintile; gains are largest among users with greater artist diversity.
The performance advantage persists as user histories become more diverse (users in higher artist-count quintiles), which suggests better personalization for listeners with complex interests.
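For readers less familiar with the metrics quoted above, the following sketch shows one common way to compute NDCG@20 and a relative improvement. It assumes binary relevance and a log2 rank discount. The `discovery_relevant` filter reflects one plausible reading of the discovery setting (held-out tracks by artists absent from the user's training history). The function names are hypothetical and the paper's exact protocol may differ.

```python
import math


def ndcg_at_k(ranked_items, relevant_items, k=20):
    """Binary-relevance NDCG@k with a log2 rank discount."""
    relevant = set(relevant_items)
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked_items[:k])
              if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def discovery_relevant(heldout_tracks, track_artist, known_artists):
    """Keep only held-out tracks whose artist is new to the user (assumed definition)."""
    return {t for t in heldout_tracks if track_artist[t] not in known_artists}


def relative_gain(model_score, baseline_score):
    """Relative improvement, e.g. 0.348 corresponds to '34.8% higher'."""
    return (model_score - baseline_score) / baseline_score
```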
Popularity Prediction and Artist Bias
The study investigates whether model improvements arise from better estimation of cold-item popularity or from popularity bias. By stratifying tracks and artists into quintiles by popularity, the authors find that ACARec’s predictions are more evenly distributed across the interaction spectrum. ACARec accurately allocates recommendations to both head and tail items and artists, exceeding baselines especially among high-impact (popular) cold releases, without neglecting less popular ones.
Figure 3: Distribution of ACARec and baseline predictions across interaction and artist popularity quintiles. ACARec achieves more balanced discovery across popularity levels.
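The quintile analysis described above can be approximated as follows. This is a sketch only: it assumes equal-count quintiles assigned by popularity rank, and the helper names are hypothetical. The paper may define its bins differently, for example by cumulative interaction share.

```python
from collections import Counter


def popularity_quintiles(interaction_counts):
    """Map each item to a quintile: 0 = least popular, 4 = most popular."""
    # Sort by count, breaking ties by item id so the assignment is deterministic.
    ranked = sorted(interaction_counts, key=lambda i: (interaction_counts[i], i))
    n = len(ranked)
    return {item: min(4, pos * 5 // n) for pos, item in enumerate(ranked)}


def recommendation_share(recommended_items, quintile_of):
    """Fraction of recommendations that fall in each popularity quintile."""
    counts = Counter(quintile_of[i] for i in recommended_items if i in quintile_of)
    total = sum(counts.values())
    return [counts.get(q, 0) / total if total else 0.0 for q in range(5)]
```

Comparing the share vector of a model against that of the ground-truth interactions shows whether recommendations are skewed toward head items or spread across the popularity range.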
Artist Catalog Sampling and Model Robustness
Efficiently incorporating large artist catalogs can be challenging. Experiments show that, while small random subsets of an artist’s catalog suffice for training, using the top-20 most popular tracks at inference achieves near-maximum accuracy. This enables practical scaling while preserving recommendation quality.
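A minimal sketch of this sampling policy is shown below, with random subsets during training and the most popular tracks at inference. The function name, the `popularity` mapping, and the use of a single `k` for both modes are assumptions; the review states only that small random subsets suffice for training and that the top 20 by popularity work well at inference.

```python
import random


def select_catalog(track_ids, popularity, k=20, training=False, rng=None):
    """Pick at most k catalog tracks to feed the attention layers."""
    track_ids = list(track_ids)
    if len(track_ids) <= k:
        return track_ids  # small catalogs are used in full
    if training:
        # Training: a random subset of the catalog.
        rng = rng or random.Random()
        return rng.sample(track_ids, k)
    # Inference: the k most popular tracks, most popular first.
    return sorted(track_ids, key=lambda t: popularity[t], reverse=True)[:k]
```

Capping the catalog at `k` tracks keeps the attention cost bounded regardless of how prolific the artist is.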
Ablation Analysis
Component-wise ablation demonstrates that both the self-attention context over the artist catalog and the late concatenation of content and collaborative features are critical. Direct fusion without the artist mean, and residual methods that lack a learnable gating mechanism, both underperform the full ACARec design, which uses GRU-based fusion. Adaptively weighting the artist prototype against the content-conditioned context is therefore essential for optimal performance.
Theoretical and Practical Implications
This study substantiates the claim that most item cold-start problems in music recommendation are, in practice, "semi-cold": for 85–93% of new track consumption, an artist context with collaborative signal exists. By directly modeling and attending over this context, ACARec bridges the semantic gap between content and collaborative signals more effectively than prior approaches. Strong empirical results across datasets and granular user groupings confirm the centrality of artist-aware modeling for cold-start recommendation.
Theoretically, the work supports a shift in research focus: from treating cold tracks as isolated content vectors towards explicitly encoding hierarchical catalog relationships, with the artist as a primary connector of collaborative semantics. Practically, the ACARec paradigm is flexible: it is agnostic to the base CF model, tolerant of catalog size reductions, and robust across evaluation metrics and splits.
Future Research Directions
First, fully cold artist scenarios remain unresolved; hybrid techniques that dynamically synthesize artist prototypes from content or leverage generalized artist embeddings may close this gap. Second, modeling multi-artist collaborations, producer/influencer relationships, or integrating richer artist-level metadata (e.g., via graph neural architectures) are open areas for enhancing predictive power. Lastly, comprehensive comparison against LLM-based retrieval systems (which treat artist identities via text descriptions rather than catalog context) is warranted, especially regarding scalability and interpretability.
Conclusion
This paper demonstrates that leveraging artist catalog information provides substantial, quantifiable gains for cold-start music recommendation. The ACARec model formalizes and operationalizes artist-aware attention, advancing both the theoretical and practical state of the art. As music recommendation continues to evolve, explicit modeling of artist-track hierarchies is likely to become fundamental for cold-item personalization, artist discovery, and overall user satisfaction.