
Multi-Interest Recommendation Systems

Updated 19 October 2025
  • Multi-interest recommendation systems extract multiple embeddings per user to capture the full spectrum of diverse user behaviors.
  • They employ methods such as dynamic routing, attention-based extraction, and graph neural networks to cluster and aggregate user actions into distinct interest vectors.
  • Industrial deployments report significant improvements in metrics such as HitRate and Recall, demonstrating the practical value of multi-interest models in real-world settings.

A multi-interest recommendation system is designed to capture the complex, multifaceted nature of user preferences by extracting and maintaining multiple distinct representations (or embeddings) per user, rather than a single vector or profile. Such systems have emerged in response to the observation that individual users interact with items across heterogeneous categories, contexts, or topics, and that a single-vector user model is typically insufficient for both prediction accuracy and diversity in large-scale recommendation scenarios. Multi-interest modeling aims to disentangle these latent interests, enable fine-grained matching, and support robust, scalable retrieval and ranking in both industrial and academic settings.

1. Rationale and Foundations

The core motivation for multi-interest modeling is the diversity and volatility of user behavior, as well as the rich multi-faceted attributes of items in modern recommendation platforms (Li et al., 18 Jun 2025). Classical single-vector models—where a user's interactions are compressed into one dense representation—are inadequate for users whose preferences span disconnected categories or rapidly shift across contexts. For example, a user may simultaneously shop for electronics, sports apparel, and groceries within the same platform; compressing this behavior into a single vector invariably leads to information loss and poor candidate retrieval. Multi-interest models attempt to learn several user embeddings (often termed “interest capsules,” “interest vectors,” or “virtual interests”), each encoding a distinct underlying preference. This approach not only improves relevance (precision) but also helps ensure recommendation diversity and sometimes fairness, as it mitigates the bias against users with heterogeneous tastes (Zhao et al., 21 Feb 2024).

An additional appeal of this paradigm is its ability to support explainability (by linking recommendations to specific latent interests) and to improve retrieval scalability and robustness in production systems (Li et al., 2019, Cen et al., 2020, Meng et al., 14 Jul 2025).

2. Extraction and Aggregation Methodologies

The extraction of multiple interest representations from a user’s behavior sequence is a defining characteristic of multi-interest recommendation systems. The literature enumerates several representative extraction and aggregation modules:

  • Dynamic Routing (Capsule Networks): Inspired by Sabour et al.'s capsule networks, dynamic routing is frequently used to softly cluster historical user behaviors (viewed as “behavior capsules”) into $K$ interest “capsules” via an iterative routing process (Li et al., 2019, Cen et al., 2020). The assignment coefficients are computed over routing logits $b_{ij}$—often initialized randomly to avoid collapsed solutions—and updated via softmax and agreement signals, producing interest embeddings $\{h_j\}_{j=1}^{K}$ after a squash nonlinearity (see the first sketch after this list):

$h_j = \operatorname{squash}(s_j) = \dfrac{\|s_j\|^2}{1 + \|s_j\|^2} \cdot \dfrac{s_j}{\|s_j\|}$

where $s_j = \sum_i c_{ij} W_{ij} x_i$.

  • Attention-Based Extraction: Multi-head attention and self-attention mechanisms are used to learn “soft clusters” of behaviors, where each head or attended context corresponds to a latent interest (Cen et al., 2020, Chen et al., 2021). Given a behavior matrix $H$, interest vectors are extracted as (see the second sketch below):

$V_u = H \cdot \operatorname{softmax}\left(W_2^{\top} \tanh(W_1 H)\right)$

  • Graph-Based Approaches: Recent works inject graph neural network (GNN) modules to capture not only sequential but also higher-order and global dependencies among historical items (Chen et al., 2021, Tian et al., 2022). Node embeddings in a user behavior graph can be aggregated over multiple layers and then routed or clustered into interest vectors.
  • Aggregation with Label-Aware or Controllable Attention: Once interest embeddings are extracted, candidate retrieval and final scoring require flexible aggregation. Label-aware attention compares the candidate item embedding to each user interest and dynamically selects or weights relevant interests (Li et al., 2019, Zhang et al., 2022). Some systems employ a tunable “hardness” parameter to interpolate between soft and hard selection (illustrated in the third sketch below).
  • Diversity-Accuracy Control: Building on the basic max-pooling or weighted-sum approaches, some frameworks (e.g., ComiRec) introduce a controllable factor $\lambda$ to balance the trade-off between accuracy (matching score) and recommendation diversity:

$Q(u, \mathcal{S}) = \sum_{i \in \mathcal{S}} f(u, i) + \lambda \sum_{i, j \in \mathcal{S}} g(i, j)$

where $g(i, j)$ encodes inter-item category dissimilarity (Cen et al., 2020).
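
To ground the routing procedure, here is a minimal NumPy sketch of capsule-style dynamic routing. It assumes a single shared transform $W$ in place of per-pair $W_{ij}$, a fixed interest count $K$, and randomly initialized logits; all dimensions and the iteration count are illustrative rather than the exact MIND or ComiRec configuration.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """h = (||s||^2 / (1 + ||s||^2)) * (s / ||s||), applied per capsule."""
    norm_sq = np.sum(s * s, axis=axis, keepdims=True)
    norm = np.sqrt(norm_sq + eps)
    return (norm_sq / (1.0 + norm_sq)) * (s / norm)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_routing(X, W, K=4, iters=3, rng=None):
    """Softly cluster N behavior embeddings X (N, d) into K interest capsules (K, d)."""
    rng = rng or np.random.default_rng(0)
    U = X @ W                              # transformed behavior capsules, (N, d)
    b = rng.normal(size=(X.shape[0], K))   # routing logits b_ij, random init
    for _ in range(iters):
        c = softmax(b, axis=1)             # assignment coefficients over interests
        s = c.T @ U                        # s_j = sum_i c_ij (W x_i), (K, d)
        H = squash(s)                      # interest embeddings {h_j}
        b = b + U @ H.T                    # agreement update: b_ij += u_i . h_j
    return H

# Toy usage: 20 behaviors in a 16-dim space routed into 4 interests.
rng = np.random.default_rng(1)
X, W = rng.normal(size=(20, 16)), rng.normal(size=(16, 16)) / 4.0
print(dynamic_routing(X, W, K=4).shape)  # (4, 16)
```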
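
In the same spirit, a minimal sketch of the self-attentive extractor defined above; the attention width $d_a$, the head count $K$, and the row-per-behavior orientation of $H$ are assumptions made for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attentive_interests(H, W1, W2):
    """Extract K interest vectors from N behavior embeddings.

    H:  (N, d) behavior matrix, one row per historical item
    W1: (d, d_a) first attention projection
    W2: (d_a, K) one column per latent interest ("head")
    Returns V_u: (K, d), each a softly clustered weighted sum of behaviors.
    """
    scores = np.tanh(H @ W1) @ W2    # (N, K) unnormalized attention
    A = softmax(scores, axis=0)      # normalize over behaviors, per interest
    return A.T @ H                   # (K, d) interest vectors

rng = np.random.default_rng(0)
H = rng.normal(size=(20, 16))            # 20 behaviors, d = 16
W1 = rng.normal(size=(16, 32)) / 4.0     # d_a = 32 (assumed)
W2 = rng.normal(size=(32, 4)) / 4.0      # K = 4 interests
print(self_attentive_interests(H, W1, W2).shape)  # (4, 16)
```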
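
Finally, a sketch tying label-aware aggregation to the controllable objective $Q(u, \mathcal{S})$: a MIND-style hardness exponent $p$ weights interests by agreement with the candidate item, and a greedy loop approximately maximizes $Q$ with a simple binary, category-based $g(i, j)$. Both the exponent and the binary dissimilarity are illustrative choices rather than any paper's exact design.

```python
import numpy as np

def label_aware_score(item, interests, p=2.0):
    """Weight each interest by agreement with the candidate item.
    p interpolates between soft (p near 1) and hard (large p approaches
    picking the single best-matching interest) selection."""
    logits = interests @ item                  # (K,) interest-item affinities
    w = np.exp(p * (logits - logits.max()))    # stabilized softmax with hardness p
    w /= w.sum()
    return float((w @ interests) @ item)       # score under aggregated user vector

def greedy_diverse_selection(scores, cats, lam=0.1, top_n=10):
    """Greedily maximize Q(u,S) = sum_i f(u,i) + lambda * sum_{i,j} g(i,j),
    with g(i,j) = 1 when items i and j fall in different categories."""
    selected, remaining = [], list(range(len(scores)))
    while remaining and len(selected) < top_n:
        def gain(i):
            return scores[i] + lam * sum(cats[i] != cats[j] for j in selected)
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
interests, item = rng.normal(size=(4, 16)), rng.normal(size=16)
print(label_aware_score(item, interests, p=2.0))
print(greedy_diverse_selection(rng.normal(size=30), rng.integers(0, 5, size=30), lam=0.2, top_n=5))
```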

Table 1 below outlines several core mechanisms of extraction and aggregation:

| Extraction Method | Aggregation Strategy | Distinguishing Feature |
| --- | --- | --- |
| Dynamic routing (capsule) | Label-aware attention, max pooling | Iterative clustering, squash function |
| Self-attention | Attention-based fusion, weighted sum | Soft clustering, learnable attention weights |
| Graph convolution | Per-level pooling, max-pooling over layers | Multi-grained; captures global and local structure |
| Prompt-based adaptation | Mean+variance fusion, prompt attention | Adaptive prompts, centrality-dispersion encoding |

3. Technical and Theoretical Advances

Several modeling challenges are fundamental to multi-interest systems:

  • Interest Collapse: Without explicit regularization (e.g., contrastive losses (Zhang et al., 2022), quantization (Wu et al., 16 Oct 2025)), interest vectors often drift towards redundancy (collapse), reducing the diversity of the user representation. Some models (GemiRec (Wu et al., 16 Oct 2025)) deploy vector quantization to ensure strict separation among interests; a minimal sketch follows this list. The theoretical benefit is often formalized by showing that quantization induces hard partitions (Voronoi cells) with provable lower bounds on inter-interest distances.
  • Modeling Interest Evolution: Static extraction is insufficient for emerging or future interests not present in historical behaviors. Generative modules, such as interest generators based on transformers or GPT-like architectures, are incorporated to explicitly predict the next likely interest given the user’s trajectory and side information (Wu et al., 16 Oct 2025, Le et al., 8 Feb 2025).
  • Hierarchical and Category-Aware Structuring: Several approaches structure the set of interests hierarchically, either by recursively clustering behaviors into a tree (Pei et al., 2 Feb 2024) or by partitioning the representation space by item category, enabling fine-grained matching and efficient search (Meng et al., 14 Jul 2025).
  • Integration with LLMs: Emerging work combines LLMs for semantic labeling and fine-tuning of interest extraction, either in the auxiliary training phase (Qiao et al., 14 Nov 2024) or through hybrid architectures that fuse long-term behavioral and short-term semantic signals (Zhou et al., 15 Oct 2025).
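
As a concrete illustration of the quantization idea above, the sketch below snaps continuous interest vectors to their nearest entries in a codebook, which induces the hard Voronoi partition the text refers to. The codebook size and nearest-codeword assignment are generic vector-quantization ingredients, not GemiRec's exact design.

```python
import numpy as np

def quantize_interests(interests, codebook):
    """Assign each continuous interest vector to its nearest codeword.

    Hard assignment partitions the embedding space into Voronoi cells, so
    interests mapped to different codewords are strictly separated.
    interests: (K, d); codebook: (M, d). Returns (codes, quantized vectors).
    """
    # squared distances between every interest and every codeword, (K, M)
    d2 = ((interests[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = d2.argmin(axis=1)
    return codes, codebook[codes]

rng = np.random.default_rng(0)
interests = rng.normal(size=(4, 16))   # K = 4 raw interest vectors
codebook = rng.normal(size=(64, 16))   # M = 64 codewords (size assumed)
codes, quantized = quantize_interests(interests, codebook)
print(codes)  # discrete interest IDs, one per interest vector
```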

4. Empirical Results and Applications

Multi-interest models consistently outperform both classical single-vector systems and naïve multi-vector baselines on standard offline metrics (Recall@N, NDCG, Hit Rate) and in real-world deployments:

  • On TmallData, MIND achieved up to 65% improvement in HitRate over YouTube DNN (Li et al., 2019).
  • ComiRec-SA and ComiRec-DR improved Recall@50 by up to 8.65% over MIND in Alibaba's distributed cloud platform (Cen et al., 2020).
  • Models such as DMI (Diffusion Multi-Interest) demonstrate >14% Recall@20 gain over prior SOTA and deliver tangible click-through, diversity, and engagement improvements for hundreds of millions of daily active users (Le et al., 8 Feb 2025).
  • In online A/B tests, industrial deployments have documented substantial GMV gains (e.g., a 2.5% lift at Taobao via HCN (Yuan et al., 2022); 4.03% at Taobaomiaosha with ULIM (Meng et al., 14 Jul 2025)), along with improvements in advertiser engagement and cold-start performance (Zhou et al., 15 Oct 2025).

Applications span e-commerce (Tmall, Taobao, Lofter, Douyin), news (MIND dataset), micro-video sharing, check-in services, and educational platforms—where user interests are inherently diverse and time-evolving (Li et al., 18 Jun 2025).

5. Modeling Improvements: Robustness, Scalability, and Diversity

Research has focused on several methodological axes:

  • Scalability: Multi-interest models must scale to industrial settings with billions of users and items. Design patterns such as efficient dual-tower architectures, inner product retrieval, residual codebooks for history compression (Zhou et al., 15 Oct 2025), and parallelized nearest-neighbor search enable real-time candidate matching at scale (a retrieval sketch follows this list).
  • Efficiency and Latency: Partitioning retrieval pools by hierarchical interest or category, as in RimiRec (Pei et al., 2 Feb 2024) or ULIM (Meng et al., 14 Jul 2025), reduces candidate pool size and accelerates matching. Systems like Trinity (Yan et al., 5 Feb 2024) aggregate historical behavior into statistical histograms for efficient retrieval in large-scale online platforms.
  • Interest Diversity and Fairness: To address bias against users with diverse tastes, some frameworks explicitly monitor or penalize the similarity between per-user interest vectors (e.g., via average mutual redundancy (AMR) or contrastive regularization; see the redundancy sketch below) (Zhang et al., 2022, Wu et al., 16 Oct 2025). Additional fairness-aware formulations balance utility with equity so that users with high interest diversity receive recommendations of comparable quality (Zhao et al., 21 Feb 2024).
  • Explainability: The inherent structure of multi-interest representations (per-capsule, per-category, or per-latent cluster) allows for improved post-hoc or model-based explainability, as specific interest vectors can often be mapped to recognizable user behaviors or item groups (Li et al., 18 Jun 2025).
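
To make the retrieval pattern concrete, the sketch below runs per-interest inner-product search in a dual-tower setting and merges the union of candidates by best score. Brute-force NumPy search stands in for a production ANN index (e.g., Faiss), and the pool sizes are arbitrary.

```python
import numpy as np

def multi_interest_retrieve(interests, item_embs, per_interest_n=50, final_n=100):
    """Retrieve candidates per interest vector, then merge by max inner product.

    interests: (K, d) user interest vectors; item_embs: (M, d) item-tower
    embeddings. Each interest queries the corpus independently, mirroring
    parallelized nearest-neighbor search over the same index.
    """
    scores = interests @ item_embs.T                # (K, M) inner products
    merged = {}
    for k in range(scores.shape[0]):
        top = np.argpartition(-scores[k], per_interest_n)[:per_interest_n]
        for i in top:
            merged[i] = max(merged.get(i, -np.inf), scores[k, i])
    return sorted(merged, key=merged.get, reverse=True)[:final_n]

rng = np.random.default_rng(0)
interests = rng.normal(size=(4, 32))      # K = 4 interests
item_embs = rng.normal(size=(1000, 32))   # toy corpus of 1000 items
print(multi_interest_retrieve(interests, item_embs)[:10])
```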
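
A companion diagnostic for the redundancy monitoring mentioned above is the mean pairwise cosine similarity among a user's interest vectors; this is one simple reading of an AMR-style metric, and the precise definition in the cited work may differ.

```python
import numpy as np

def avg_mutual_redundancy(interests, eps=1e-9):
    """Mean pairwise cosine similarity among K interest vectors.
    Values near 1 signal collapsed (redundant) interests; usable as a
    monitoring metric or as a penalty term added to the training loss."""
    V = interests / (np.linalg.norm(interests, axis=1, keepdims=True) + eps)
    sim = V @ V.T                              # (K, K) cosine matrix
    off_diag = sim[~np.eye(sim.shape[0], dtype=bool)]
    return float(off_diag.mean())

print(avg_mutual_redundancy(np.random.default_rng(0).normal(size=(4, 16))))
```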

6. Open Challenges and Future Research

Ongoing challenges and proposed directions within multi-interest recommendation research include:

  • Adaptive Interest Extraction: Moving beyond a user-independent, fixed number of interests ($K$), there is active work on dynamically determining the optimal number or structure of user interests via density-based clustering or silhouette analysis (Li et al., 18 Jun 2025); a sketch of the latter follows this list.
  • Avoiding Representation Collapse: Empirical and theoretical studies indicate that hard quantization and generative models (as in GemiRec (Wu et al., 16 Oct 2025)) are more robust than soft regularization alone in preventing collapse of learned interests.
  • Leveraging Multi-Modal and Semantic Information: The integration of textual, visual, and categorical item features, often through LLM-based semantic modules, offers a promising path to reinforce multi-interest discovery and disambiguation (Qiao et al., 14 Nov 2024, Zhou et al., 15 Oct 2025).
  • Connecting Retrieval and Ranking: Bridging the technical divide between matching (retrieval) and ranking stages, especially by aligning interest extraction methodologies and leveraging cross-stage representations, remains a fruitful research focus (Meng et al., 14 Jul 2025).
  • Efficient Denoising and Diffusion Methods: Interest denoising via diffusion processes at the embedding or dimension level is a recent trend with promising results in enhancing granularity and discrimination of user representations (Le et al., 8 Feb 2025).
  • Explainability and Fairness: Further work is needed on interpretable routing of recommendations to interest representations and ensuring equitable experience across varying user profiles, particularly in large, category-rich datasets (Zhao et al., 21 Feb 2024, Pei et al., 2 Feb 2024).
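
As one plausible instantiation of the adaptive-extraction direction above, the sketch below picks a per-user interest count $K$ by silhouette analysis over that user's behavior embeddings; the candidate range and the use of scikit-learn's KMeans are assumptions, not a method from the cited survey.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def choose_num_interests(behavior_embs, k_range=range(2, 9), seed=0):
    """Return the K whose K-means clustering of the user's behavior
    embeddings maximizes the silhouette score."""
    best_k, best_score = None, -1.0
    for k in k_range:
        if k >= len(behavior_embs):       # silhouette needs k < n_samples
            break
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(behavior_embs)
        score = silhouette_score(behavior_embs, labels)
        if score > best_score:
            best_k, best_score = k, score
    return best_k

X = np.random.default_rng(0).normal(size=(60, 16))  # one user's behavior embeddings
print(choose_num_interests(X))
```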

7. Industrial Deployment and Impact

Multi-interest recommendation systems have been widely deployed in major production environments:

  • Tmall & Taobao (Alibaba): MIND (Li et al., 2019), ComiRec (Cen et al., 2020), ULIM (Meng et al., 14 Jul 2025), and related models handle major online traffic, support fast (<15ms) candidate retrieval among billions of items, and are critical to homepage and search ranking.
  • Douyin (TikTok): Trinity (Yan et al., 5 Feb 2024) and related systems drive multi-interest, long-tail, and long-term retrieval for billions of videos, with personalized statistics-based retrievers achieving user retention and engagement gains.
  • Lofter, Rednote: Hierarchical and quantized multi-interest systems (RimiRec (Pei et al., 2 Feb 2024), GemiRec (Wu et al., 16 Oct 2025)) deliver substantial improvements in user engagement and content diversity, with structural innovations such as the Interest Dictionary enabling industrial-scale, real-time multi-interest serving with minimal overhead.

The continued evolution of multi-interest modeling—spanning theoretical advances, scalable engineering, and integration of large generative or semantic models—suggests that multi-interest recommendation will remain both a core academic topic and a central pillar of operational recommender systems.
