Cross-Domain Recommendation Strategies

Updated 2 May 2026

Cross-Domain Recommendation is a set of techniques that transfers user and item preference signals across diverse domains to address cold-start and data sparsity issues.
It employs mapping-based, alignment-fusion, and universal-transfer paradigms to integrate heterogeneous interactions and improve recommendation accuracy.
Recent advances demonstrate significant gains using graph neural networks, hyperbolic embeddings, and causality-aware methods to optimize cross-domain performance.

Cross-Domain Recommendation (CDR) refers to the class of recommendation methodologies that aim to transfer user or item preference signals across multiple, often heterogeneous, domains to mitigate data sparsity, cold-start issues, and domain isolation effects that limit the performance of conventional recommender systems. CDR models leverage auxiliary knowledge from source domains—possibly via shared users, items, or complex external features—to enhance prediction accuracy in a target domain or simultaneously across all participating domains. The field encompasses a diverse set of formalizations, methodological innovations (including alignment, mapping, dual learning, and manifold-based approaches), and a variety of application scenarios ranging from e-commerce to industrial-scale online advertising.

1. Formal Definitions, Taxonomy, and Problem Settings

The canonical CDR setup involves two or more domains, each with its own set of users, items, and observed interactions. Let $D^A = \{\mathcal U^A, \mathcal I^A, R^A\}$ and $D^B = \{\mathcal U^B, \mathcal I^B, R^B\}$, where the interaction matrices $R^A$ and $R^B$ are typically incomplete and highly sparse. Scenarios are differentiated along the user-overlap and item-overlap axes, ranging from no overlap through partial overlap (some shared users or items) to full overlap. Tasks are classified by whether recommendations are sought intra-domain (within a single domain), inter-domain (cross-domain or cold-start), or multi-target (all domains improved jointly) (Zang et al., 2021).

A structured overview of typical CDR scenarios is shown below:

Scenario	User Overlap	Item Overlap	Example Methods
Non-overlap	none	none	CBT, RMGM, PCLF
Partial-user	partial	none	CMF, EMCDR, DML
Full-user	full	none	CMF, CoNet, DDTCDR
Partial/mixed	partial	partial/full	Advanced graph & tensor models
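
The overlap taxonomy above can be made concrete with a small sketch. The `Domain` container, the `(user_id, item_id) -> rating` storage layout, and the rule that "full" overlap means identical ID sets are all illustrative assumptions, not definitions taken from the cited surveys.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """One domain D = {U, I, R}: user IDs, item IDs, and sparse interactions."""
    users: set
    items: set
    # Hypothetical storage layout: (user_id, item_id) -> rating
    ratings: dict = field(default_factory=dict)

    def density(self) -> float:
        """Fraction of the |U| x |I| matrix that is actually observed."""
        return len(self.ratings) / (len(self.users) * len(self.items))


def overlap_level(a: set, b: set) -> str:
    """Classify two ID sets as 'none', 'partial', or 'full' overlap."""
    shared = a & b
    if not shared:
        return "none"
    # Assumption: 'full' means the two ID sets coincide exactly.
    return "full" if a == b else "partial"


def scenario(dom_a: Domain, dom_b: Domain) -> tuple:
    """Return (user_overlap, item_overlap) for a pair of domains."""
    return (overlap_level(dom_a.users, dom_b.users),
            overlap_level(dom_a.items, dom_b.items))


books = Domain({"u1", "u2", "u3"}, {"b1", "b2"},
               {("u1", "b1"): 5.0, ("u3", "b2"): 3.0})
movies = Domain({"u2", "u3", "u4"}, {"m1", "m2"},
                {("u2", "m1"): 4.0})

print(scenario(books, movies))    # ('partial', 'none')
print(round(books.density(), 3))  # 0.333
```

In this toy pair, two of the users appear in both domains and no items are shared, which corresponds to the partial-user row of the table.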

Critical tasks include inter-domain cold-start recommendation for new users (no activity in target), data-scarce domains (few interactions), and universal CDR optimizing for all domains in arbitrary overlap regimes (Zang et al., 2021, Cao et al., 2024).

2. Methodological Foundations: Mapping, Alignment, and Fusion

The methodological core of CDR divides into mapping-based, alignment-fusion, and universal-transfer paradigms.

Mapping-Based Approaches: These train neural or linear mappings $f_\theta$ to align the feature spaces of users (and sometimes items) observed in overlapping sets. For example, EMCDR and DCDCSR first obtain domain-specific embeddings via matrix factorization, then learn $f_\theta$ on overlapping users to map source-domain user factors into the target domain; predictions for a cold-start user are made from the mapped source profile (Zhu et al., 2020). Sharpness-aware mapping (SCDR) adds a local-maximization term to encourage flat minima and generalization, with formal PAC-Bayes guarantees (Zeng et al., 2024). Orthogonal constraints can dramatically reduce the number of overlapping users required, as in Dual Metric Learning (DML), which enforces $X X^T = I$ on the mapping matrix $X$ and couples mono-domain and cross-domain loss terms, facilitating two-way knowledge transfer (Li et al., 2021).
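
A minimal sketch of the mapping idea, under several assumptions: random matrices stand in for the matrix-factorization embeddings, $f_\theta$ is restricted to a linear map fitted by ordinary least squares, and the orthogonal variant uses the classical orthogonal Procrustes solution rather than the training procedure of DML. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, dim = 200, 8

# Stand-ins for per-domain user factors (normally produced by matrix factorization).
U_src = rng.normal(size=(n_users, dim))
W_true = rng.normal(size=(dim, dim))           # synthetic ground-truth relation
U_tgt = U_src @ W_true + 0.01 * rng.normal(size=(n_users, dim))

overlap = np.arange(150)      # users observed in both domains
cold = np.arange(150, 200)    # users with source-domain history only

# Fit a linear f_theta on overlapping users: minimize ||U_src W - U_tgt||_F.
W_hat, *_ = np.linalg.lstsq(U_src[overlap], U_tgt[overlap], rcond=None)

# Map cold-start users into the target space and score target items.
U_cold_pred = U_src[cold] @ W_hat
V_tgt = rng.normal(size=(50, dim))             # stand-in target item factors
scores = U_cold_pred @ V_tgt.T
top5 = np.argsort(-scores, axis=1)[:, :5]      # top-5 item indices per cold user

rel_err = np.linalg.norm(U_cold_pred - U_tgt[cold]) / np.linalg.norm(U_tgt[cold])


def orthogonal_map(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: the orthogonal Q minimizing ||src Q - tgt||_F."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Fit an orthogonal map from only 16 overlapping users.
Q = orthogonal_map(U_src[overlap[:16]], U_tgt[overlap[:16]])
print(np.allclose(Q @ Q.T, np.eye(dim)))       # True: the constraint Q Q^T = I holds
```

The orthogonal map has fewer effective degrees of freedom than an unconstrained matrix, which is one intuition for why such constraints may need fewer overlapping users; the synthetic data here is not designed to demonstrate that effect.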

Alignment and Fusion: When user or item overlap is partial, feature aggregation across domains is performed via attention, pooling, or explicit contrastive mechanisms. CAT-ART generates a global user embedding via a contrastive autoencoder trained on all domains, then fuses domain-specific and attentive-transferred representations per domain, robustly controlling negative transfer (Li et al., 2022). COAST constructs a unified cross-domain graph and defines cross-domain message passing and two fine-grained interest alignment losses (user-user and user-item), enforcing high-order cross-domain similarity and mutual gradient alignment (Zhao et al., 2023).
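
As a rough illustration of attention-based fusion, the sketch below weights a user's source-domain embeddings by scaled dot-product similarity to the target-domain embedding and mixes the result back in. This is generic attention, not the CAT-ART or COAST architecture; the function names and the fixed `mix` coefficient are assumptions made for the example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fuse(target_emb, source_embs, mix=0.5):
    """Attend over source-domain embeddings, then blend with the target one.

    Returns (fused_embedding, attention_weights).
    """
    scale = math.sqrt(len(target_emb))
    weights = softmax([dot(target_emb, s) / scale for s in source_embs])
    transferred = [
        sum(w * s[k] for w, s in zip(weights, source_embs))
        for k in range(len(target_emb))
    ]
    fused = [(1 - mix) * t + mix * x for t, x in zip(target_emb, transferred)]
    return fused, weights


# One user: a target-domain embedding and embeddings from two source domains.
target = [1.0, 0.0]
sources = [[1.0, 0.0], [-1.0, 0.0]]   # first source agrees, second disagrees
fused, weights = fuse(target, sources)
print(weights[0] > weights[1])        # True: the agreeing domain gets more weight
```

Down-weighting a source domain whose representation disagrees with the target is one simple way such fusion can limit negative transfer.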

At a higher abstraction, universal CDR models like UniCDR+ disentangle domain-shared and domain-specific representations, with multi-behavior, tower-based prediction heads and contrastive objectives to isolate common signals from domain-private noise. The approach is generalizable to arbitrary domain/task arrangements (user overlap, item overlap, sequential, static) and industrial deployment (Cao et al., 2024).

3. Graph Signal, Topological, and Geometric Models

Advanced CDR models exploit the underlying graph or manifold structure of domains.

Graph Neural Networks (GNNs) and Graph Signal Processing (GSP): Cross-domain GNNs augment classic bipartite message passing with cross-graph interactions (e.g., unified graphs with overlapping users, heterogeneous item nodes) (Zhao et al., 2023). CGSP frames CDR as processing on a cross-domain similarity graph, constructing a weighted combination of intra- and inter-domain item similarities, with closed-form graph filtering that eliminates the need for explicit training and is robust to user-overlap (Lee et al., 2024).
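
The training-free flavor of graph filtering can be sketched with a simple linear filter. The code below is a simplified illustration, not the CGSP filter itself: it assumes the two interaction matrices share the same (overlapping) user rows, uses degree-normalized co-occurrence as item similarity, and blends intra- and inter-domain similarity with a hypothetical weight `alpha`.

```python
import numpy as np


def normalize(R: np.ndarray) -> np.ndarray:
    """Symmetric degree normalization: D_u^{-1/2} R D_i^{-1/2}."""
    R = R.astype(float)
    du = R.sum(axis=1, keepdims=True)
    di = R.sum(axis=0, keepdims=True)
    du[du == 0] = 1.0   # avoid division by zero for empty rows
    di[di == 0] = 1.0   # ... and empty columns
    return R / np.sqrt(du) / np.sqrt(di)


def cross_domain_scores(R_src, R_tgt, r_src_user, r_tgt_user, alpha=0.5):
    """Score target items for one user with a closed-form linear filter."""
    Rs, Rt = normalize(R_src), normalize(R_tgt)
    S_intra = Rt.T @ Rt   # target item x target item similarity
    S_inter = Rs.T @ Rt   # source item x target item, via shared users
    return (1 - alpha) * r_tgt_user @ S_intra + alpha * r_src_user @ S_inter


# Rows are the same four overlapping users in both matrices.
R_src = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 1]])
R_tgt = np.array([[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]])

# A cold-start user: liked source item 0, no target-domain history.
scores = cross_domain_scores(R_src, R_tgt,
                             r_src_user=np.array([1.0, 0.0, 0.0]),
                             r_tgt_user=np.zeros(3))
print(int(np.argmax(scores)))   # 0
```

Users who interacted with source item 0 mostly chose target item 0, so the filter ranks that item first without any gradient-based training.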

Motif and Topology-Based Universal CDR: MOP captures domain-agnostic signals by sampling triads and bicliques ("motifs") that recur across domains, encoding shared topology using a hybrid of hypergraph convolution and expert Transformers. Prompt-tuning enables domain adaptation without negative transfer, explicitly unifying pre-training and recommendation under a motif similarity learning loss (Hao et al., 2023).

Hyperbolic Representation and Manifold Alignment: To address long-tail and hierarchical structures exacerbated by domain aggregation, hyperbolic methods like HCTS separately embed each domain on a distinct manifold, perform domain-specific GNN encoding, and transfer knowledge via tangent-space alignment and multi-type hyperbolic contrastive learning—achieving significant gains on both head and tail items (Yang et al., 2024).
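
The geometric operations behind tangent-space transfer can be sketched on the Poincaré ball with curvature $-1$. The exponential and logarithmic maps at the origin and the distance formula below are the standard ones for that model; the `transfer` helper, which applies a linear map in the tangent space, is a hypothetical stand-in for an alignment step and is not the HCTS implementation.

```python
import math


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def exp0(v):
    """Exponential map at the origin: tangent vector -> point in the ball."""
    n = norm(v)
    if n == 0:
        return list(v)
    return [math.tanh(n) / n * x for x in v]


def log0(x):
    """Logarithmic map at the origin: point in the ball -> tangent vector."""
    n = norm(x)
    if n == 0:
        return list(x)
    return [math.atanh(n) / n * xi for xi in x]


def poincare_dist(x, y):
    """Geodesic distance between two points of the unit Poincare ball."""
    diff2 = sum((a - b) ** 2 for a, b in zip(x, y))
    denom = (1 - norm(x) ** 2) * (1 - norm(y) ** 2)
    return math.acosh(1 + 2 * diff2 / denom)


def transfer(x, W):
    """Hypothetical alignment: log-map, apply linear map W, exp-map back."""
    v = log0(x)
    mapped = [sum(w * vi for w, vi in zip(row, v)) for row in W]
    return exp0(mapped)


point = exp0([0.3, 0.4])                     # tangent vector of Euclidean norm 0.5
print(round(poincare_dist([0.0, 0.0], point), 6))   # 1.0

identity = [[1.0, 0.0], [0.0, 1.0]]
moved = transfer(point, identity)            # identity alignment leaves the point fixed
```

Distances grow rapidly near the boundary of the ball, which is the property that makes hyperbolic space a natural fit for hierarchical and long-tail structure.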

4. Cross-Domain Sequential Recommendation

Cross-domain sequential recommendation (CDSR) generalizes standard CDR to harness time-ordered interactions across domains. The problem is formally articulated via a 4-dimensional tensor $\Gamma \in \mathbb{R}^{U \times I \times T \times D}$, whose axes index users, items, time steps, and domains; learning objectives include predicting the next item in any domain given cross-domain histories (Chen et al., 2024).
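
To make the tensor view concrete, the sketch below stores $\Gamma$ sparsely as a dictionary keyed by (user, item, time, domain) and derives next-item training examples from each user's merged cross-domain history. The event log, identifiers, and example format are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical event log: (user, item, time step, domain)
events = [
    ("u1", "book_3", 1, "books"),
    ("u1", "film_7", 2, "movies"),
    ("u1", "book_9", 3, "books"),
    ("u2", "film_7", 1, "movies"),
    ("u2", "film_2", 2, "movies"),
]

# Sparse form of the 4-D tensor: only observed entries are stored.
gamma = {(u, i, t, d): 1.0 for u, i, t, d in events}


def next_item_examples(events):
    """Build (user, cross-domain history, target domain, target item) tuples."""
    by_user = defaultdict(list)
    for u, i, t, d in events:
        by_user[u].append((t, d, i))
    examples = []
    for u, seq in by_user.items():
        seq.sort()  # order each user's interactions by time across all domains
        for k in range(1, len(seq)):
            history = [(d, i) for _, d, i in seq[:k]]
            _, d, i = seq[k]
            examples.append((u, history, d, i))
    return examples


examples = next_item_examples(events)
print(examples[0])
# ('u1', [('books', 'book_3')], 'movies', 'film_7')
```

Here the first example asks a model to predict a movie from a history that contains only a book, which is exactly the cross-domain signal a single-domain sequential model would never see.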

Recent methods address the challenge of minimal user overlap (the dominant regime in real applications). For example, CDSRNP formulates the problem as a Neural Process (NP), using the few overlapping users as a support set to learn a cross-domain latent prior and explicitly aligning non-overlapping users to this prior via variational inference and cross-attention between user sequence embeddings, which substantially improves accuracy even under extremely low overlap (Li et al., 2024). Foundation-model-based architectures such as X-Cross dynamically integrate layerwise representations from several LoRA-adapted domain-specific LLMs, achieving better parameter and data efficiency than naive fine-tuning baselines (Hadad et al., 2025).

5. Causality, Fairness, and Negative Transfer

Theoretical and algorithmic advances have increasingly prioritized reducing spurious transfer and ensuring fairness.

Causality Enhancement: CE-CDR introduces a formal causal graphical model of cross-domain recommendation, arguing that only a subset of source behaviors causally influence target engagement. It heuristically constructs a causality-aware partial-label dataset (leveraging content and behavioral similarity) and trains on an unbiased Partial Label Causal Loss combined with a learned propensity model. The approach enhances transferability and is validated both offline and at production scale (Wu et al., 2025).
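
The role of a propensity model in debiasing can be illustrated with generic inverse-propensity weighting of a binary cross-entropy loss. This is a textbook sketch, not the Partial Label Causal Loss of CE-CDR; the function name, the clipping threshold, and the toy numbers are assumptions.

```python
import math


def ips_bce(preds, labels, propensities, clip=0.05):
    """Binary cross-entropy with each example weighted by 1 / propensity.

    Propensities are clipped from below to keep the weights bounded.
    """
    total = 0.0
    for p, y, q in zip(preds, labels, propensities):
        weight = 1.0 / max(q, clip)
        p = min(max(p, 1e-7), 1 - 1e-7)   # guard against log(0)
        total += weight * -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(preds)


preds = [0.9, 0.2]
labels = [1, 0]

plain = ips_bce(preds, labels, [1.0, 1.0])     # reduces to ordinary BCE
weighted = ips_bce(preds, labels, [0.5, 1.0])  # first example was under-exposed
print(weighted > plain)                        # True
```

Examples that were unlikely to be observed receive larger weights, so the loss approximates what would have been measured under uniform exposure; clipping trades a small bias for lower variance.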

Negative Transfer Control: Multiple universal and alignment-based methods diagnose and suppress negative transfer (performance degradation from non-informative or noisy source signals) via contrastive loss, attention-based representation gating, or prompt tuning (Li et al., 2022, Hao et al., 2023). Domain-aware cross-attention explicitly focuses only on the subset of source behaviors most predictive for the target, enabling industrial fine-tuning in new domains with minimal retraining (Luo et al., 2024).

Fairness: Fairness-aware models such as FairCDR adjust mapping functions to achieve user-oriented group fairness in the target domain, employing data reweighting based on influence functions, and leveraging non-overlapping data to mitigate representation bias (Tang et al., 2023).

6. Experimental Validation, Applications, and Datasets

CDR has been extensively benchmarked on public datasets (Amazon, Douban, MovieLens, Netflix, Book-Crossing, etc.) and deployed in large-scale production environments (JD.com, Kuaishou Living Room, major online advertising platforms).

Key experimental findings include:

Dual metric learning and orthogonal mapping approaches can surpass prior state-of-the-art with as few as 8–16 overlap users, reducing the dependence on overlap by orders of magnitude (Li et al., 2021).
Hyperbolic contrastive learning provides gains of 5–15% in retrieval metrics over both classic and deep Euclidean CDR baselines, with especially large effects in the long tail (Yang et al., 2024).
Motif-based prompt frameworks yield consistent improvements (+1–2% HR@10, NDCG@10) over multi-task learning or plain pretrain-finetune (Hao et al., 2023).
Cross-domain sequential models leveraging neural processes or dynamic LLM integration attain higher accuracy and better data and parameter efficiency than prior CDSR and transfer learning models (Li et al., 2024, Hadad et al., 2025).
Causality-aware methods yield relative accuracy gains of up to 10% and have been adopted at industrial scale without increasing inference latency (Wu et al., 2025).
Cold-start and sparsity robustness is observed across all method classes via explicit transfer (mapping) or alignment of domain-shared representations.
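
The HR@10 and NDCG@10 figures quoted above are typically computed under a leave-one-out protocol with a single held-out item per user. A minimal sketch under that assumption follows; the function names and toy rankings are illustrative.

```python
import math


def hit_rate_at_k(ranked, target, k=10):
    """1.0 if the held-out item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def ndcg_at_k(ranked, target, k=10):
    """NDCG with one relevant item: 1 / log2(rank + 1), rank starting at 1."""
    top = ranked[:k]
    if target not in top:
        return 0.0
    return 1.0 / math.log2(top.index(target) + 2)


def evaluate(recommendations, held_out, k=10):
    """Average both metrics over users."""
    users = list(held_out)
    hr = sum(hit_rate_at_k(recommendations[u], held_out[u], k) for u in users)
    ndcg = sum(ndcg_at_k(recommendations[u], held_out[u], k) for u in users)
    return hr / len(users), ndcg / len(users)


recommendations = {"u1": ["a", "b", "c"], "u2": ["c", "a", "b"]}
held_out = {"u1": "b", "u2": "d"}

hr, ndcg = evaluate(recommendations, held_out, k=2)
print(hr)              # 0.5
print(round(ndcg, 4))  # 0.3155
```

With a single relevant item the ideal DCG is 1, so NDCG reduces to the discounted gain at the position of the hit.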

A representative subset of public datasets commonly used is shown below:

Dataset	Domains	Overlap Type	Interaction Type
Amazon	Books, Movies, Music	Partial user/item	Ratings, reviews
Douban	Movies, Music, Books	Partial user	Ratings, reviews
MovieLens	Movie genres	None, synthetic	Ratings
Book-Crossing	Books	None	Ratings
Netflix	Movies	None, synthetic	Ratings
Industrial	Ads, short-video etc.	High user overlap	Implicit, multi-action

7. Open Challenges and Future Directions

Current and emerging challenges in the field include:

Scalability & Efficiency: Enabling CDR models to scale efficiently to billions of interactions and multi-domain graphs, often necessitating approximation, sampling, or modular training (Zang et al., 2021).
Heterogeneous and Non-overlapping Settings: Extending transfer-learning and alignment methods to settings with no user or item overlap, relying on semantic, contextual, or external KG-based bridges (Zhang et al., 2022).
Multi-Domain and Universal CDR: Developing frameworks that unify transfer across arbitrary numbers of domains and recommendation tasks, supporting both intra- and inter-domain, sequential or static interactions (Cao et al., 2024, Hao et al., 2023).
Fairness, Privacy, and Interpretability: Ensuring that cross-domain transfer remains fair, resistant to bias amplification, protective of sensitive user information, and interpretable at scale (Tang et al., 2023, Zang et al., 2021).
Hybrid and Manifold Methods: Combining hyperbolic, Euclidean, and possibly spherical representations for universal item/user modeling. Learnable curvature, dynamic geometry adaptation, and multimodal extensions are promising avenues (Yang et al., 2024).
Integration of Large Foundation Models: Efficiently leveraging large-scale (language, vision, graph) foundation models for universal cross-domain recommendation, with prompt- or adapter-based adaptation (Hadad et al., 2025, Hao et al., 2023).
Robustness and Causal Guarantees: Developing causal inference frameworks, negative transfer diagnostics, and theoretically justified objectives to ensure robust knowledge transfer even when auxiliary domains differ substantially (Wu et al., 2025).

Progress in these areas will drive further improvement in leveraging heterogeneous, sparse, and dynamic user–item data across real-world multi-domain platforms.