
Cross-Network Matrix Factorization (xNetMF)

Updated 15 September 2025
  • xNetMF is a technique that extends traditional matrix factorization to jointly encode multiple network domains using both structural and auxiliary data.
  • It leverages high-order proximities via random walk sampling to capture multi-hop relationships, enhancing node representation and prediction accuracy.
  • The framework uses consensus constraints and scalable spectral sparsification, enabling effective cross-domain recommendations and robust link prediction.

Cross-Network Matrix Factorization (xNetMF) refers to a class of methodologies that extend matrix factorization (MF) techniques to jointly encode relational and auxiliary information across multiple network domains. These methods are motivated by the need to learn universal or transferable node representations that exploit multi-hop structures and auxiliary signals (content, labels, user overlaps) available in disparate but related networks. By integrating explicit matrix modeling, high-order proximity, and consensus constraints or propagation kernels, xNetMF provides a principled approach for cross-network embedding, recommendation, and link prediction.

1. Fundamental Principles and Mathematical Formulation

Cross-network MF starts from the canonical formulation of single-network MF but generalizes the objective to simultaneously factorize multiple networks or domains. Given a set of graphs $\{G^{(i)} = (V^{(i)}, E^{(i)})\}_{i=1}^N$ and associated node features or interaction data, xNetMF requires the construction of co-occurrence or interaction matrices $D^{(i)}$ for each network. These matrices encode not only direct links but also high-order proximities through aggregation over random walk sequences or powers of the transition matrix:

$$S^{l,(i)} = \sum_{k=1}^{l} P^{(i),k}$$

where $P^{(i)}$ denotes the normalized adjacency matrix of network $i$. The multi-network objective is then to find shared or coupled latent factors $W, S, Z$ such that

$$\min_{W, S, Z} \sum_{i=1}^N MF(D^{(i)}, F^{(i)T} S W^{(i)}) + g(Z)$$

subject to consensus or cross-domain constraints, such as $W^{(i)} = Z$ for shared nodes. The reconstruction loss $MF(\cdot,\cdot)$ is often derived from generalized Skip-Gram Negative Sampling (SGNS), which admits closed-form expressions for the negative-sampling expectation via sigmoid activations.
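
The high-order proximity aggregation $S^{l}$ can be sketched in a few lines of dense NumPy (a minimal illustration; the function name and the row-normalization convention are assumptions here, not taken from the cited papers):

```python
import numpy as np

def high_order_proximity(A, l):
    """Sum of transition-matrix powers, S^l = P + P^2 + ... + P^l.

    A: (n, n) adjacency matrix; l: maximum walk length (window size).
    """
    # Row-normalize the adjacency to obtain the transition matrix P.
    deg = A.sum(axis=1, keepdims=True)
    P = A / np.maximum(deg, 1e-12)
    S = np.zeros_like(P)
    Pk = np.eye(A.shape[0])
    for _ in range(l):
        Pk = Pk @ P   # P^k at iteration k
        S += Pk       # accumulate the sum of powers
    return S
```

Because each $P^{(i),k}$ is row-stochastic, every row of $S^{l,(i)}$ sums to $l$; real implementations replace the dense powers with sparse operations or sampled walks.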

2. High-Order Proximity and Random Walk Sampling

A cornerstone of xNetMF, as established in "Enhancing Network Embedding with Auxiliary Information: An Explicit Matrix Factorization Perspective" (Guo et al., 2017), is the use of matrices that reflect not only first-order (direct) network connectivity but also high-order (multi-hop) relationships. This is formalized by constructing a co-occurrence matrix $D$ whose $(i, j)$ entries count the frequency with which $v_i$ and $v_j$ co-occur within a window of size $l$ during sampled random walks (Algorithm A0):

$$l \cdot D^{nor} = S^{l}$$

where $D^{nor}$ is the row-normalized $D$, and $S^l$ is the sum of powers of $P$ up to $l$. Theoretical results show that repeated random walks yield a $D^{nor}$ approximating the rooted PageRank matrix with bounded $\ell_2$ error, capturing both local and global proximities.
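
The co-occurrence construction can be illustrated as follows (a simplified sketch, not the exact Algorithm A0: the adjacency-list input format, walk counts, and default parameters are illustrative choices):

```python
import random
import numpy as np

def cooccurrence_matrix(adj, n_walks=10, walk_len=40, window=5, seed=0):
    """Estimate the co-occurrence matrix D from sampled random walks.

    adj: adjacency list {node: [neighbors]}. Entry D[i, j] counts how
    often nodes i and j appear within `window` steps of each other.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    idx = {v: i for i, v in enumerate(nodes)}
    D = np.zeros((len(nodes), len(nodes)))
    for start in nodes:
        for _ in range(n_walks):
            walk, v = [start], start
            for _ in range(walk_len - 1):
                if not adj[v]:
                    break
                v = rng.choice(adj[v])   # uniform random-walk step
                walk.append(v)
            # Count symmetric co-occurrences inside the sliding window.
            for p, u in enumerate(walk):
                for q in range(p + 1, min(p + window + 1, len(walk))):
                    D[idx[u], idx[walk[q]]] += 1
                    D[idx[walk[q]], idx[u]] += 1
    return D, nodes
```

Row-normalizing the returned $D$ gives the empirical $D^{nor}$ whose convergence behavior the theoretical results above describe.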

This property is further generalized in frameworks such as "Just Propagate: Unifying Matrix Factorization, Network Embedding, and LightGCN for Link Prediction" (Liu, 2024), where model updates are written as:

$$X^{(m+1)} = H^{(m+1)} X^{(m)} = \{ c_1 I + c_2 \mathcal{P}_{a_1,b_1}(\tilde{A}) [ \mathcal{K}_+^{(m+1)}(A) - \lambda \mathcal{K}_-^{(m+1)}(B) ] \mathcal{P}_{a_1,b_1}(\tilde{A}) \} X^{(m)}$$

Here $\mathcal{P}_{a,b}$ aggregates powers of the normalized adjacency to encode walks of various lengths, ensuring that multi-hop context is integrated.

3. Incorporation of Content and Label Information

xNetMF frameworks jointly factorize structural and auxiliary content matrices. Each node is associated with a feature vector $f_c$ drawn from a content matrix $F$ (node features or item content), and the embedding is learned as $F^T S W$. Label information is integrated by modifying the sampling procedure: pairs of similarly labeled nodes receive boosted co-occurrence counts, directly enhancing intra-class similarity within $D$ (Algorithm A).

Objective functions are defined as:

$$L(W,S) = -\sum_i \sum_c \log P(d_{i,c} \mid f_c^T S w_i)$$

where the probabilities are parameterized by an upper bound $Q_{i,c}$ and a negative sampling ratio $k$, leveraging the equivalence between SGNS and explicit MF (the closed-form expectation for negatives is $Q_{i,c} \cdot \sigma(f_c^T S w_i)$).
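
The explicit-MF form of this objective can be evaluated directly (an illustrative sketch: the standard word2vec-style negative expectation $k \cdot \#(i) \cdot \#(c) / |D|$ is used here as a stand-in for the bound $Q_{i,c}$, which is an assumption rather than the paper's exact construction):

```python
import numpy as np

def sgns_mf_loss(D, F, S, W, k=5):
    """SGNS objective in explicit matrix-factorization form (sketch).

    D: (n, C) co-occurrence counts d_{i,c}; F: (d_f, C) content features;
    S: (d_f, d_w) interaction matrix; W: (d_w, n) node factors.
    """
    X = (F.T @ S @ W).T                 # X[i, c] = f_c^T S w_i
    total = D.sum()
    # Expected negative-sample counts Q_{i,c} (word2vec-style heuristic).
    Q = k * np.outer(D.sum(axis=1), D.sum(axis=0)) / total
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    # Positive pairs pull scores up; expected negatives push them down.
    return -(D * np.log(sig(X) + 1e-12)
             + Q * np.log(sig(-X) + 1e-12)).sum()
```

Minimizing this loss over $W$ and $S$ reproduces the SGNS embedding without any sampling, which is what makes the closed-form MF view possible.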

4. Consensus and Latent Factor Sharing across Networks

Recent advances in cross-domain recommendation, such as CDIMF ("Cross-Domain Latent Factors Sharing via Implicit Matrix Factorization", Samra et al., 2024), demonstrate explicit strategies for latent factor sharing. Multiple domains, each with its own interaction matrix $P^{(i)}$ and factors $(X^{(i)}, Y^{(i)})$, are coupled by constraints enforcing identical embeddings for shared entities:

$$\min \sum_{i=1}^N F_i(X^{(i)}, Y^{(i)}) \quad \text{subject to } X^{(1)} = X^{(2)} = \cdots = X^{(N)}$$

To solve this efficiently, the Alternating Direction Method of Multipliers (ADMM) is deployed, decomposing the problem into local per-domain updates (solved via alternating least squares, ALS) while iteratively enforcing the consensus constraint by proximal averaging of the latent variables. This approach is computationally tractable ($O(N d |U|)$ for the aggregation step), scales well, and supports privacy-aware distributed learning.
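
The consensus coupling can be illustrated by a single ADMM step (a sketch assuming scaled dual variables; the per-domain ALS solves that produce each $X^{(i)}$ are omitted, and the simple averaging here is one common proximal choice rather than CDIMF's exact update):

```python
import numpy as np

def admm_consensus_step(X_list, U_list):
    """One consensus step of ADMM for shared-user factors (sketch).

    X_list: local user-factor matrices X^(i) from each domain.
    U_list: scaled dual variables, one per domain.
    Returns the consensus variable Z and the updated duals.
    """
    # Consensus update: proximal average of local factors plus duals.
    Z = np.mean([X + U for X, U in zip(X_list, U_list)], axis=0)
    # Dual ascent: accumulate each domain's disagreement with Z.
    U_list = [U + X - Z for X, U in zip(X_list, U_list)]
    return Z, U_list
```

Each domain only ever ships its factor matrix (plus dual) to the aggregator, which is what enables the privacy-aware distributed variant.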

5. Unified Propagation Kernel Perspective

The kernel-based propagation view, as analyzed in (Liu, 2024), provides a meta-framework for interpreting xNetMF. Rather than treating MF, network embedding, and Graph Neural Networks (GNNs) as isolated model families, this perspective applies iterative propagation of node representations via a combination of identity, adjacency, and high-order kernels (positive/negative link kernels):

$$X^{(m+1)} = \{ c_1 I + c_2 \sum_{i} w_i \mathcal{P}_{a_1,b_1}(\tilde{A}^{(i)}) [ \mathcal{K}_+^{(m+1)}(A^{(i)}) - \lambda \mathcal{K}_-^{(m+1)}(B^{(i)}) ] \mathcal{P}_{a_1,b_1}(\tilde{A}^{(i)}) \} X^{(m)}$$

with appropriate weights $w_i$. This formulation facilitates cross-network aggregation, balancing positive signal propagation against negative sampling for regularization. High-order proximity aggregation, normalization, and kernel structure are shown empirically and theoretically to govern embedding expressiveness and stability.
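
A single-network instance of this update rule can be written as a toy dense sketch (an illustration, not the paper's implementation: $\mathcal{P}_{a,b}$ is taken here as $\sum_{k=a}^{b} \tilde{A}^k$, and the kernels $\mathcal{K}_\pm$ are passed in as plain matrices):

```python
import numpy as np

def propagate(X, A_tilde, Kpos, Kneg, c1=0.5, c2=0.5, lam=0.1, a=1, b=2):
    """One propagation-kernel update X^(m+1) = H X^(m) (sketch).

    A_tilde: normalized adjacency; Kpos/Kneg: positive/negative link
    kernels; c1, c2 mix the identity and propagation terms.
    """
    n = A_tilde.shape[0]
    # P_{a,b}: aggregate walk lengths a..b of the normalized adjacency.
    Pab = sum(np.linalg.matrix_power(A_tilde, k) for k in range(a, b + 1))
    # H combines an identity (residual) term with kernel propagation.
    H = c1 * np.eye(n) + c2 * Pab @ (Kpos - lam * Kneg) @ Pab
    return H @ X
```

Iterating this map (with per-iteration kernels, as in the superscripts $(m+1)$) recovers MF-, embedding-, and LightGCN-style updates as special cases of the choice of $c_1, c_2$, kernels, and walk range.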

6. Scalability, Efficiency, and Empirical Performance

Scalable explicit factorization is enabled via spectral sparsification, as developed in "NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization" (Qiu et al., 2019). Instead of dense co-occurrence matrices ($O(n^2)$ nonzeros), NetSMF constructs sparse proxies ($O(n \log n)$ nonzeros) that are $(1+\varepsilon)$-spectrally similar to the dense version, using randomized path-sampling algorithms. The downstream embedding is performed via truncated randomized SVD.
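
The final SVD step can be sketched with a basic randomized range finder (a dense stand-in for illustration; NetSMF itself operates on sparse matrices, and the oversampling constant here is an arbitrary choice):

```python
import numpy as np

def svd_embedding(M, dim, seed=0):
    """Node embedding via truncated randomized SVD of a proximity matrix M.

    Returns U_d * sqrt(Sigma_d), the usual MF-style embedding.
    """
    rng = np.random.default_rng(seed)
    # Randomized range finder: sketch M's column space, orthonormalize.
    G = rng.standard_normal((M.shape[1], dim + 8))   # oversampled probe
    Q, _ = np.linalg.qr(M @ G)
    # SVD of the small projected matrix recovers the leading factors.
    U, s, _ = np.linalg.svd(Q.T @ M, full_matrices=False)
    U = Q @ U[:, :dim]
    return U * np.sqrt(s[:dim])
```

For a symmetric positive semidefinite proximity matrix of rank at most `dim`, the returned embedding reconstructs the matrix exactly, which is the regime sparsification aims to approximate.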

Empirical evaluation on tasks such as semi-supervised node classification and link prediction (e.g., datasets Citeseer, Cora, Pubmed, Facebook) demonstrates pronounced gains from joint factorization of structure and content. Reported metrics include classification accuracy (up to 81.5%), AUC (up to 0.956), and MAP, consistently outperforming DeepWalk, node2vec, TADW, HSCA, and GCN. In cross-domain scenarios, models like CDIMF yield competitive NDCG and Hit Ratio, especially boosting cold-start performance.

7. Applications and Extensions

xNetMF models are applicable in domains including:

  • Semi-supervised and unsupervised node classification, leveraging auxiliary labels and content.
  • Link prediction, benefiting from multi-hop proximity aggregation and content-feature factorization.
  • Knowledge transfer across networks or recommender domains (cross-domain recommendation), with shared latent factors improving cold-start and coverage.
  • Scalable embedding in large-scale networks, enabled by sparsification and distributed consensus algorithms.

A plausible implication is that future xNetMF frameworks may further integrate propagation kernels across heterogeneous networks, with adaptive weighting and negative sampling strategies, as outlined by unified update rules.


In summary, Cross-Network Matrix Factorization unifies multi-network embedding by explicit modeling of high-order proximities, auxiliary content and label integration, and consensus-driven latent sharing, all made scalable via recent advances in spectral sparsification and propagation kernels. Its design bridges matrix factorization, network embedding, and GNNs, providing a flexible foundation for transfer learning and predictive analytics in multi-relational settings.
