FourierKAN-GCF: Graph Collaborative Filtering
- FourierKAN-GCF is a graph-based recommendation architecture that employs Fourier-parameterized nonlinear transformations to model user–item interactions during message passing.
- It balances the expressive power of neural models with the simplicity of aggregation, leveraging the Kolmogorov–Arnold theorem for universal approximation.
- Empirical results show state-of-the-art performance on benchmarks like MOOC and Amazon Games, with improved Recall@20 and NDCG@20 metrics.
FourierKAN-GCF denotes a class of graph-based recommendation architectures that integrate nonlinear feature transformations based on the Kolmogorov–Arnold theorem, instantiated via a Fourier parameterization, into the message-passing process of graph collaborative filtering models. This design balances the representational capacity of earlier neural graph collaborative filtering approaches with the stability and simplicity of aggregation-centric models. FourierKAN-GCF achieves state-of-the-art accuracy and robustness on collaborative filtering tasks for large-scale, sparse, user–item implicit-feedback graphs (Xu et al., 2024).
1. Foundations and Theoretical Motivation
Graph collaborative filtering (GCF) models exploit the topology of the user–item bipartite interaction graph to propagate user and item embeddings across local neighborhoods. Early approaches such as NGCF (Neural Graph Collaborative Filtering) conduct layer-wise message passing, where each layer's update includes both a self-feature transformation and an explicit interaction term (the element-wise product of user and item embeddings passed through a learned transformation). LightGCN removes both the feature transformation (weight matrix) and the nonlinear activation, retaining only normalized neighbor aggregation. Empirical ablation studies have demonstrated that while removing the self-feature transform (weight matrix $\mathbf{W}_1$) does not degrade accuracy, eliminating the interaction transform (weight matrix $\mathbf{W}_2$) or the interaction term itself diminishes performance, underscoring the value of nonlinear interaction modeling (Xu et al., 2024).
Kolmogorov–Arnold Networks (KANs) offer a theoretical foundation for learning general nonlinear transformations. The Kolmogorov–Arnold theorem guarantees that any continuous multivariate function can be represented as a finite composition of sums of continuous univariate functions, affording a universal approximation mechanism. Embedding a lightweight, expressive interaction function into GCF presents a principled way to reintroduce nonlinearity without the overparameterization or optimization instability associated with standard feedforward MLPs.
2. Fourier KAN Interaction Features
The distinctive component of FourierKAN-GCF is the use of a Fourier-parameterized KAN ("Fourier KAN") to realize the nonlinear interaction transform in message passing. In contrast to spline-parameterized KANs (e.g., employing B-splines), the Fourier KAN implements each univariate function as a finite Fourier series:

$$\mathrm{FourierKAN}(\mathbf{x})_j = \sum_{i=1}^{d} \sum_{k=1}^{G} \big( a_{jik} \cos(k x_i) + b_{jik} \sin(k x_i) \big),$$

where $\mathbf{x} = \mathbf{e}_u \odot \mathbf{e}_i$ is the element-wise product of the user and item embeddings, $G$ is the grid size (number of harmonics), and $a_{jik}, b_{jik}$ are trainable parameters. This parameterization achieves universal approximation through trigonometric series while maintaining parameter efficiency and stable optimization behavior. The original MLP-based interaction transform is thus replaced by the Fourier KAN (Xu et al., 2024).
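The Fourier series transform above can be sketched as follows; this is a minimal illustrative implementation, not the authors' code, and the function name `fourier_kan` and the small-scale coefficient initialization are assumptions:

```python
import numpy as np

def fourier_kan(x, a, b):
    """Single Fourier-parameterized KAN layer (sketch).

    x    : (d_in,) input vector (here, the element-wise product e_u * e_i)
    a, b : (d_out, d_in, G) trainable cosine/sine coefficients
    Output coordinate j is a sum of finite Fourier series, one per input
    coordinate: y_j = sum_i sum_{k=1..G} a[j,i,k] cos(k x_i) + b[j,i,k] sin(k x_i)
    """
    G = a.shape[-1]
    k = np.arange(1, G + 1)                  # harmonics 1..G
    kx = np.outer(x, k)                      # (d_in, G) grid of k * x_i
    cos_kx, sin_kx = np.cos(kx), np.sin(kx)
    return np.einsum("jik,ik->j", a, cos_kx) + np.einsum("jik,ik->j", b, sin_kx)

rng = np.random.default_rng(0)
d, G = 8, 4
e_u, e_i = rng.normal(size=d), rng.normal(size=d)
# small initial coefficients keep high-frequency terms from dominating early
a = rng.normal(scale=0.1, size=(d, d, G))
b = rng.normal(scale=0.1, size=(d, d, G))
y = fourier_kan(e_u * e_i, a, b)            # transformed interaction feature
```

Note that the parameter count is $2 G d_{\text{in}} d_{\text{out}}$, so the grid size $G$ directly trades expressivity against model size.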
3. Layer-wise Propagation and Architecture
The message passing procedure in FourierKAN-GCF modifies the NGCF/LightGCN paradigm by:
- Discarding the self-feature kernel ($\mathbf{W}_1$).
- Retaining only the interaction transform, instantiated as a single-layer Fourier KAN.
- Aggregating neighbor and interaction information with symmetric normalization.
- Applying a single nonlinearity per layer (commonly ReLU).
Formally, for user $u$ and item $i$ at layer $l$, the user-side update is

$$\mathbf{e}_u^{(l+1)} = \sigma\!\Big( \sum_{i \in \mathcal{N}_u} \frac{1}{\sqrt{|\mathcal{N}_u|\,|\mathcal{N}_i|}} \big( \mathbf{e}_i^{(l)} + \mathrm{FourierKAN}(\mathbf{e}_u^{(l)} \odot \mathbf{e}_i^{(l)}) \big) \Big),$$

with the symmetric expression for $\mathbf{e}_i^{(l+1)}$, where $\sigma$ denotes the activation function. After $L$ layers, the model concatenates the respective embeddings from each layer to generate the final user and item representations.
4. Regularization Strategies and Optimization
Robustness and generalization are enhanced through message dropout and node dropout:
- Message dropout: Each message in the neighbor aggregation is zeroed with probability $p_m$, introducing stochasticity and preventing co-adaptation of edge-wise signals.
- Node dropout: The embedding vector of a node is masked entirely with probability $p_n$ prior to propagation, which fortifies the model against overfitting and adversarial adjacency perturbations.
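The two dropout schemes can be sketched as follows; the inverted-dropout rescaling by $1/(1-p)$ is a common training-time convention assumed here, not necessarily the paper's exact formulation:

```python
import numpy as np

def message_dropout(messages, p, rng):
    """Zero each edge-wise message independently with probability p.

    messages : (n_edges, d) stacked per-edge messages
    Survivors are rescaled by 1/(1-p) (inverted dropout) so the expected
    aggregate is unchanged at training time.
    """
    mask = rng.random(messages.shape[0]) >= p
    return messages * mask[:, None] / (1.0 - p)

def node_dropout(E, p, rng):
    """Mask entire node embedding vectors with probability p before propagation."""
    keep = rng.random(E.shape[0]) >= p
    return E * keep[:, None]

rng = np.random.default_rng(2)
M = rng.normal(size=(6, 4))
M_dropped = message_dropout(M, 0.5, rng)   # roughly half the messages survive
```

Both masks are resampled every training step and disabled at inference.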
Training utilizes the Bayesian Personalized Ranking (BPR) loss typical of implicit-feedback settings:

$$\mathcal{L}_{\mathrm{BPR}} = -\sum_{(u,i,j)} \ln \sigma\big(\hat{y}_{ui} - \hat{y}_{uj}\big) + \lambda \lVert \Theta \rVert_2^2,$$

where $\hat{y}_{ui}$ is the inner product of the layer-concatenated user and item embeddings and $\Theta$ comprises all embedding and Fourier parameters. Optimization is performed with Adam, and regularization of the Fourier coefficients (via an $L_2$ penalty or normalization) is critical for preventing the dominance of high-frequency components.
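The BPR objective for a single $(u, i, j)$ triple can be written out as below; `bpr_loss`, the choice of `lam`, and the regularized parameter list are illustrative assumptions:

```python
import numpy as np

def bpr_loss(e_user, e_pos, e_neg, params, lam=1e-4):
    """BPR loss for one (user, positive item, negative item) triple (sketch).

    Scores are inner products of the final (layer-concatenated) embeddings;
    the L2 term regularizes whichever parameters are passed in `params`.
    """
    y_ui = e_user @ e_pos                    # score for observed interaction
    y_uj = e_user @ e_neg                    # score for sampled negative
    # -ln sigma(y_ui - y_uj), written via log1p(exp(-x)) for stability
    loss = np.log1p(np.exp(-(y_ui - y_uj)))
    reg = lam * sum(np.sum(p ** 2) for p in params)
    return loss + reg

e_u = np.array([1.0, 0.0])
loss = bpr_loss(e_u, np.array([5.0, 0.0]), np.array([-5.0, 0.0]),
                params=[e_u], lam=0.0)      # well-ranked pair: loss near zero
```

Minimizing this pushes each observed item's score above the sampled negative's, which is exactly the pairwise ranking criterion used by the Recall@K and NDCG@K evaluations.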
The per-layer computational cost scales linearly with the number of interaction edges and the grid size $G$, and is typically below that of MLP-based variants when $G$ is small; memory for the Fourier coefficients likewise grows linearly in $G$.
5. Empirical Evaluation and Comparative Results
Empirical studies on real-world interaction datasets with high sparsity (MOOC, Amazon Video Games), evaluated with standard metrics (Recall@K, NDCG@K), demonstrate that FourierKAN-GCF outperforms strong baselines, including BPR-MF, NGCF, LightGCN, UltraGCN, and KAN-GCF. On both MOOC and Amazon Games, FourierKAN-GCF achieves the highest Recall@20 and NDCG@20. Ablation experiments confirm the benefit of both message and node dropout (+1–2% Recall@20) and indicate a consistent advantage for the Fourier-based parameterization over splines (Xu et al., 2024).
| Model | MOOC R@20 | MOOC N@20 | Games R@20 | Games N@20 |
|---|---|---|---|---|
| BPR-MF | 0.3353 | 0.1898 | 0.0369 | 0.0183 |
| NGCF | 0.3361 | 0.1894 | 0.0379 | 0.0196 |
| LightGCN | 0.3307 | 0.1811 | 0.0447 | 0.0227 |
| UltraGCN | 0.3194 | 0.1962 | 0.0459 | 0.0230 |
| FourierKAN-GCF | 0.3564 | 0.2147 | 0.0473 | 0.0252 |
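The Recall@20 and NDCG@20 values in the table above are computed per user and averaged; a minimal per-user sketch (the function name and argument layout are assumptions):

```python
import numpy as np

def recall_ndcg_at_k(ranked_items, relevant, k=20):
    """Recall@K and NDCG@K for one user (sketch).

    ranked_items : item ids sorted by predicted score, descending
    relevant     : set of held-out ground-truth items for this user
    """
    topk = ranked_items[:k]
    hits = [1.0 if it in relevant else 0.0 for it in topk]
    recall = sum(hits) / max(len(relevant), 1)
    # DCG discounts each hit by log2(rank + 2); IDCG is the ideal ordering
    dcg = sum(h / np.log2(r + 2) for r, h in enumerate(hits))
    idcg = sum(1.0 / np.log2(r + 2) for r in range(min(len(relevant), k)))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return recall, ndcg

recall, ndcg = recall_ndcg_at_k([0, 1, 2, 3], {0, 1})   # perfect top ranking
```

Averaging these two quantities over all test users yields the R@20 and N@20 columns reported above.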
Performance varies smoothly and robustly with the model's hyperparameters, and the grid size $G$ controls expressivity (with a risk of overfitting if it is oversized on sparse graphs).
6. Connections to Spectral and Filter-Based GCF
From a graph signal processing perspective, collaborative filtering on bipartite user–item graphs is interpreted as spectral filtering, where various methods correspond to different spectral kernels: diffusion transforms, low-pass filters, or polynomial approximations. Linear, closed-form baselines such as GF-CF and SpectralCF use spectral convolution, typically employing scalar filters of the Laplacian eigenvalues without spatial localization or nonlinear interaction modeling (Shen et al., 2021, Alshareet et al., 2023).
FourierKAN-GCF, in contrast, reintroduces nonlinearity at the interaction level via a universal Fourier-based feature transform, while the propagation scheme remains linear. A plausible implication is that FourierKAN-GCF bridges the divide between the interpretability/efficiency of spectral convolutional CF models and the interaction expressivity of neural message passing architectures (Xu et al., 2024).
7. Extensions and Practical Implications
Practical recommendations include initializing Fourier coefficients with small random perturbations, applying normalization or regularization to avoid high-frequency explosion, and preferentially reducing the grid size $G$ on extremely sparse graphs. The current instantiation uses a single Fourier layer per message; hierarchical or multi-layered Fourier KANs, as well as adaptive frequency selection strategies, remain promising future directions (Xu et al., 2024).
The FourierKAN-GCF paradigm can be extended to multi-channel or anisotropic filtering by stacking or summing feature transforms across various Laplacians (e.g., user–item, knowledge graphs, metadata co-occurrence), each with a domain-specific spectral kernel. Optimization of spectral coefficients can be conducted end-to-end or in a modular fashion, providing a pathway to interpretable, parameter-efficient hybrid recommendation systems (Shen et al., 2021).