xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems (1803.05170v3)

Published 14 Mar 2018 in cs.LG and cs.IR

Abstract: Combinatorial features are essential for the success of many commercial models. Manually crafting these features usually comes with high cost due to the variety, volume and velocity of raw data in web-scale systems. Factorization based models, which measure interactions in terms of vector product, can learn patterns of combinatorial features automatically and generalize to unseen features as well. With the great success of deep neural networks (DNNs) in various fields, recently researchers have proposed several DNN-based factorization models to learn both low- and high-order feature interactions. Despite the powerful ability of learning an arbitrary function from data, plain DNNs generate feature interactions implicitly and at the bit-wise level. In this paper, we propose a novel Compressed Interaction Network (CIN), which aims to generate feature interactions in an explicit fashion and at the vector-wise level. We show that the CIN shares some functionalities with convolutional neural networks (CNNs) and recurrent neural networks (RNNs). We further combine a CIN and a classical DNN into one unified model, and name this new model eXtreme Deep Factorization Machine (xDeepFM). On one hand, the xDeepFM is able to learn certain bounded-degree feature interactions explicitly; on the other hand, it can learn arbitrary low- and high-order feature interactions implicitly. We conduct comprehensive experiments on three real-world datasets. Our results demonstrate that xDeepFM outperforms state-of-the-art models. We have released the source code of xDeepFM at https://github.com/Leavingseason/xDeepFM.

Summary

The paper introduces xDeepFM, a hybrid model that combines explicit and implicit feature interactions to enhance recommender system performance.
It employs a novel Compressed Interaction Network (CIN) to explicitly learn high-order vector-wise feature interactions, reducing the need for manual feature engineering.
Empirical results on three datasets (Criteo, Dianping, and Bing News) show that xDeepFM achieves higher AUC and lower Logloss than the baseline models it is compared against.

Combining Explicit and Implicit Feature Interactions for Recommender Systems

The paper "xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems" introduces a novel approach for learning feature interactions in recommender systems. The proposed model, eXtreme Deep Factorization Machine (xDeepFM), learns high-order feature interactions both explicitly and implicitly, which reduces the need for manual feature engineering. This essay provides an analytical overview of the contributions, methodology, and implications of the xDeepFM model.

High-Order Feature Interactions in Recommender Systems

Effective recommender systems often depend on high-order feature interactions, that is, combinations of several raw features at once, and crafting these by hand is costly at web scale. Traditional methods such as Factorization Machines (FM) learn second-order interactions automatically but become impractical for higher orders. Deep neural networks (DNNs) have been adapted in various models to capture high-order interactions automatically. However, they learn them implicitly and at the bit-wise level: individual elements of the embedding vectors interact with one another, including elements that belong to the same field embedding, so it is hard to say which interactions the network has actually learned or up to what order.
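To make the notion of a vector-wise second-order interaction concrete, here is a minimal sketch of the pairwise term of a standard Factorization Machine, in which each feature pair (i, j) is weighted by the inner product of two latent vectors. The function names and the toy input are illustrative assumptions and are not taken from the paper or its released code.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fm_second_order(x, v):
    """Pairwise FM term: sum over i < j of <v[i], v[j]> * x[i] * x[j].

    x: list of feature values
    v: list of latent vectors, one per feature
    """
    return sum(
        dot(v[i], v[j]) * x[i] * x[j]
        for i, j in combinations(range(len(x)), 2)
    )


# Hypothetical toy input: three features with 2-dimensional latent vectors.
x = [1.0, 0.0, 1.0]
v = [[0.5, 1.0], [2.0, -1.0], [1.0, 0.5]]
print(fm_second_order(x, v))
```

Only the pair of active features (0 and 2) contributes here, and it does so through whole latent vectors rather than through individual vector elements. That is the sense in which FM-style interactions are vector-wise and limited to second order.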

Compressed Interaction Network (CIN)

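The cornerstone of xDeepFM is the Compressed Interaction Network (CIN). Unlike the fully-connected layers of a DNN, CIN models feature interactions explicitly and at the vector-wise level, so that whole field embeddings, not their individual elements, are the units that interact. The paper relates the structure to convolutional and recurrent neural networks: each layer is computed from the previous layer's output together with the original field embeddings, so the degree of the interactions increases by one with every layer. Concretely, a layer forms products between every feature map of the previous layer and every field embedding, then compresses them with learned weights into a fixed number of new feature maps. The highest interaction degree is therefore bounded by the network depth, and the compression step keeps the layer width from growing exponentially with the degree.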
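The layer computation described above can be sketched in plain Python as follows. This is a minimal illustration assuming the layer definition X^k[h] = sum over i, j of W[h][i][j] * (X^(k-1)[i] * X^0[j]), with an element-wise product of the two vectors, followed by sum pooling along the embedding dimension. The function names, weights, and toy embeddings are hypothetical, and the released implementation is organised differently for efficiency.

```python
def cin_layer(x_prev, x0, w):
    """One CIN layer (unoptimised sketch).

    x_prev: H_prev x D matrix, the previous layer's feature maps
    x0:     m x D matrix, the original field embeddings
    w:      list of H_next filters, each an H_prev x m weight matrix
    Returns an H_next x D matrix of new feature maps.
    """
    d = len(x0[0])
    out = []
    for w_h in w:
        row = [0.0] * d
        for i, xi in enumerate(x_prev):
            for j, xj in enumerate(x0):
                for k in range(d):
                    # One scalar weight scales the element-wise product of
                    # two whole vectors: the interaction is vector-wise.
                    row[k] += w_h[i][j] * xi[k] * xj[k]
        out.append(row)
    return out


def sum_pool(x):
    """Sum each feature map over the embedding dimension."""
    return [sum(row) for row in x]


# Hypothetical toy input: m = 2 fields, embedding size D = 2.
x0 = [[1.0, 2.0], [3.0, 4.0]]

# Layer 1 has two filters, so its outputs are degree-2 interactions.
w1 = [
    [[1.0, 0.0], [0.0, 1.0]],  # field 0 with itself plus field 1 with itself
    [[0.0, 1.0], [0.0, 0.0]],  # field 0 with field 1
]
x1 = cin_layer(x0, x0, w1)

# Layer 2 has one filter; combining x1 with x0 again gives degree 3.
w2 = [[[1.0, 0.0], [0.0, 0.0]]]
x2 = cin_layer(x1, x0, w2)

# The pooled vectors of all layers are concatenated as the CIN output.
cin_output = sum_pool(x1) + sum_pool(x2)
print(x1, x2, cin_output)
```

Each layer's width is set by the number of filters, not by the number of possible feature combinations, which is the compression the name refers to.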

The efficacy of CIN is evidenced by its performance on three real-world datasets: Criteo, Dianping, and Bing News. Across these datasets, CIN alone outperforms several traditional and neural network-based models, showcasing its capacity to learn meaningful high-order feature interactions explicitly.

xDeepFM Model

xDeepFM combines CIN with a DNN component to benefit from both explicit and implicit feature interaction learning. The CIN contributes bounded-degree, vector-wise interactions that are learned explicitly, while the DNN contributes bit-wise interactions of arbitrary order that are learned implicitly. The two components complement each other, which lets the combined model capture a wider range of interaction patterns than either component alone.
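One way to picture the combination is as a single output unit that adds up the contributions of a linear part, the DNN, and the CIN before a sigmoid. The sketch below assumes that form, sigmoid(w_linear . a + w_dnn . x_dnn + w_cin . p + b); the argument names and toy numbers are illustrative assumptions, not values or identifiers from the paper's code.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def xdeepfm_output(raw, dnn_hidden, cin_pooled,
                   w_linear, w_dnn, w_cin, bias):
    """Combine the three parts into one click probability.

    raw:        raw input features (linear part)
    dnn_hidden: last hidden layer of the DNN (implicit interactions)
    cin_pooled: concatenated sum-pooled CIN outputs (explicit interactions)
    """
    logit = (dot(w_linear, raw)
             + dot(w_dnn, dnn_hidden)
             + dot(w_cin, cin_pooled)
             + bias)
    return sigmoid(logit)


# Hypothetical toy values for a single example.
y_hat = xdeepfm_output(
    raw=[1.0, 0.0], dnn_hidden=[0.5], cin_pooled=[1.0, 2.0],
    w_linear=[0.2, 0.1], w_dnn=[0.4], w_cin=[0.1, -0.25], bias=0.0,
)
print(y_hat)
```

Because the parts meet only at the final logit, they can be trained jointly with a single loss, and dropping one term yields a model built from the remaining parts.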

Empirical Evaluation

Comprehensive experiments were conducted on the three datasets mentioned earlier. xDeepFM consistently achieved higher AUC and lower Logloss than the baseline models, which include traditional FMs, Wide & Deep, and other hybrid architectures such as DeepFM and DCN. These results held across various settings of network depth and neurons per layer, indicating that the method is stable with respect to these hyperparameters.
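For readers unfamiliar with the two metrics, the following sketch computes them from scratch using their standard definitions: Logloss as the mean negative log-likelihood of binary labels, and AUC as the fraction of positive/negative pairs ranked correctly, with ties counted as half. The labels and scores are made-up toy values, not results from the paper.

```python
import math


def logloss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


def auc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Quadratic-time pairwise version, fine for illustration only.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions for four impressions.
labels = [1, 0, 1, 0]
scores = [0.9, 0.3, 0.4, 0.6]
print(auc(labels, scores), logloss(labels, scores))
```

AUC depends only on how the examples are ranked, while Logloss also penalises poorly calibrated probabilities, which is why click-through-rate papers commonly report both.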

Implications and Future Work

The successful integration of explicit and implicit learning mechanisms in xDeepFM has several implications:

Reduction in Manual Feature Engineering: By effectively automating the learning of complex feature interactions, xDeepFM significantly reduces the need for labor-intensive manual feature engineering, accelerating model deployment in real-world systems.
Scalability and Efficiency: Despite the high computational requirements of the CIN component, the model's design ensures that it remains computationally feasible, which is critical for web-scale applications.

Further development can explore enhancing the interaction learning through advanced mechanisms like attentive activations and distributed training frameworks to handle even larger datasets and more complex interactions efficiently.

Conclusion

The xDeepFM model represents a meaningful advancement in the field of recommender systems by synergizing explicit and implicit feature interaction learning. Its innovative use of the Compressed Interaction Network (CIN) allows for scalable and interpretable high-order feature interactions, leading to significant improvements in recommendation accuracy. Future research directions will likely focus on optimizing computational efficiency and further refining the interaction learning capabilities to keep pace with the growing complexity of real-world data environments.
