- The paper introduces novel DNN models that capture complex interactions in high-dimensional categorical features for improved user response prediction.
- It leverages factorisation machines and sampling-based pre-training with RBM/DAE to transform sparse data into meaningful dense representations.
- Experimental results on the iPinYou dataset demonstrate superior CTR prediction performance compared to traditional linear models.
Deep Learning over Multi-field Categorical Data: A Case Study on User Response Prediction
This paper addresses the challenge of predicting user responses, a critical task in web applications such as online advertising, where estimating the likelihood that a user will engage with content is paramount. Traditional approaches rely on linear models, which are limited in their ability to capture the complex feature interactions inherent in web data, most of which is categorical and high-dimensional. The paper proposes two deep neural network (DNN) models that improve prediction quality by efficiently learning interactions within such categorical data.
Models and Methodology
The authors introduce Factorisation Machine supported Neural Networks (FNN) and Sampling-based Neural Networks (SNN). The FNN model uses a factorisation machine to map sparse binary features into dense continuous embeddings, capturing feature interactions in a low-rank space and thereby easing the learning task of the deep network built on top. The SNN model, in contrast, is pre-trained with a sampling-based Restricted Boltzmann Machine (RBM) or Denoising Autoencoder (DAE), using a field-wise negative sampling strategy to handle high-dimensional, multi-field categorical features.
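A minimal sketch of the FNN construction, assuming a PyTorch-style implementation (the class name, argument names, and hidden layer widths are illustrative, not from the paper): an FM is trained first, and its per-feature bias terms and latent vectors are copied into the bottom embedding layer of the DNN, which is then fine-tuned end-to-end.

```python
import torch
import torch.nn as nn

class FNN(nn.Module):
    """Sketch of a Factorisation-Machine-supported Neural Network.

    Assumes one active (one-hot) feature per categorical field and that
    `fm_bias` (n_features,) and `fm_latent` (n_features, k) were learned by a
    factorisation machine on the same feature space -- illustrative inputs,
    not artefacts provided by the paper.
    """

    def __init__(self, fm_bias, fm_latent, n_fields, hidden=(200, 300, 100)):
        super().__init__()
        n_features, k = fm_latent.shape
        # Embedding row i holds [w_i, v_i]: the FM bias term and latent vector
        # of feature i, used here purely as an initialisation.
        self.embed = nn.Embedding(n_features, k + 1)
        with torch.no_grad():
            self.embed.weight[:, 0] = fm_bias
            self.embed.weight[:, 1:] = fm_latent
        # Fully connected layers over the concatenated field embeddings;
        # the hidden widths follow an illustrative diamond shape (wider middle).
        dims = [n_fields * (k + 1), *hidden, 1]
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(d_in, d_out), nn.Tanh()]
        layers[-1] = nn.Sigmoid()  # final layer outputs a click probability
        self.mlp = nn.Sequential(*layers)

    def forward(self, feature_ids):
        # feature_ids: (batch, n_fields) index of the active feature per field.
        x = self.embed(feature_ids).flatten(start_dim=1)
        return self.mlp(x).squeeze(-1)
```

Because each field has exactly one active feature, the bottom layer reduces to a per-field embedding lookup, so its cost grows with the number of fields rather than with the full one-hot dimensionality.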
Key Techniques
- Factorisation Machines (FM): Used in the FNN model to initialise the feature embeddings, the FM captures pairwise interactions through inner products of latent vectors, letting the network start from meaningful patterns in the sparse categorical data (the standard second-order form is given after this list).
- Sampling-based Pre-training: The SNN model is pre-trained by selecting, for each categorical field, the observed unit plus a small number of sampled negative units, which keeps pre-training computationally tractable while still providing an effective weight initialisation (a sketch follows this list).
- Network Architecture: Both models use a diamond-shaped architecture (hidden layers that widen and then narrow), which the authors found empirically to generalise better than architectures with monotonically increasing or decreasing layer widths.
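For reference, a standard second-order factorisation machine (with a sigmoid for binary click prediction) scores a one-hot instance $\mathbf{x}$ with $N$ binary features as

$$\hat{y}(\mathbf{x}) = \sigma\!\left(w_0 + \sum_{i=1}^{N} w_i x_i + \sum_{i=1}^{N}\sum_{j=i+1}^{N} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j\right),$$

where each feature $i$ has a scalar weight $w_i$ and a $K$-dimensional latent vector $\mathbf{v}_i$; it is these $w_i$ and $\mathbf{v}_i$ that the FNN reuses to initialise its embedding layer.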
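A minimal sketch of the field-wise negative sampling idea used in SNN pre-training, assuming each categorical field occupies a contiguous index range of the one-hot input; the helper name and its arguments are hypothetical:

```python
import torch

def sample_field_units(active_ids, field_ranges, m=2):
    """Pick, per field, the observed unit plus m randomly sampled negative units.

    active_ids  : (batch, n_fields) index of the active feature in each field
    field_ranges: list of (start, end) index ranges, one per categorical field
    Returns indices of shape (batch, n_fields * (m + 1)) into the sparse input.
    Rare collisions between a sampled negative and the positive unit are
    ignored here for brevity.
    """
    batch, n_fields = active_ids.shape
    cols = []
    for f, (start, end) in enumerate(field_ranges):
        pos = active_ids[:, f:f + 1]                 # the observed (positive) unit
        neg = torch.randint(start, end, (batch, m))  # m sampled negative units
        cols.append(torch.cat([pos, neg], dim=1))
    return torch.cat(cols, dim=1)
```

The RBM (via contrastive divergence) or DAE (via reconstruction) is then trained only over these sampled units, so the per-instance pre-training cost scales with the number of fields rather than with the full feature dimensionality.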
Experimental Results
In experiments on the iPinYou dataset, the proposed models outperform traditional baselines such as Logistic Regression and standalone Factorisation Machines at click-through rate (CTR) prediction. Notably, the FNN model consistently outperformed the other models, indicating that FM embeddings robustly capture latent feature interactions. The SNN variants with RBM and DAE pre-training were also competitive, suggesting that unsupervised pre-training effectively aids model learning.
Implications and Future Directions
The proposed DNN models represent a significant step forward in handling complex, high-dimensional categorical data in web applications. By embedding features in a latent space and stacking deep layers on top, these models can learn and generalise complex data patterns more effectively than shallow models. Directions for future work include momentum-based optimisation methods that better handle curvature during training, and extending partial connectivity to layers beyond the first in order to reduce complexity and improve robustness.
Overall, this paper contributes novel insights into applying deep learning techniques for categorical feature spaces, setting a foundation for further exploration in AI-driven web applications. The methodologies presented have implications for improving ad targeting, personalized recommendations, and beyond, where understanding intricate patterns in user interactions is crucial.