- The paper details Airbnb's successful transition from gradient boosted decision trees to deep neural networks for search ranking, outlining the iterative model evolution process.
- A deep neural network trained on roughly 1.7 billion pairs significantly improved normalized discounted cumulative gain and booking metrics, closing the generalization gap that smaller, earlier models had exhibited.
- The transition underscored the importance of feature normalization and embedding high-cardinality data, alongside system engineering for production scalability.
Application of Deep Learning in Enhancing Airbnb's Search Ranking
This essay provides a detailed analysis of the paper "Applying Deep Learning To Airbnb Search," authored by a team from Airbnb Inc., which describes the transition of Airbnb's search ranking models from gradient boosted decision trees (GBDTs) to neural networks (NNs). The transition aimed to move past the plateau GBDTs had reached in booking improvements and to turn deep learning for search ranking into a practical, scalable production solution.
Overview of Model Evolution
The paper traces a development path in which a succession of neural network models gradually replaced the GBDT model. It started with a simple NN with minimal architecture which, although it did not improve bookings, validated that the NN pipeline was ready for production. The journey advanced through multiple iterations, including a Lambdarank NN that directly optimized normalized discounted cumulative gain (NDCG), a hybrid Decision Tree/Factorization Machine NN, and ultimately a deep neural network (DNN) trained on roughly 1.7 billion pairs, which closed the generalization gap and outperformed the previous models on NDCG and booking metrics.
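To make the pairwise ranking setup concrete, the sketch below shows a minimal Lambdarank-style model: a shared scoring network scores a booked and a not-booked listing from the same search, and the pairwise loss is weighted by the NDCG change from swapping the pair. The layer sizes, feature count, and weighting are illustrative assumptions, not the paper's exact configuration.

```python
import tensorflow as tf

NUM_FEATURES = 32  # hypothetical number of per-listing features

def build_scorer():
    # Shared network that assigns a relevance score to a single listing.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),  # unbounded relevance score
    ])

scorer = build_scorer()

booked = tf.keras.Input(shape=(NUM_FEATURES,), name="booked_listing")
not_booked = tf.keras.Input(shape=(NUM_FEATURES,), name="not_booked_listing")
score_diff = scorer(booked) - scorer(not_booked)
model = tf.keras.Model([booked, not_booked], score_diff)

def lambdarank_style_loss(delta_ndcg, score_diff):
    # Pairwise logistic loss scaled by |ΔNDCG| of swapping the two listings,
    # so pairs that matter more for ranking quality get larger gradients.
    return delta_ndcg * tf.math.log1p(tf.exp(-score_diff))

model.compile(optimizer="adam", loss=lambdarank_style_loss)
```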
Key Technical Highlights and Failed Attempts
Two failed attempts are particularly noteworthy. Integrating listing IDs as an embedded feature failed because the data per listing was too sparse, leading to overfitting. Similarly, a multi-task learning approach that jointly predicted bookings and long views proved ineffective: the component of long views orthogonal to bookings dominated, so the shared model improved long views without improving bookings.
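As an illustration of the multi-task idea only (the layer sizes, feature count, and loss weights below are assumptions, not the paper's architecture), a shared trunk can feed two output heads, one predicting a booking and one predicting a long view; it is exactly this coupling that failed to translate long-view gains into bookings.

```python
import tensorflow as tf

NUM_FEATURES = 32  # hypothetical feature count

inputs = tf.keras.Input(shape=(NUM_FEATURES,))
# Shared hidden layers learn one representation for both tasks.
shared = tf.keras.layers.Dense(64, activation="relu")(inputs)
shared = tf.keras.layers.Dense(32, activation="relu")(shared)

# Separate heads: one for booking probability, one for long-view probability.
booking_head = tf.keras.layers.Dense(1, activation="sigmoid", name="booking")(shared)
long_view_head = tf.keras.layers.Dense(1, activation="sigmoid", name="long_view")(shared)

model = tf.keras.Model(inputs, [booking_head, long_view_head])
model.compile(
    optimizer="adam",
    loss={"booking": "binary_crossentropy", "long_view": "binary_crossentropy"},
    # Relative loss weights are a tuning knob; these values are arbitrary.
    loss_weights={"booking": 1.0, "long_view": 0.5},
)
```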
Feature Engineering
The paper takes an unconventional approach to feature engineering, emphasizing feature normalization and distribution smoothness over elaborate hand-crafted transformations. This matters because neural networks train effectively only when inputs fall within small, well-behaved numeric ranges. High-cardinality categorical features, such as hashed location-cell identifiers, are mapped to learned embeddings, allowing the NN to learn location preferences directly from the data. This technique provided a simple yet powerful alternative to convoluted feature engineering pipelines, simplifying computation and improving predictive performance.
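The sketch below illustrates both ideas under assumed names and sizes (the bucket count, embedding width, and feature names are not from the paper): a log transform pulls a heavy-tailed numeric feature such as price toward a smoother distribution, and a hashed high-cardinality location identifier is mapped to a learned embedding.

```python
import tensorflow as tf

NUM_LOCATION_BUCKETS = 100_000  # assumed hash space for location-cell ids
EMBEDDING_DIM = 16              # assumed embedding width

# Numeric feature: log-transform a skewed value so the input distribution
# is smoother and roughly confined to a small range.
price = tf.keras.Input(shape=(1,), name="price")
log_price = tf.keras.layers.Lambda(lambda x: tf.math.log1p(x))(price)

# Categorical feature: hash a high-cardinality location-cell string into a
# fixed number of buckets and learn a dense embedding per bucket.
location_cell = tf.keras.Input(shape=(1,), dtype=tf.string, name="location_cell")
hashed = tf.keras.layers.Hashing(num_bins=NUM_LOCATION_BUCKETS)(location_cell)
location_embedding = tf.keras.layers.Embedding(
    NUM_LOCATION_BUCKETS, EMBEDDING_DIM)(hashed)
location_embedding = tf.keras.layers.Flatten()(location_embedding)

# Combine both feature types and score the listing.
features = tf.keras.layers.Concatenate()([log_price, location_embedding])
hidden = tf.keras.layers.Dense(32, activation="relu")(features)
output = tf.keras.layers.Dense(1)(hidden)
model = tf.keras.Model([price, location_cell], output)
```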
System Engineering and Optimization
Adopting Protobufs for training data and refactoring static features significantly improved computational efficiency and sped up training. A custom Java library was developed to accelerate model scoring in production, addressing serving-latency requirements.
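As one hedged illustration of this kind of pipeline change (the feature names below are made up and this is not Airbnb's schema), per-query training rows can be serialized as tf.train.Example protocol buffers and written to TFRecord files, avoiding repeated parsing of text formats and making it easy to join in static, per-listing features once.

```python
import tensorflow as tf

def make_example(price, long_view_count, booked):
    # Serialize one training row as a tf.train.Example protocol buffer.
    feature = {
        "price": tf.train.Feature(float_list=tf.train.FloatList(value=[price])),
        "long_view_count": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[long_view_count])),
        "booked": tf.train.Feature(int64_list=tf.train.Int64List(value=[booked])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

# Write a row to a TFRecord file, then read it back as parsed tensors.
with tf.io.TFRecordWriter("train.tfrecord") as writer:
    writer.write(make_example(120.0, 3, 1).SerializeToString())

feature_spec = {
    "price": tf.io.FixedLenFeature([], tf.float32),
    "long_view_count": tf.io.FixedLenFeature([], tf.int64),
    "booked": tf.io.FixedLenFeature([], tf.int64),
}
dataset = tf.data.TFRecordDataset("train.tfrecord").map(
    lambda record: tf.io.parse_single_example(record, feature_spec))
```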
Hyperparameter Exploration
Variations in hyperparameters such as dropout rate, learning rate, and batch size were explored but yielded only modest gains; the paper concludes that an optimizer from the Adam family with largely default settings met the learning-rate and initialization needs of Airbnb's application.
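A minimal sketch of what this looks like in practice follows; the dropout rate, learning rate, layer sizes, and batch size are placeholders rather than the values Airbnb settled on.

```python
import tensorflow as tf

# Hedged sketch: fixed, near-default hyperparameters instead of an elaborate search.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)  # Adam with default-style settings

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.2),  # dropout rate is one of the knobs explored
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=optimizer, loss="binary_crossentropy")
# model.fit(train_x, train_y, batch_size=200, epochs=5)  # fixed batch size (placeholder)
```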
Feature Importance and Interpretability Challenges
The transition to neural networks made the model harder to interpret, particularly with respect to feature importance. The paper describes several failed attempts based on permutation tests and ablation studies, but offers a promising direction with "TopBot analysis," which contrasts the distribution of each feature across listings ranked at the top versus the bottom of search results to provide insight into what the model has learned.
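A simple sketch of the idea behind such a contrast follows, assuming a pandas DataFrame of ranked results with hypothetical columns 'search_id', 'rank' (1 = best), and a numeric feature such as 'price'; it is an illustration of the general approach, not the paper's tooling.

```python
import pandas as pd

def top_bot_contrast(results: pd.DataFrame, feature: str, k: int = 10) -> pd.DataFrame:
    """Contrast a feature's distribution between top- and bottom-ranked listings.

    Assumes hypothetical columns: 'search_id', 'rank' (1 = best), and `feature`.
    """
    # Listings shown near the top of each search's results.
    top = results[results["rank"] <= k]
    # Listings near the bottom of each search's results.
    bottom = results.groupby("search_id", group_keys=False).apply(
        lambda g: g.nlargest(k, "rank"))
    # Side-by-side summary statistics for the feature in both groups.
    return pd.DataFrame({
        "top_k": top[feature].describe(),
        "bottom_k": bottom[feature].describe(),
    })

# Example: a large gap between the two price distributions would suggest the
# model leans heavily on price when ranking these searches.
# print(top_bot_contrast(search_results, "price"))
```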
Retrospective
The retrospective section captures the essence of the journey: from an initial misconception of deep learning as a drop-in replacement for GBDTs to an appreciation of how it broadened the problems the team could tackle. Integrating deep learning reflects a strategic shift that moved the team beyond manual feature engineering toward deeper, model-driven analysis.
Implications and Future Directions
This paper contributes both practically and theoretically to machine learning for e-commerce. By refining its search ranking models with deep learning, Airbnb sets a benchmark and marks out paths for similar marketplaces. The paper's insights and methodologies suggest directions for future work, not only in refined ranking algorithms but also in the system-level changes required to scale deep learning efficiently in production.
In conclusion, Airbnb's application of deep learning offers a methodological blueprint for evolving from traditional models to modern neural architectures, advancing search ranking and booking systems within digital commerce.