Tabular Data: Deep Learning is Not All You Need

Published 6 Jun 2021 in cs.LG (arXiv:2106.03253v2)

Abstract: A key element in solving real-life data science problems is selecting the types of models to use. Tree ensemble models (such as XGBoost) are usually recommended for classification and regression problems with tabular data. However, several deep learning models for tabular data have recently been proposed, claiming to outperform XGBoost for some use cases. This paper explores whether these deep models should be a recommended option for tabular data by rigorously comparing the new deep models to XGBoost on various datasets. In addition to systematically comparing their performance, we consider the tuning and computation they require. Our study shows that XGBoost outperforms these deep models across the datasets, including the datasets used in the papers that proposed the deep models. We also demonstrate that XGBoost requires much less tuning. On the positive side, we show that an ensemble of deep models and XGBoost performs better on these datasets than XGBoost alone.

Citations (1,002)

Summary

  • The paper rigorously demonstrates that traditional tree ensembles, particularly XGBoost, outperform deep learning models on most tabular datasets.
  • It employs standardized Bayesian hyperparameter tuning across 11 diverse datasets to ensure fair and reproducible comparisons.
  • The findings emphasize XGBoost's efficiency and suggest a need for developing deep learning architectures tailored to the complexities of tabular data.

Tabular Data: Deep Learning is Not All You Need

The paper "Tabular Data: Deep Learning is Not All You Need," authored by Ravid Shwartz-Ziv and Amitai Armon, provides a rigorous examination of the efficacy of deep learning models compared to traditional tree ensemble methods, particularly XGBoost, for tabular data. This detailed investigation is crucial since tabular data is ubiquitous in real-world applications such as finance, healthcare, and manufacturing.

Background

Deep learning models have significantly advanced multiple domains, including image, text, and speech processing, through canonical architectures like CNNs, RNNs, and Transformers. However, tabular data, characterized by its format of rows (samples) and columns (features), remains predominantly managed by traditional machine learning algorithms like gradient-boosted decision trees (GBDT), specifically XGBoost, LightGBM, and CatBoost. Tree ensembles are favored due to their superior performance, transparency, and ease of tuning.

Despite the rise of several deep learning architectures claiming superiority over GBDT in tabular data tasks, the paper systematically challenges these claims. It scrutinizes the performance of several recent deep models, such as TabNet, NODE, DNF-Net, and 1D-CNN, against that of XGBoost across diverse datasets, considering accuracy, tuning time, and computational resources.

Experimental Setup

The study evaluated the models on a diverse set of 11 tabular datasets spanning a range of feature dimensions and sample sizes. The datasets were drawn from sources such as OpenML, the UCI Machine Learning Repository, and Kaggle, and cover both classification and regression tasks. The authors used Bayesian optimization via the HyperOpt library for hyperparameter tuning, ensuring a fair comparison by following the training and preprocessing steps outlined in each deep model's original paper.

Results Overview

The key findings are summarized as follows:

  • Performance Comparison: XGBoost consistently outperformed the deep models on most datasets, including those not originally used to design these deep models. For 8 out of the 11 datasets, XGBoost showed superior performance with statistically significant results (p < 0.005).
  • Generalization: Each deep model performed best primarily on the datasets showcased in its original paper, indicating a possible selection bias. When tested on previously unseen datasets, the performance declined notably, undermining the claimed generalizability.
  • Computational Efficiency: XGBoost required significantly fewer iterations to reach optimal performance during hyperparameter tuning, emphasizing its robustness and efficiency. Deep models demanded extensive tuning time and computational resources, with a relatively slower convergence rate.
  • Ensemble Methods: Combining deep models and XGBoost into an ensemble improved overall predictive performance. Specifically, an ensemble of XGBoost plus deep models outperformed individual models and a pure deep ensemble, yielding the lowest average relative performance deterioration (2.32%).

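The ensemble result and the deterioration metric above can be illustrated with a minimal sketch: combine member predictions (here a simple unweighted average; the paper also considers weighting members by validation performance) and score each model by its relative performance deterioration versus the best model on each dataset. All numbers below are made up for illustration.

```python
import numpy as np

# Toy predicted probabilities from three models on five samples (made-up values)
xgb_pred = np.array([0.9, 0.2, 0.7, 0.4, 0.8])
tabnet_pred = np.array([0.8, 0.3, 0.6, 0.5, 0.9])
node_pred = np.array([0.7, 0.1, 0.8, 0.3, 0.7])

# Unweighted average ensemble: one common combination rule
ensemble_pred = np.mean([xgb_pred, tabnet_pred, node_pred], axis=0)

def relative_deterioration(loss, best_loss):
    """Relative performance deterioration of a model versus the best
    model on the same dataset; 0.0 means this model was the best."""
    return (loss - best_loss) / best_loss

# Per-dataset losses for one model vs. the per-dataset best (made-up values)
losses = np.array([0.21, 0.35, 0.10])
best = np.array([0.20, 0.34, 0.10])
avg_det = np.mean([relative_deterioration(l, b) for l, b in zip(losses, best)])
print(avg_det)  # averaged across datasets, analogous to the paper's 2.32% figure
```

Averaging this metric across all datasets is what yields a single number per model, which is how the XGBoost-plus-deep ensemble's 2.32% average deterioration was reported.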
Discussion and Implications

The paper builds a cohesive narrative that, despite the advances in deep learning, traditional tree-based methods, particularly XGBoost, remain highly competitive for tabular data. The results underscore the limitations of deep learning models in handling the heterogeneous features and missing values typical of tabular data, both of which tree ensembles manage well.

The implications are twofold:

  1. Practical: For practitioners in data science and applied machine learning, the findings suggest prioritizing tree ensembles like XGBoost for tabular data tasks, given their superior performance, efficiency, and ease of tuning.
  2. Theoretical: The study prompts the need for the deep learning research community to develop new architectures or optimization strategies that can handle the idiosyncrasies of tabular data more effectively. It also calls for standardized benchmarks for evaluating tabular data models to avoid biased performance claims.

Future Developments

Future research should focus on:

  • Developing deep models specifically tailored to address the structural and domain-specific challenges of tabular data.
  • Creating standardized benchmark datasets and evaluation protocols to facilitate a fair and reproducible comparison of model performance.
  • Investigating hybrid models that can leverage the strengths of both tree ensembles and deep learning for improved accuracy and interpretability.

In conclusion, while deep learning models hold promise, this paper highlights that for now, tree ensemble methods like XGBoost are indispensable for tackling tabular data due to their robustness, performance, and practical utility. This study serves as a crucial checkpoint, guiding the future trajectory of research and application in machine learning for tabular datasets.
