MambaTab: A Plug-and-Play Model for Learning Tabular Data (2401.08867v2)

Published 16 Jan 2024 in cs.LG

Abstract: Despite the prevalence of images and texts in machine learning, tabular data remains widely used across various domains. Existing deep learning models, such as convolutional neural networks and transformers, perform well; however, they demand extensive preprocessing and tuning, limiting accessibility and scalability. This work introduces an innovative approach based on a structured state-space model (SSM), MambaTab, for tabular data. SSMs have strong capabilities for efficiently extracting effective representations from data with long-range dependencies. MambaTab leverages Mamba, an emerging SSM variant, for end-to-end supervised learning on tables. Compared to state-of-the-art baselines, MambaTab delivers superior performance while requiring significantly fewer parameters, as empirically validated on diverse benchmark datasets. MambaTab's efficiency, scalability, generalizability, and predictive gains mark it as a lightweight, "plug-and-play" solution for diverse tabular data, with promise for enabling wider practical applications.

References (22)
  1. Abien Fred Agarap. Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375, 2018.
  2. Tabnet: Attentive interpretable tabular learning. Proceedings of the AAAI Conference on Artificial Intelligence, 35(8):6679–6687, May 2021.
  3. Layer normalization. arXiv preprint arXiv:1607.06450, 2016.
  4. Scarf: Self-supervised contrastive learning using random feature corruption. arXiv preprint arXiv:2106.15147, 2021.
  5. Xgboost: A scalable tree boosting system. Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 22:785–794, 2016.
  6. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Networks, 107:3–11, 2018. Special issue on deep reinforcement learning.
  7. Hungry hungry hippos: Towards language modeling with state space models. In International Conference on Learning Representations, 2022.
  8. Revisiting deep learning models for tabular data. Advances in Neural Information Processing Systems, 34:18932–18943, 2021.
  9. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023.
  10. Efficiently modeling long sequences with structured state spaces. In International Conference on Learning Representations, 2021.
  11. Combining recurrent, convolutional, and continuous-time models with linear state space layers. Advances in Neural Information Processing Systems, 34:572–585, 2021.
  12. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  13. Tabtransformer: Tabular data modeling using contextual embeddings. arXiv preprint arXiv:2012.06678, 2020.
  14. Batch normalization: Accelerating deep network training by reducing internal covariate shift. International Conference on Machine Learning, pages 448–456, 2015.
  15. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  16. Self-normalizing neural networks. Advances in Neural Information Processing Systems, 30:972–981, 2017.
  17. Autoint: Automatic feature interaction learning via self-attentive neural networks. ACM International Conference on Information and Knowledge Management, pages 1161–1170, 2019.
  18. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
  19. Transtab: Learning transferable tabular transformers across tables. Advances in Neural Information Processing Systems, 35:2902–2915, 2022.
  20. Deep & cross network for ad click predictions. Proceedings of the ADKDD’17, 2017.
  21. Vime: Extending the success of self-and semi-supervised learning to tabular domain. Advances in Neural Information Processing Systems, 33:11033–11043, 2020.
  22. Customer transaction fraud detection using xgboost model. International Conference on Computer Engineering and Application (ICCEA), pages 554–558, 2020.

Summary

  • The paper introduces MambaTab, a plug-and-play model for tabular data that reduces parameter count by over 99% compared to transformer-based methods.
  • The paper demonstrates MambaTab's efficacy in both vanilla supervised learning and feature incremental learning through extensive benchmarking across eight datasets.
  • The paper highlights MambaTab's low preprocessing requirements and scalability, offering a robust solution for dynamically evolving datasets across various domains.

Introduction

In the machine learning landscape, tabular data remains a central format across industrial, healthcare, academic, and other domains. Deep learning models, including CNNs and transformers, have been extensively adopted for tabular data and achieve remarkable performance. Nonetheless, these techniques entail significant computational resources, extensive preprocessing, and hyperparameter tuning, creating accessibility and scalability constraints. Existing solutions also typically fall short when confronted with feature incremental learning (FIL), a scenario in which the features of a dataset increase over time. This highlights the demand for solutions that support the continuous evolution of datasets.

State-of-the-Art and Motivation

The paper contextualizes the challenge by surveying existing solutions, which fall into several categories: classical machine learning models, deep learning approaches based on CNNs and transformers, and, more recently, self-supervised learning strategies. Deep learning models such as TabNet, AutoInt, and TabTransformer represent the most recent advances for tabular data, leveraging attention mechanisms and embeddings to handle categorical and numerical features. Yet the shift from simpler models toward these complex architectures exacerbates the need for extensive tuning and data manipulation. Notably, almost all current methods operate under vanilla supervised learning, with limited capacity to handle FIL. This motivates an architecture that can operate efficiently in dynamic feature environments without retraining from scratch when new features arrive.

Novel Approach: MambaTab

The authors propose a novel solution: MambaTab, based on structured state-space models, specifically exploiting the Mamba SSM variant for handling tabular data in an end-to-end supervised learning setting. MambaTab stands out for its parameter efficiency, low preprocessing requirements, and innate support for FIL. Mamba's ability to capture long-range dependencies with linear scalability sets MambaTab apart from conventional deep learning models, reducing the parameter count, typically by more than 99%, compared to transformer-based solutions. The paper's empirical evaluation across various benchmark datasets illustrates MambaTab's superior performance, establishing it as a lightweight and adaptable method for practitioners working with tabular data.
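
To make this concrete, below is a minimal sketch of a MambaTab-style pipeline: a linear embedding maps the raw feature vector to a fixed-width representation, a Mamba (selective SSM) block processes it, and a linear head produces class logits. This is an illustrative sketch under stated assumptions, not the authors' implementation; it assumes the open-source `mamba_ssm` package (which requires a CUDA-capable environment), and the layer sizes, normalization choice, and single-block design are illustrative choices.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # assumed dependency: the open-source Mamba package


class MambaTabSketch(nn.Module):
    """Illustrative MambaTab-style model: embedding -> Mamba block -> linear head."""

    def __init__(self, num_features: int, embed_dim: int = 32, num_classes: int = 2):
        super().__init__()
        # Map the raw feature vector to a fixed-width representation.
        self.embedding = nn.Linear(num_features, embed_dim)
        self.norm = nn.LayerNorm(embed_dim)
        # A single selective-SSM (Mamba) block models interactions in the representation.
        self.mamba = Mamba(d_model=embed_dim, d_state=16, d_conv=4, expand=2)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_features) of numerically encoded tabular values.
        h = self.norm(self.embedding(x))            # (batch, embed_dim)
        h = self.mamba(h.unsqueeze(1)).squeeze(1)   # Mamba expects (batch, seq_len, dim)
        return self.head(h)                         # (batch, num_classes) logits


# Usage: a batch of 8 rows, each with 14 features.
model = MambaTabSketch(num_features=14)
logits = model(torch.randn(8, 14))
```

Keeping the heavy lifting in one small Mamba block is consistent with the paper's claim of a drastically reduced parameter count relative to transformer baselines.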

Benchmarking MambaTab

In an extensive empirical study spanning eight public datasets and two learning settings, vanilla supervised learning and FIL, MambaTab consistently outperformed state-of-the-art baselines. For vanilla supervised learning, MambaTab delivered superior or competitive performance across datasets while using a fraction of the parameters required by other models. In the FIL setting, the model adapted seamlessly without complex restructuring or significant parameter tuning.
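
One way to picture FIL support with a linear embedding layer is to widen its input dimension when new columns appear while preserving the weights already learned for the original features. The sketch below illustrates this idea in PyTorch; it is an assumed mechanism for illustration, not necessarily the authors' exact procedure, and the helper name `extend_embedding` is hypothetical.

```python
import torch
import torch.nn as nn


def extend_embedding(old: nn.Linear, num_new_features: int) -> nn.Linear:
    """Return a wider linear embedding that keeps the old weights for existing features."""
    new = nn.Linear(old.in_features + num_new_features, old.out_features)
    with torch.no_grad():
        # Copy the previously learned weights for the original feature columns;
        # the new-feature columns keep their fresh initialization and are trained further.
        new.weight[:, : old.in_features] = old.weight
        new.bias.copy_(old.bias)
    return new


embedding = nn.Linear(10, 32)                # trained on the original 10 features
embedding = extend_embedding(embedding, 4)   # the dataset now has 14 features
print(embedding)                             # Linear(in_features=14, out_features=32, bias=True)
```

Because only the new-feature columns start from fresh initialization, training can continue from the existing checkpoint instead of restarting from scratch.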

Conclusion and Future Work

MambaTab is put forward as a transformative approach for tabular data, offering not only reduced complexity but also an "out-of-the-box" solution for environments where datasets continually evolve. Its ability to perform across a spectrum of domains and dataset structures without the burden of labor-intensive preprocessing establishes it as a robust candidate for broad application. Looking ahead, the authors plan to extend the work to regression tasks, further broadening MambaTab's utility. Through continued refinement and extension, MambaTab has the potential to mitigate current challenges and propel the next wave of machine learning for tabular data.