Graph Neural Network contextual embedding for Deep Learning on Tabular Data (2303.06455v2)
Abstract: Industries of all kinds are trying to leverage AI on their existing big data, which is typically available in so-called tabular form, where each record is composed of a number of heterogeneous continuous and categorical columns, also known as features. Deep Learning (DL) has been a major breakthrough for AI in fields related to human skills, such as natural language processing, but its applicability to tabular data has proved more challenging: more classical Machine Learning (ML) models, such as tree-based ensembles, usually perform better. This paper presents a novel DL model that uses a Graph Neural Network (GNN), more specifically an Interaction Network (IN), for contextual embedding and for modelling interactions among tabular features. Its results outperform those of a recently published DL benchmark survey based on five public datasets, and are competitive with those of boosted-tree solutions.
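To make the core idea concrete, here is a minimal sketch of one Interaction Network message-passing step over tabular feature embeddings, written in PyTorch (which the paper builds on). This is an illustration only, not the authors' architecture: the fully connected feature graph, the two-layer MLPs, the residual update, and all names (`InteractionNetworkBlock`, `dim`, etc.) are assumptions introduced here.

```python
import torch
import torch.nn as nn


class InteractionNetworkBlock(nn.Module):
    """One IN step over a fully connected graph of feature embeddings.

    Sketch under stated assumptions; not the paper's exact architecture.
    """

    def __init__(self, dim: int):
        super().__init__()
        # Relational model: maps each (sender, receiver) embedding pair to a message.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        # Object model: updates each node from its embedding and aggregated messages.
        self.node_mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, n_features, dim), one embedding per tabular column.
        b, n, d = h.shape
        senders = h.unsqueeze(2).expand(b, n, n, d)    # senders[b, i, j] = h[b, i]
        receivers = h.unsqueeze(1).expand(b, n, n, d)  # receivers[b, i, j] = h[b, j]
        messages = self.edge_mlp(torch.cat([senders, receivers], dim=-1))
        aggregated = messages.sum(dim=1)               # sum over senders i -> (b, n, d)
        return h + self.node_mlp(torch.cat([h, aggregated], dim=-1))


# Usage: contextual embeddings for 8 feature columns of width 32, batch of 4 rows.
block = InteractionNetworkBlock(dim=32)
h = torch.randn(4, 8, 32)
contextual = block(h)  # shape (4, 8, 32)
```

The resulting contextualized embeddings can then feed a downstream prediction head, in the same spirit as the contextual embeddings produced by Transformer-based tabular models the paper compares against.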