An Initialization Schema for Neuronal Networks on Tabular Data (2311.03996v2)
Abstract: Many modern applications involve heterogeneous tabular data, on which regression and classification remain challenging tasks. Numerous approaches have been proposed to adapt neural networks to this setting, yet boosting and bagging of decision trees are still the best-performing methods. In this paper, we show that a binomially initialized neural network can be used effectively on tabular data. The proposed schema is a simple but effective way to initialize the first hidden layer of a neural network. We also show that this initialization schema can be used to jointly train ensembles by applying gradient masking to batch entries and using the binomial initialization for the last layer of the network. For this purpose, we modify the binary hinge loss and the softmax loss to make them applicable to joint ensemble training. We evaluate our approach on multiple public datasets and demonstrate improved performance compared to other neural-network-based approaches. In addition, we discuss the limitations of our approach and possible directions for further research on improving the applicability of neural networks to tabular data. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FInitializationNeuronalNetworksTabularData&mode=list
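
The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of what a binomial initialization of the first hidden layer could look like, assuming each hidden neuron is connected to a Bernoulli-sampled subset of the input features (so the number of active inputs per neuron follows a binomial distribution). The helper name `binomial_init_` and the probability `p` are hypothetical, not the paper's API:

```python
# Sketch of a binomially sampled first-layer initialization.
# Interpretation of the abstract, not the paper's exact scheme.
import torch
import torch.nn as nn

def binomial_init_(layer: nn.Linear, p: float = 0.5) -> None:
    """Hypothetical helper: re-initialize a linear layer so that each
    hidden neuron sees only a Bernoulli(p)-sampled subset of inputs."""
    with torch.no_grad():
        # 0/1 mask drawn per weight; each neuron keeps ~p * in_features inputs.
        mask = torch.bernoulli(torch.full_like(layer.weight, p))
        # Keep a standard magnitude (Glorot-style) on the surviving weights.
        nn.init.xavier_uniform_(layer.weight)
        layer.weight.mul_(mask)
        nn.init.zeros_(layer.bias)

first_layer = nn.Linear(in_features=32, out_features=64)
binomial_init_(first_layer)
```

The design intuition under this reading resembles the random subspace method for decision forests: each neuron starts from a different random feature subset, which may suit the heterogeneous, feature-wise nature of tabular data better than a dense initialization.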
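Joint ensemble training is likewise described only at the level of "gradient masking to batch entries". One plausible reading, sketched below under that assumption, is that the last layer holds one classification head per ensemble member and each batch entry routes its loss gradient to a single head. All names (`backbone`, `heads`, the random assignment) are illustrative, not the paper's method, and a plain cross-entropy stands in for the modified losses:

```python
# Sketch of gradient masking for jointly training an ensemble of heads.
# Illustrative interpretation of the abstract only.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_members, n_classes = 4, 3
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
heads = nn.Linear(64, n_members * n_classes)  # one head per ensemble member

x = torch.randn(16, 32)
y = torch.randint(0, n_classes, (16,))

logits = heads(backbone(x)).view(-1, n_members, n_classes)
# Assign each batch entry to one ensemble member; only that head
# receives a gradient for this sample (the "gradient mask").
assign = torch.randint(0, n_members, (x.size(0),))
mask = F.one_hot(assign, n_members).float()             # (batch, members)
per_head = F.cross_entropy(
    logits.permute(0, 2, 1),                            # (batch, classes, members)
    y.unsqueeze(1).expand(-1, n_members),               # (batch, members)
    reduction="none")                                   # (batch, members)
loss = (per_head * mask).sum() / mask.sum()
loss.backward()
```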