HyperFast: Instant Classification for Tabular Data (2402.14335v1)
Abstract: Training deep learning models and performing hyperparameter tuning can be computationally demanding and time-consuming. Meanwhile, traditional machine learning methods like gradient-boosting algorithms remain the preferred choice for most tabular data applications, while neural network alternatives either require extensive hyperparameter tuning or work only on toy datasets under limited settings. In this paper, we introduce HyperFast, a meta-trained hypernetwork designed for instant classification of tabular data in a single forward pass. HyperFast generates a task-specific neural network tailored to an unseen dataset that can be used directly for classification inference, removing the need to train a model. We report extensive experiments on OpenML and genomic data, comparing HyperFast to competing tabular-data neural networks, traditional ML methods, AutoML systems, and boosting machines. HyperFast shows highly competitive results while being significantly faster. Additionally, our approach demonstrates robust adaptability across a variety of classification tasks with little to no fine-tuning, positioning HyperFast as a strong solution for numerous applications and rapid model deployment. HyperFast introduces a promising paradigm for fast classification, with the potential to substantially decrease the computational burden of deep learning. Our code, which offers a scikit-learn-like interface, along with the trained HyperFast model, can be found at https://github.com/AI-sandbox/HyperFast.
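The core idea of the abstract, mapping a labeled dataset to the weights of a ready-to-use classifier in a single pass, with no gradient-based training on the new task, can be illustrated with a deliberately simplified sketch. The functions below are hypothetical and use class-mean prototypes as the "generated" weight rows; this is not HyperFast's actual architecture (which is a meta-trained hypernetwork), only the simplest instance of the predict-weights-from-data paradigm.

```python
import numpy as np

def generate_classifier_weights(X_support, y_support):
    """Map a labeled support set to linear-classifier weights in one pass.

    Illustrative stand-in for a hypernetwork's forward pass: here the
    'generated' weight row for each class is simply its feature mean,
    and the bias makes the scoring rule equivalent to nearest-centroid.
    """
    classes = np.unique(y_support)
    W = np.stack([X_support[y_support == c].mean(axis=0) for c in classes])
    b = -0.5 * (W ** 2).sum(axis=1)  # so argmax(score) == nearest centroid
    return classes, W, b

def predict(X, classes, W, b):
    """Inference with the generated network: one matrix product, no training."""
    scores = X @ W.T + b
    return classes[scores.argmax(axis=1)]
```

A real hypernetwork replaces the mean-pooling step with a learned, meta-trained module that ingests the support set and emits the weights of a deeper task-specific network, but the usage pattern (one forward pass to get a model, then direct inference) is the same.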