ReConTab: Regularized Contrastive Representation Learning for Tabular Data (2310.18541v2)

Published 28 Oct 2023 in cs.LG and cs.AI

Abstract: Representation learning stands as one of the critical machine learning techniques across various domains. Through the acquisition of high-quality features, pre-trained embeddings significantly reduce input-space redundancy, benefiting downstream pattern recognition tasks such as classification, regression, and detection. Nonetheless, for tabular data, feature engineering and selection still rely heavily on manual intervention, a time-consuming process that demands domain expertise. In response to this challenge, we introduce ReConTab, a deep automatic representation learning framework with regularized contrastive learning. Agnostic to the downstream modeling task, ReConTab constructs an asymmetric autoencoder over the same raw features used as model inputs, producing low-dimensional representative embeddings. Specifically, regularization techniques are applied for raw feature selection, while contrastive learning distills the information most pertinent to downstream tasks. Extensive experiments on real-world datasets substantiate the framework's capacity to yield substantial and robust performance improvements. Furthermore, we empirically demonstrate that the pre-trained embeddings integrate seamlessly as easily adaptable features, improving the performance of traditional methods such as XGBoost and Random Forest.
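The abstract sketches a three-part recipe: an asymmetric autoencoder over the raw features, a regularization term that performs feature selection, and a contrastive objective that shapes the embeddings. The paper's actual architecture, corruption scheme, and loss weights are not reproduced here, so the following PyTorch snippet is only a minimal sketch of that recipe; the class, layer sizes, corruption rate, and loss weights are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReConTabSketch(nn.Module):
    """Asymmetric autoencoder with a gated feature-selection layer and a
    projection head for contrastive learning. Layer sizes are illustrative,
    not the paper's configuration."""

    def __init__(self, n_features: int, embed_dim: int = 32):
        super().__init__()
        # Learnable per-feature gate: an L1 penalty on it drives unhelpful
        # raw features toward zero, acting as automatic feature selection.
        self.feature_gate = nn.Parameter(torch.ones(n_features))
        # "Asymmetric": the encoder is deeper than the decoder.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )
        self.decoder = nn.Linear(embed_dim, n_features)
        # Projection head used only by the contrastive objective.
        self.projector = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):
        z = self.encoder(x * self.feature_gate)
        return z, self.decoder(z), self.projector(z)


def info_nce(p1, p2, temperature: float = 0.5):
    """NT-Xent-style loss: matching rows across the two views are positives."""
    p1, p2 = F.normalize(p1, dim=1), F.normalize(p2, dim=1)
    logits = p1 @ p2.T / temperature                      # (B, B) similarities
    targets = torch.arange(p1.size(0), device=p1.device)  # diagonal positives
    return F.cross_entropy(logits, targets)


def training_loss(model, x, l1_weight=1e-3, ctr_weight=0.5):
    # Two corrupted views of the same rows; random feature dropout is one
    # common corruption scheme for tabular contrastive learning.
    view1 = x * (torch.rand_like(x) > 0.2).float()
    view2 = x * (torch.rand_like(x) > 0.2).float()
    _, recon1, p1 = model(view1)
    _, recon2, p2 = model(view2)
    recon = F.mse_loss(recon1, x) + F.mse_loss(recon2, x)  # reconstruction
    ctr = info_nce(p1, p2)                                 # contrastive
    l1 = model.feature_gate.abs().sum()                    # feature selection
    return recon + ctr_weight * ctr + l1_weight * l1
```

Once pre-trained, the encoder output can be detached and used as tabular features for a conventional model, in the spirit of the abstract's XGBoost and Random Forest claim (`X_train` below is a hypothetical feature matrix):

```python
model.eval()
with torch.no_grad():
    z_train = model(torch.as_tensor(X_train, dtype=torch.float32))[0].numpy()
# z_train (optionally concatenated with the raw columns) can now be fed to
# e.g. xgboost.XGBClassifier or sklearn.ensemble.RandomForestClassifier.
```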
