
TFWT: Tabular Feature Weighting with Transformer (2405.08403v2)

Published 14 May 2024 in cs.LG

Abstract: In this paper, we propose a novel feature weighting method to address a limitation of existing feature processing methods for tabular data. Typically, existing methods assume equal importance across all samples and features in a dataset. This simplified treatment overlooks the unique contribution of each feature and may therefore miss important feature information, leading to suboptimal performance on complex datasets with rich features. To address this problem, we introduce Tabular Feature Weighting with Transformer (TFWT), a novel feature weighting approach for tabular data. Our method adopts a Transformer to capture complex feature dependencies and contextually assign appropriate weights to discrete and continuous features. In addition, we employ a reinforcement learning strategy to further fine-tune the weighting process. Extensive experimental results across various real-world datasets and diverse downstream tasks show the effectiveness of TFWT and highlight its potential for enhancing feature weighting in tabular data analysis.
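
The abstract gives no implementation details, but the core idea it describes (a Transformer attending across a row's features to produce sample-specific, per-feature weights) can be sketched in code. The following is a minimal illustrative PyTorch sketch, assuming scalar inputs per feature, a learned column embedding, and a sigmoid weight head; all names and architectural choices here are hypothetical, not the authors' implementation.

    # Hypothetical sketch of contextual tabular feature weighting with a
    # Transformer encoder; illustrative only, not the TFWT architecture.
    import torch
    import torch.nn as nn

    class FeatureWeighter(nn.Module):
        def __init__(self, num_features, d_model=32, nhead=4, num_layers=2):
            super().__init__()
            # Lift each scalar feature value to a d_model-dimensional token.
            self.value_proj = nn.Linear(1, d_model)
            # Learned per-column embedding, analogous to a position embedding.
            self.feature_emb = nn.Parameter(torch.randn(num_features, d_model))
            layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
            # Map each contextualized token to a scalar weight in (0, 1).
            self.weight_head = nn.Sequential(nn.Linear(d_model, 1), nn.Sigmoid())

        def forward(self, x):
            # x: (batch, num_features) of continuous (or pre-embedded discrete) values.
            tokens = self.value_proj(x.unsqueeze(-1)) + self.feature_emb  # (B, F, d)
            ctx = self.encoder(tokens)            # self-attention across a row's features
            weights = self.weight_head(ctx).squeeze(-1)   # (B, F), sample-specific
            return x * weights, weights           # reweighted features for a downstream model

    x = torch.randn(8, 10)                        # 8 rows, 10 features
    weighter = FeatureWeighter(num_features=10)
    x_weighted, w = weighter(x)

In a setup like this, the weighter could first be trained jointly with a downstream model on the task loss, then fine-tuned with a policy-gradient method using downstream validation performance as the reward, matching the abstract's mention of a reinforcement learning fine-tuning stage; the exact reward design and RL algorithm are not specified in the abstract.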

