Knockoff-Guided Feature Selection via A Single Pre-trained Reinforced Agent (2403.04015v1)

Published 6 Mar 2024 in cs.LG, cs.AI, and stat.ML

Abstract: Feature selection prepares data for AI readiness by eliminating redundant features. Prior research falls into two primary categories: i) Supervised Feature Selection (SFS), which identifies the optimal feature subset based on relevance to the target variable; ii) Unsupervised Feature Selection (UFS), which reduces the dimensionality of the feature space by capturing the essential information within the feature set rather than relying on the target variable. However, SFS approaches suffer from time-consuming processes and limited generalizability due to their dependence on the target variable and downstream ML tasks, while UFS methods are constrained because the reduced feature space is latent and untraceable. To address these challenges, we introduce an innovative feature selection framework, guided by knockoff features and optimized through reinforcement learning, that identifies an optimal and effective feature subset. In detail, our method generates "knockoff" features that replicate the distribution and characteristics of the original features but are independent of the target variable. Each feature is then assigned a pseudo label based on its correlation with all the knockoff features, serving as a novel metric for feature evaluation. Our approach uses these pseudo labels to guide the feature selection process in three novel ways, optimized by a single reinforced agent: 1) a deep Q-network, pre-trained with the original features and their corresponding pseudo labels, improves the efficacy of exploration during feature selection; 2) unsupervised rewards evaluate the quality of a feature subset from the pseudo labels and the feature-space reconstruction loss, reducing dependence on the target variable; 3) a new ε-greedy strategy incorporates insights from the pseudo labels to make the feature selection process more effective.
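
As a concrete illustration of the first two steps, the sketch below builds second-order Gaussian ("model-X") knockoffs using the standard equi-correlated construction of Candès et al. (2018) and then derives per-feature pseudo labels from correlations with the knockoff columns. The thresholding rule, the default of 0.5, and the use of the maximum absolute cross-correlation are assumptions for illustration only; the abstract does not fix these details.

```python
import numpy as np

def gaussian_knockoffs(X, seed=0):
    """Second-order Gaussian ("model-X") knockoffs, equi-correlated
    construction: knockoffs mimic the empirical correlation structure
    of X but are generated without any reference to the target y."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Xc = (X - X.mean(0)) / (X.std(0) + 1e-12)       # standardize columns
    Sigma = np.corrcoef(Xc, rowvar=False) + 1e-6 * np.eye(p)
    lam_min = np.linalg.eigvalsh(Sigma).min()
    s = np.full(p, min(2.0 * lam_min, 1.0))         # equi-correlated s_j
    Sinv_S = np.linalg.solve(Sigma, np.diag(s))     # Sigma^{-1} diag(s)
    mu = Xc - Xc @ Sinv_S                           # E[X_knock | X]
    V = 2.0 * np.diag(s) - np.diag(s) @ Sinv_S      # Cov[X_knock | X]
    V = (V + V.T) / 2.0                             # enforce symmetry
    L = np.linalg.cholesky(V + 1e-6 * np.eye(p))
    return mu + rng.standard_normal((n, p)) @ L.T

def knockoff_pseudo_labels(X, X_knock, threshold=0.5):
    """Assumed pseudo-labeling rule: score each original feature by its
    strongest absolute correlation with the knockoff columns; features
    the knockoffs cannot explain are flagged as informative (label 1)."""
    p = X.shape[1]
    C = np.corrcoef(np.hstack([X, X_knock]), rowvar=False)
    cross = np.abs(C[:p, p:])                       # original-vs-knockoff block
    return (cross.max(axis=1) <= threshold).astype(int)
```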

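The remaining pieces, the unsupervised reward and the pseudo-label-guided ε-greedy step, can be sketched in the same spirit. Both functions below are simplified stand-ins: the linear least-squares reconstructor, the weighting alpha, and the 2:1 exploration bias toward pseudo-label-1 features are illustrative assumptions rather than the paper's exact formulation, and the Q-values would come from the pre-trained deep Q-network described above.

```python
def unsupervised_reward(X, selected, labels, alpha=0.5):
    """Target-free reward for a candidate subset: a pseudo-label term
    (fraction of selected features flagged relevant) minus a
    reconstruction term (mean squared error of rebuilding the full
    feature space from the subset via linear least squares)."""
    idx = np.flatnonzero(selected)
    if idx.size == 0:
        return -1.0                                  # discourage empty subsets
    Xs = X[:, idx]
    W, *_ = np.linalg.lstsq(Xs, X, rcond=None)       # reconstruct X from subset
    recon_loss = np.mean((X - Xs @ W) ** 2)
    return alpha * labels[idx].mean() - (1.0 - alpha) * recon_loss

def guided_epsilon_greedy(q_values, labels, epsilon, rng):
    """Epsilon-greedy over per-feature toggle actions: exploration draws
    are biased toward features the pseudo labels flag as relevant,
    instead of being uniform."""
    if rng.random() < epsilon:
        weights = 1.0 + labels                       # label-1 features twice as likely
        return int(rng.choice(len(q_values), p=weights / weights.sum()))
    return int(np.argmax(q_values))
```

In a full training loop, the agent would repeatedly toggle one feature per step with guided_epsilon_greedy, score the resulting subset with unsupervised_reward, and update the Q-network from the resulting (state, action, reward) transitions.
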
Authors (6)
  1. Xinyuan Wang (34 papers)
  2. Dongjie Wang (53 papers)
  3. Wangyang Ying (19 papers)
  4. Rui Xie (59 papers)
  5. Haifeng Chen (99 papers)
  6. Yanjie Fu (93 papers)
Citations (3)
