Graph Convolutional Network-based Feature Selection for High-dimensional and Low-sample Size Data (2211.14144v1)
Published 25 Nov 2022 in cs.LG
Abstract: Feature selection is a powerful dimension reduction technique that selects a subset of relevant features for model construction. Numerous feature selection methods have been proposed, but most of them fail under the high-dimensional and low-sample size (HDLSS) setting due to the challenge of overfitting. In this paper, we present a deep learning-based method, the GRAph Convolutional nEtwork feature Selector (GRACES), to select important features for HDLSS data. We provide empirical evidence that GRACES outperforms other feature selection methods on both synthetic and real-world datasets.
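The abstract does not spell out GRACES's architecture, so the sketch below only illustrates the general idea of GCN-based feature scoring in an HDLSS regime: build a kNN similarity graph over the samples, train a small graph convolutional network to classify them, and rank features by input-gradient magnitude. This is a simplified illustration under stated assumptions, not the authors' method; the kNN graph construction, the two-layer GCN, the hyperparameters, and the gradient-based scoring rule are all assumptions made for the example.

```python
# Illustrative sketch of GCN-based feature scoring (NOT the GRACES algorithm).
# Assumptions: kNN sample graph, two-layer GCN, input-gradient feature scores.
import torch
import torch.nn as nn
import torch.nn.functional as F

def knn_adjacency(X, k=5):
    """Symmetric kNN adjacency over samples (rows of X)."""
    d = torch.cdist(X, X)                               # pairwise distances
    idx = d.topk(k + 1, largest=False).indices[:, 1:]   # k nearest, skip self
    A = torch.zeros(X.size(0), X.size(0))
    A.scatter_(1, idx, 1.0)
    return ((A + A.T) > 0).float()                      # symmetrize

def normalize_adj(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + torch.eye(A.size(0))
    d = A_hat.sum(1).rsqrt()
    return d[:, None] * A_hat * d[None, :]

class TinyGCN(nn.Module):
    def __init__(self, in_dim, hid, n_classes):
        super().__init__()
        self.W1 = nn.Linear(in_dim, hid)
        self.W2 = nn.Linear(hid, n_classes)
    def forward(self, X, A_norm):
        H = F.relu(A_norm @ self.W1(X))  # graph convolution 1
        return A_norm @ self.W2(H)       # graph convolution 2 -> logits

# Toy HDLSS data: 40 samples, 1000 features, binary labels.
torch.manual_seed(0)
X = torch.randn(40, 1000)
y = torch.randint(0, 2, (40,))
A_norm = normalize_adj(knn_adjacency(X))

model = TinyGCN(1000, 16, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = F.cross_entropy(model(X, A_norm), y)
    loss.backward()
    opt.step()

# Score each feature by the mean absolute gradient of the loss w.r.t. it.
X.requires_grad_(True)
F.cross_entropy(model(X, A_norm), y).backward()
scores = X.grad.abs().mean(0)
print("top-10 features:", scores.topk(10).indices.tolist())
```

In this simplified form, the intuition is that message passing over a sample-similarity graph pools information across related samples, which can act as a regularizer when samples are scarce relative to the feature dimension; the paper's reported gains over other selectors would come from its specific design on top of this general mechanism.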