Towards Multi-Objective High-Dimensional Feature Selection via Evolutionary Multitasking (2401.01563v1)
Abstract: The Evolutionary Multitasking (EMT) paradigm, an emerging research topic in evolutionary computation, has recently been applied successfully to high-dimensional feature selection (FS) problems. However, existing EMT-based FS methods suffer from several limitations: a single mode of multitask generation, the same generic evolutionary search applied to all tasks, reliance on implicit transfer mechanisms through solution encodings alone, and single-objective transformation, all of which result in inadequate knowledge acquisition, exploitation, and transfer. To address these issues, this paper develops a novel EMT framework for multi-objective high-dimensional feature selection, namely MO-FSEMT. In particular, multiple auxiliary tasks are constructed via distinct formulation methods to provide diverse search spaces and information representations, and are then addressed simultaneously with the original task through a multi-solver-based multitask optimization scheme. Each task has an independent population with task-specific representations and is solved by a separate evolutionary solver with its own biases and search preferences. A task-specific knowledge transfer mechanism is designed to leverage the advantageous information of each task, enabling the discovery and effective transmission of high-quality solutions during the search. Comprehensive experimental results on 26 datasets demonstrate that MO-FSEMT achieves overall superior performance compared with state-of-the-art FS methods. Moreover, ablation studies verify the contributions of the different components of the proposed MO-FSEMT.
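Since the abstract only sketches the framework at a high level, the snippet below is a minimal illustrative sketch of a multi-solver multitask loop with periodic knowledge transfer, not MO-FSEMT's actual design. Every concrete choice here (two tasks, the toy bi-objective fitness, the subspace-based auxiliary task, the transfer interval, and all names) is an assumption made for illustration.

```python
# Hypothetical sketch: two feature-selection tasks, each with its own population
# and its own (toy) solver, exchanging their best solutions every few generations.
import random

N_FEATURES = 50          # assumed problem size
POP_SIZE = 20
GENERATIONS = 30
TRANSFER_EVERY = 5       # assumed knowledge-transfer interval


def fitness(mask):
    """Toy bi-objective surrogate: (classification-error proxy, feature ratio)."""
    selected = sum(mask)
    if selected == 0:
        return (1.0, 0.0)
    error_proxy = abs(selected - N_FEATURES // 4) / N_FEATURES  # stand-in for wrapper error
    return (error_proxy, selected / N_FEATURES)


def random_mask(active):
    """Restrict sampling to an auxiliary task's feature subspace."""
    return [random.randint(0, 1) if i in active else 0 for i in range(N_FEATURES)]


def mutate(mask, rate=0.05):
    return [bit ^ 1 if random.random() < rate else bit for bit in mask]


def solver_step(population, greedy):
    """Two toy 'solvers' with different biases: one greedy, one more exploratory."""
    offspring = [mutate(m) for m in population]
    pool = population + offspring
    pool.sort(key=fitness)  # crude lexicographic ranking of the objective tuples
    return pool[:POP_SIZE] if greedy else random.sample(pool, POP_SIZE)


# Original task searches the full feature space; the auxiliary task uses an
# assumed subspace (here simply every second feature) as its own representation.
tasks = [
    {"active": set(range(N_FEATURES)), "greedy": True},
    {"active": set(range(0, N_FEATURES, 2)), "greedy": False},
]
populations = [[random_mask(t["active"]) for _ in range(POP_SIZE)] for t in tasks]

for gen in range(GENERATIONS):
    for t, task in enumerate(tasks):
        populations[t] = solver_step(populations[t], task["greedy"])
    if gen % TRANSFER_EVERY == 0:
        # Simplified knowledge transfer: inject each task's current best solution
        # into the other task's population (assumes exactly two tasks).
        bests = [min(pop, key=fitness) for pop in populations]
        for t in range(len(tasks)):
            populations[t][-1] = bests[1 - t]

best = min(populations[0], key=fitness)
print("best mask objectives:", fitness(best))
```

In this sketch the transfer is a plain solution injection; the paper's task-specific transfer mechanism and solver designs are not specified in the abstract and are therefore not reproduced here.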