Optimizing Feature Selection for Binary Classification with Noisy Labels: A Genetic Algorithm Approach (2401.06546v1)
Abstract: Feature selection in noisy label scenarios remains an understudied topic. We propose a novel genetic algorithm-based approach, the Noise-Aware Multi-Objective Feature Selection Genetic Algorithm (NMFS-GA), for selecting optimal feature subsets in binary classification with noisy labels. NMFS-GA offers a unified framework for selecting feature subsets that are both accurate and interpretable. We evaluate NMFS-GA on synthetic datasets with label noise, a Breast Cancer dataset enriched with noisy features, and a real-world ADNI dataset for dementia conversion prediction. Our results indicate that NMFS-GA can effectively select feature subsets that improve the accuracy and interpretability of binary classifiers in scenarios with noisy labels.
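The abstract describes genetic-algorithm-based feature selection, where candidate feature subsets are encoded as binary masks and evolved toward subsets that balance accuracy and compactness. The sketch below is an illustrative single-objective simplification, not the authors' NMFS-GA: the function names, the truncation selection, and the toy fitness (which folds the accuracy/interpretability trade-off into one penalized score) are all assumptions for demonstration.

```python
import random

def ga_feature_select(n_features, fitness, pop_size=20, generations=30,
                      crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Minimal GA over binary feature masks (illustrative sketch).

    Each individual is a list of 0/1 flags, one per feature; `fitness`
    maps a mask to a score to maximize, e.g. validation accuracy minus
    a penalty on subset size.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the top half unchanged (elitism).
        elite = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(elite, 2)
            if rng.random() < crossover_rate:      # one-point crossover
                cut = rng.randrange(1, n_features)
                child = a[:cut] + b[cut:]
            else:
                child = a[:]
            # Bit-flip mutation with small per-gene probability.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

# Toy fitness: features 0 and 2 are informative; every selected
# feature pays a small interpretability penalty.
def fitness(mask):
    return 2 * mask[0] + 2 * mask[2] - 0.5 * sum(mask)

best = ga_feature_select(6, fitness, seed=1)
```

The actual method is multi-objective (NSGA-II-style Pareto ranking over accuracy and subset size rather than a single penalized score) and incorporates noise-robust evaluation; this sketch only shows the evolutionary loop shared by such approaches.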