MEL: Efficient Multi-Task Evolutionary Learning for High-Dimensional Feature Selection (2402.08982v1)

Published 14 Feb 2024 in cs.LG, cs.AI, and cs.NE

Abstract: Feature selection is a crucial step in data mining to enhance model performance by reducing data dimensionality. However, the increasing dimensionality of collected data exacerbates the challenge known as the "curse of dimensionality", where computation grows exponentially with the number of dimensions. To tackle this issue, evolutionary computation (EC) approaches have gained popularity due to their simplicity and applicability. Unfortunately, the diverse designs of EC methods vary in their ability to handle different data, and they often underutilize information and fail to share it effectively. In this paper, we propose a novel approach called PSO-based Multi-task Evolutionary Learning (MEL) that leverages multi-task learning to address these challenges. By incorporating information sharing between different feature selection tasks, MEL achieves enhanced learning ability and efficiency. We evaluate the effectiveness of MEL through extensive experiments on 22 high-dimensional datasets. Compared against 24 EC approaches, our method exhibits strong competitiveness. Additionally, we have open-sourced our code on GitHub at https://github.com/wangxb96/MEL.
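
The core mechanism the abstract describes, several PSO swarms evolving feature subsets in parallel while periodically exchanging their best solutions, can be sketched in a few dozen lines. The Python sketch below illustrates that general pattern only; it is not the authors' MEL (their implementation is at the GitHub link above). The dataset, the sigmoid-transfer binary PSO, the size-penalized fitness, and the rule that the weaker swarm adopts the stronger swarm's global best are all assumptions made for this example, and both swarms here share one dataset purely to keep the snippet self-contained, whereas MEL defines distinct related tasks.

```python
# Hedged sketch: binary PSO feature selection with information sharing
# between two swarms ("tasks"). Dataset, fitness, sharing rule, and all
# hyperparameters are illustrative assumptions, not MEL's configuration.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset
D = X.shape[1]

def fitness(mask):
    """3-fold CV accuracy of a 0/1 feature mask, minus a size penalty."""
    m = mask.astype(bool)
    if not m.any():
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(), X[:, m], y, cv=3).mean()
    return acc - 0.01 * m.mean()  # mildly prefer smaller subsets

class Swarm:
    """One feature-selection task solved by binary PSO (sigmoid transfer)."""
    def __init__(self, n=10):
        self.v = rng.normal(0.0, 1.0, (n, D))               # velocities
        self.x = (rng.random((n, D)) < 0.5).astype(float)   # 0/1 positions
        self.pbest = self.x.copy()
        self.pfit = np.array([fitness(p) for p in self.x])
        g = int(self.pfit.argmax())
        self.gbest, self.gfit = self.pbest[g].copy(), float(self.pfit[g])

    def step(self, w=0.7, c1=1.5, c2=1.5):
        r1, r2 = rng.random(self.v.shape), rng.random(self.v.shape)
        self.v = (w * self.v
                  + c1 * r1 * (self.pbest - self.x)
                  + c2 * r2 * (self.gbest - self.x))
        prob = 1.0 / (1.0 + np.exp(-self.v))                # sigmoid transfer
        self.x = (rng.random(self.v.shape) < prob).astype(float)
        for i, p in enumerate(self.x):
            f = fitness(p)
            if f > self.pfit[i]:
                self.pbest[i], self.pfit[i] = p.copy(), f
                if f > self.gfit:
                    self.gbest, self.gfit = p.copy(), f

# Two tasks evolve side by side; every few iterations the weaker task
# adopts the stronger task's global best (the cross-task "sharing").
tasks = [Swarm(), Swarm()]
for it in range(20):
    for s in tasks:
        s.step()
    if it % 5 == 4:
        weak, strong = sorted(tasks, key=lambda s: s.gfit)
        weak.gbest, weak.gfit = strong.gbest.copy(), strong.gfit

best = max(tasks, key=lambda s: s.gfit)
print(f"fitness {best.gfit:.3f} using {int(best.gbest.sum())}/{D} features")
```

In MEL proper, the sharing mechanism and task construction are more elaborate than the simple "adopt the stronger swarm's global best" rule assumed here; the sketch only shows why cross-task exchange can help, since a swarm stuck in a poor region is redirected toward a region another task has already found to be promising.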
