
FMLFS: A Federated Multi-Label Feature Selection Based on Information Theory in IoT Environment (2405.00524v2)

Published 1 May 2024 in cs.LG, cs.IT, cs.NI, and math.IT

Abstract: In emerging applications such as wearable health monitoring and traffic monitoring systems, Internet-of-Things (IoT) devices generate or collect large multi-label datasets, in which each instance is associated with a set of labels. Noisy, redundant, or irrelevant features in these datasets, together with the curse of dimensionality, pose challenges for multi-label classifiers. Feature selection (FS) is an effective strategy for addressing these challenges and improving classifier performance. However, no distributed multi-label FS method suitable for distributed multi-label datasets in IoT environments has been reported in the literature. This paper introduces FMLFS, the first federated multi-label feature selection method. Mutual information between features and labels serves as the relevancy metric, while the correlation distance between features, derived from mutual information and joint entropy, serves as the redundancy measure. After these metrics are aggregated on the edge server and a Pareto-based bi-objective strategy with crowding distance is applied, the sorted features are sent back to the IoT devices. The proposed method is evaluated in two scenarios: 1) transmitting the reduced-size datasets to the edge server for use by a centralized classifier, and 2) employing federated learning with the reduced-size datasets. Evaluation on three real-world datasets across three metrics - performance, time complexity, and communication cost - shows that FMLFS outperforms five comparable methods from the literature and provides a good trade-off among these metrics.
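The ranking stage the abstract describes can be sketched on a single machine (omitting the federated aggregation step). This is an illustrative reconstruction, not the paper's implementation: it assumes the correlation distance is the normalized form 1 − I(X;Y)/H(X,Y), and it folds per-feature redundancy into a "distinctness" objective (mean correlation distance to the other features) so that both objectives are maximized before non-dominated sorting and crowding-distance ranking. All function names are hypothetical.

```python
import numpy as np

def entropy(x):
    """Shannon entropy (bits) of a discrete 1-D array."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def joint_entropy(x, y):
    """Joint entropy H(X, Y) of two discrete 1-D arrays."""
    pairs = np.stack([x, y], axis=1)
    _, counts = np.unique(pairs, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def mutual_info(x, y):
    return entropy(x) + entropy(y) - joint_entropy(x, y)

def correlation_distance(x, y):
    """Assumed form: 1 - I(X;Y)/H(X,Y). Zero when X and Y carry identical information."""
    h = joint_entropy(x, y)
    return 1.0 - mutual_info(x, y) / h if h > 0 else 0.0

def rank_features(X, Y):
    """X: (n_samples, n_features) discrete; Y: (n_samples, n_labels) binary.
    Returns feature indices sorted by Pareto front, then crowding distance."""
    n_feat = X.shape[1]
    # Objective 1 (maximize): relevancy, summed MI between the feature and each label.
    relevancy = np.array([sum(mutual_info(X[:, j], Y[:, l]) for l in range(Y.shape[1]))
                          for j in range(n_feat)])
    # Objective 2 (maximize): distinctness, mean correlation distance to other features.
    dist = np.zeros((n_feat, n_feat))
    for i in range(n_feat):
        for j in range(i + 1, n_feat):
            dist[i, j] = dist[j, i] = correlation_distance(X[:, i], X[:, j])
    distinctness = dist.sum(axis=1) / max(n_feat - 1, 1)
    obj = np.stack([relevancy, distinctness], axis=1)

    # Non-dominated sorting: peel off successive Pareto fronts.
    fronts, remaining = [], set(range(n_feat))
    while remaining:
        front = [i for i in remaining
                 if not any((obj[j] >= obj[i]).all() and (obj[j] > obj[i]).any()
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining -= set(front)

    # Crowding distance within each front; boundary points rank first.
    order = []
    for front in fronts:
        crowd = np.zeros(len(front))
        for k in range(2):
            idx = np.argsort([-obj[i, k] for i in front])  # descending in objective k
            crowd[idx[0]] = crowd[idx[-1]] = np.inf
            span = obj[front[idx[0]], k] - obj[front[idx[-1]], k] or 1.0
            for r in range(1, len(front) - 1):
                crowd[idx[r]] += (obj[front[idx[r - 1]], k] - obj[front[idx[r + 1]], k]) / span
        order.extend(front[i] for i in np.argsort(-crowd))
    return order
```

In the federated setting described by the paper, the MI and joint-entropy statistics would be computed per device and aggregated at the edge server before this sorting step; the sketch above runs the same bi-objective ranking on pooled data for clarity.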

