Machine Learning-Based Intrusion Detection: Feature Selection versus Feature Extraction (2307.01570v1)

Published 4 Jul 2023 in cs.CR and cs.AI

Abstract: The Internet of Things (IoT) plays an important role in many sectors, such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are highly vulnerable to cyber-attacks, which may result in security breaches and data leakage. To prevent these attacks, a variety of machine learning-based network intrusion detection methods for IoT networks have been developed. These methods often rely on either feature extraction or feature selection to reduce the dimension of the input data before it is fed into machine learning models. The goal is to keep detection complexity low enough for real-time operation, which is vital in any intrusion detection system. This paper provides a comprehensive comparison of these two feature reduction methods in terms of precision, recall, detection accuracy, and runtime complexity, using the modern UNSW-NB15 dataset for both binary and multiclass classification. In general, feature selection provides not only better detection performance but also lower training and inference time than feature extraction, especially as the number of reduced features K increases. However, feature extraction is much more reliable than feature selection when K is very small, such as K = 4. Feature extraction is also less sensitive to changes in K than feature selection, and this holds for both binary and multiclass classification. Based on this comparison, we provide a guideline for selecting a suitable intrusion detection type for each specific scenario, as detailed in Tab. 14 at the end of Section IV.
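
To make the distinction in the abstract concrete, the following is a minimal sketch of the two ways of reducing an input matrix to K features. It rests on assumptions that are not stated in the abstract: Pearson-correlation ranking stands in for feature selection, PCA stands in for feature extraction, the data is synthetic rather than UNSW-NB15, and the function names `select_top_k` and `extract_pca` are hypothetical.

```python
import numpy as np


def select_top_k(X, y, k):
    """Feature selection (assumed variant): keep the k original columns
    with the largest absolute Pearson correlation to the label y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant columns (or labels) score 0
    scores = np.abs(Xc.T @ yc) / denom
    idx = np.sort(np.argsort(scores)[::-1][:k])
    return X[:, idx], idx


def extract_pca(X, k):
    """Feature extraction (assumed variant): project the centered data
    onto its top-k principal directions, giving k new features."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T


# Synthetic stand-in data: 200 samples, 10 features, binary labels.
# Columns 3 and 7 are made informative on purpose.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 10))
X[:, 3] += 3.0 * y
X[:, 7] -= 3.0 * y

K = 2
X_sel, idx = select_top_k(X, y, K)  # a subset of the original columns
X_ext = extract_pca(X, K)           # linear combinations of all columns
```

Selection returns K of the original columns unchanged, so the kept features remain interpretable and the remaining columns need not be computed at inference time. Extraction builds K new features from all columns, so every input feature still has to be available for the projection.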

Authors (4)
  1. Vu-Duc Ngo (5 papers)
  2. Tuan-Cuong Vuong (3 papers)
  3. Thien Van Luong (12 papers)
  4. Hung Tran (48 papers)
Citations (16)
