Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices (2306.14263v2)

Published 25 Jun 2023 in cs.CR and cs.AI

Abstract: The field of NLP is currently undergoing a revolutionary transformation driven by the power of pre-trained LLMs based on groundbreaking Transformer architectures. As the frequency and diversity of cybersecurity attacks continue to rise, the importance of incident detection has significantly increased. IoT devices are expanding rapidly, resulting in a growing need for efficient techniques to autonomously identify network-based attacks in IoT networks with both high precision and minimal computational requirements. This paper presents SecurityBERT, a novel architecture that leverages the Bidirectional Encoder Representations from Transformers (BERT) model for cyber threat detection in IoT networks. During the training of SecurityBERT, we incorporated a novel privacy-preserving encoding technique called Privacy-Preserving Fixed-Length Encoding (PPFLE). We effectively represented network traffic data in a structured format by combining PPFLE with the Byte-level Byte-Pair Encoder (BBPE) Tokenizer. Our research demonstrates that SecurityBERT outperforms traditional Machine Learning (ML) and Deep Learning (DL) methods, such as Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), in cyber threat detection. Employing the Edge-IIoTset cybersecurity dataset, our experimental analysis shows that SecurityBERT achieved an impressive 98.2% overall accuracy in identifying fourteen distinct attack types, surpassing previous records set by hybrid solutions such as GAN-Transformer-based architectures and CNN-LSTM models. With an inference time of less than 0.15 seconds on an average CPU and a compact model size of just 16.7MB, SecurityBERT is ideally suited for real-life traffic analysis and a suitable choice for deployment on resource-constrained IoT devices.

Authors (7)
  1. Mohamed Amine Ferrag (34 papers)
  2. Mthandazo Ndhlovu (3 papers)
  3. Norbert Tihanyi (18 papers)
  4. Lucas C. Cordeiro (50 papers)
  5. Thierry Lestable (4 papers)
  6. Narinderjit Singh Thandi (2 papers)
  7. Merouane Debbah (269 papers)
Citations (35)

Summary

  • The paper presents a novel SecurityLLM architecture that integrates SecurityBERT for threat detection and FalconLLM for incident response.
  • Experimental evaluation on the Edge-IIoTset IoT/IIoT dataset shows 98.2% accuracy in detecting 14 attack types, surpassing traditional ML and deep learning models.
  • Innovative methods such as Privacy-Preserving Fixed-Length Encoding (PPFLE) and a Byte-level Byte-Pair Encoder (BBPE) tokenizer improve structured data processing, setting a new benchmark in cyber defense.

Revolutionizing Cyber Threat Detection with LLMs

Introduction

The perpetual evolution of cyber threats necessitates innovative approaches to threat detection and incident response. The advent of pre-trained LLMs, notably those built on the BERT architecture, marks a significant step forward for cybersecurity. This paper delineates the development and evaluation of SecurityLLM, a pre-trained LLM specifically devised for cyber threat detection and incident response.

The SecurityLLM Model

SecurityLLM combines two pivotal components: SecurityBERT, designed for cyber threat detection, and FalconLLM, a generative model aimed at incident response and recovery. This combination leverages the synergy between detection and response mechanisms to enhance the overall security posture.

SecurityBERT Model

The cornerstone of the SecurityLLM model, SecurityBERT, leverages the transformer architecture for the detection of cyber threats. By processing network traffic data encoded into a textual representation, SecurityBERT identifies a broad spectrum of attacks with remarkable efficiency. Notably, the Privacy-Preserving Fixed-Length Encoding (PPFLE) technique, combined with the Byte-level Byte-Pair Encoder (BBPE) tokenizer, significantly enhances the model's ability to handle structured network data, yielding a notable improvement in performance.
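The encoding pipeline can be sketched roughly as follows. This is a hypothetical illustration in the spirit of PPFLE, not the paper's exact algorithm: each raw feature value is hashed to a fixed-width token (so raw values are not exposed) and the tokens are concatenated into a uniform-length string, which a byte-level tokenizer would then split into raw bytes before applying its learned merges.

```python
import hashlib

def ppfle_encode(features, width=8):
    """Illustrative fixed-length, privacy-preserving encoding (sketch only):
    hash each feature value, truncate to a fixed width, and concatenate."""
    pieces = []
    for value in features:
        digest = hashlib.sha256(str(value).encode()).hexdigest()
        pieces.append(digest[:width])  # fixed-length, non-reversible token
    return "".join(pieces)

def byte_level_tokens(text):
    """Stand-in for a byte-level tokenizer: map the encoded string to its
    raw byte values, as a BBPE tokenizer would before merging pairs."""
    return list(text.encode("utf-8"))

record = ["192.168.1.5", 443, "TCP", 1500]  # one flattened traffic record
encoded = ppfle_encode(record)
print(len(encoded))                 # 4 features * 8 hex chars = 32
print(byte_level_tokens(encoded)[:4])
```

Because every record maps to the same-length string regardless of its raw field values, the downstream model sees uniformly shaped, anonymized inputs.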

FalconLLM Model

Building upon the detection capabilities of SecurityBERT, FalconLLM serves as the model's complementary incident response and recovery system. Trained on a massive corpus and comprising 40 billion parameters, FalconLLM is well suited to analyzing, interpreting, and suggesting mitigation strategies against identified threats. It extends SecurityLLM's functionality beyond detection alone, offering actionable insights for threat resolution.
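The handoff from detection to response can be sketched as a simple prompt template; the field names and wording below are illustrative assumptions, not taken from the paper:

```python
def build_incident_prompt(attack_type, source_ip, confidence):
    """Hypothetical prompt template showing how a classifier's detection
    result might be passed to an instruction-tuned LLM for mitigation
    advice. All field names here are illustrative."""
    return (
        "You are a security incident-response assistant.\n"
        f"Detected attack: {attack_type}\n"
        f"Source: {source_ip}\n"
        f"Classifier confidence: {confidence:.1%}\n"
        "Suggest containment and recovery steps."
    )

prompt = build_incident_prompt("DDoS_UDP", "10.0.0.7", 0.982)
print(prompt.splitlines()[1])  # Detected attack: DDoS_UDP
```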

Experimental Evaluation

The experimental evaluation of SecurityLLM used the Edge-IIoTset cybersecurity dataset, exposing the model to realistic attack scenarios. Through rigorous testing, SecurityLLM achieved an overall accuracy of 98.2% in detecting fourteen distinct attack types. Detailed comparisons with traditional ML methods and deep learning models, such as CNNs and RNNs, underscored its superior performance, affirming the transformative potential of LLMs in cybersecurity.
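The headline metric, overall accuracy across the fourteen classes, is simply the fraction of correctly classified samples; a minimal sketch with made-up labels:

```python
def overall_accuracy(y_true, y_pred):
    """Overall (micro) accuracy: fraction of samples whose predicted
    attack class matches the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Toy illustration; the class names are placeholders, not the
# dataset's exact label strings.
y_true = ["Normal", "DDoS_UDP", "SQL_Injection", "Normal", "DDoS_UDP"]
y_pred = ["Normal", "DDoS_UDP", "Normal",        "Normal", "DDoS_UDP"]
print(overall_accuracy(y_true, y_pred))  # 0.8
```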

Future Directions

The promising results of SecurityLLM signal a fertile ground for further exploration and advancement in the application of LLMs within cybersecurity. Future studies could explore expanding the model’s capabilities to encompass a wider array of attack types and more complex threat scenarios. Additionally, continuous model refinements, updated with the latest threat intelligence data, would ensure its sustained effectiveness.

Conclusion

SecurityLLM represents a novel intersection of LLMs and cybersecurity, offering an effective solution for cyber threat detection and incident response. By harnessing SecurityBERT and FalconLLM, the model sets a new benchmark in the cybersecurity domain, promising stronger defenses against an ever-evolving landscape of cyber threats. The potential for further improvement in this space remains vast, signaling a promising trajectory for LLMs in safeguarding digital assets.
