SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence (2405.03446v2)
Abstract: The increasing complexity and frequency of cybersecurity incidents is highlighted by recent cybersecurity threat reports citing over 10 billion instances. Cyber threat intelligence (CTI) plays a critical role in this landscape by offering the insights required to understand and combat constantly evolving cyber threats. Inspired by the strong capability of LLMs in handling complex tasks, we introduce SEvenLLM, a framework to benchmark, elicit, and improve the cybersecurity incident analysis and response abilities of LLMs for Security Events. Specifically, we create a high-quality bilingual instruction corpus by crawling raw text from cybersecurity websites, which addresses the lack of effective data for information extraction. We then design a pipeline that automatically selects tasks from a task pool and converts the raw text into a supervised corpus of question-response pairs. The resulting instruction dataset, SEvenLLM-Instruct, is used to train cybersecurity LLMs with a multi-task learning objective (27 well-designed tasks) to strengthen the analysis of cybersecurity events. Extensive experiments on our curated benchmark (SEvenLLM-bench) demonstrate that SEvenLLM performs more sophisticated threat analysis and fortifies defenses against the evolving landscape of cyber threats.
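The step of selecting tasks from a task pool and turning raw text into question-response records can be pictured with a minimal Python sketch. Everything named here is an assumption for illustration: the three task names and prompts in `TASK_POOL`, the `InstructionSample` fields, and the `answer_fn` callback are hypothetical and are not taken from the paper. Random sampling stands in for the automatic task selection, whose actual mechanism the abstract does not specify.

```python
import random
from dataclasses import dataclass

# Hypothetical task pool; the paper's 27 tasks are not listed in the abstract.
TASK_POOL = {
    "attack_summary": "Summarize the security incident described in the report.",
    "key_entity_recognition": "List the key entities named in the report.",
    "malware_feature_extraction": "Extract the malware characteristics described in the report.",
}


@dataclass
class InstructionSample:
    task: str         # name of the selected task
    instruction: str  # the question posed to the model
    input: str        # raw cybersecurity text
    output: str       # supervised response


def select_tasks(raw_text, k, rng):
    # Stand-in for automatic task selection: a random draw of k task names.
    # raw_text is unused here; a real selector would condition on it.
    return rng.sample(sorted(TASK_POOL), k)


def build_samples(raw_text, answer_fn, k=1, seed=0):
    # answer_fn(task, raw_text) is an assumed callback that produces the
    # response text, e.g. a human annotator or a stronger model.
    rng = random.Random(seed)
    return [
        InstructionSample(task, TASK_POOL[task], raw_text, answer_fn(task, raw_text))
        for task in select_tasks(raw_text, k, rng)
    ]
```

Calling `build_samples(report_text, answer_fn, k=2)` yields two records that share the same raw text but carry different instructions, which is the shape a multi-task instruction corpus needs.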
- Hangyuan Ji
- Jian Yang
- Linzheng Chai
- Chaoren Wei
- Liqun Yang
- Yunlong Duan
- Yunli Wang
- Tianzhen Sun
- Hongcheng Guo
- Tongliang Li
- Changyu Ren
- Zhoujun Li