LOCALINTEL: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge
Abstract: Security Operations Center (SoC) analysts gather threat reports from openly accessible global threat repositories and tailor the information to their organization's needs, such as developing threat intelligence and security policies. They also depend on internal organizational repositories, which act as a private local knowledge database. These local knowledge databases store credible cyber intelligence and critical operational and infrastructure details. SoC analysts currently perform the manual, labor-intensive task of combining these global threat repositories and local knowledge databases to create both organization-specific threat intelligence and mitigation policies. Recently, large language models (LLMs) have shown the capability to process diverse knowledge sources efficiently. We leverage this ability to automate the generation of organization-specific threat intelligence. We present LocalIntel, a novel automated threat intelligence contextualization framework that retrieves zero-day vulnerability reports from global threat repositories and uses its local knowledge database to determine implications and mitigation strategies to alert and assist the SoC analyst. LocalIntel comprises two key phases: knowledge retrieval and contextualization. Quantitative and qualitative assessments show that it generates organizational threat intelligence with up to 93% accuracy, at 64% inter-rater agreement.
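The two phases named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how retrieval or prompting is done, so simple token overlap stands in for the retrieval step, the LLM call is omitted entirely, and every name here (`ThreatReport`, `retrieve_local_knowledge`, `build_contextualization_prompt`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThreatReport:
    """A global threat report (hypothetical structure, not from the paper)."""
    identifier: str
    description: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def retrieve_local_knowledge(
    report: ThreatReport, local_docs: list[str], top_k: int = 2
) -> list[str]:
    """Phase 1 (knowledge retrieval): rank local documents by token overlap.

    Assumption: token overlap is a stand-in for whatever retrieval
    method the framework actually uses.
    """
    report_tokens = tokenize(report.description)
    scored = [(len(report_tokens & tokenize(doc)), doc) for doc in local_docs]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:top_k]]


def build_contextualization_prompt(
    report: ThreatReport, local_context: list[str]
) -> str:
    """Phase 2 (contextualization): assemble the text an LLM would receive."""
    if local_context:
        context = "\n".join(f"- {doc}" for doc in local_context)
    else:
        context = "- (no relevant local knowledge found)"
    return (
        f"Global threat report {report.identifier}:\n"
        f"{report.description}\n\n"
        f"Local organizational knowledge:\n{context}\n\n"
        "Describe the implications for this organization and suggest mitigations."
    )
```

In such a sketch, an analyst-facing tool would pass the returned prompt to a language model of its choice; keeping retrieval and prompt construction as separate functions mirrors the two-phase split described in the abstract.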