
LOCALINTEL: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge (2401.10036v1)

Published 18 Jan 2024 in cs.CR, cs.AI, cs.IR, and cs.LO

Abstract: Security Operations Center (SoC) analysts gather threat reports from openly accessible global threat databases and customize them manually to suit a particular organization's needs. These analysts also depend on internal repositories, which act as a private local knowledge database for an organization. Credible cyber intelligence, critical operational details, and relevant organizational information are all stored in these local knowledge databases. Analysts undertake the labor-intensive task of using these global and local knowledge databases to manually create an organization's unique threat response and mitigation strategies. Recently, LLMs have shown the capability to efficiently process large, diverse knowledge sources. We leverage this ability to process global and local knowledge databases to automate the generation of organization-specific threat intelligence. In this work, we present LOCALINTEL, a novel automated knowledge contextualization system that, upon prompting, retrieves threat reports from the global threat repositories and uses its local knowledge database to contextualize them for a specific organization. LOCALINTEL comprises three key phases: global threat intelligence retrieval, local knowledge retrieval, and contextualized completion generation. The first phase retrieves intelligence from global threat repositories, while the second retrieves pertinent knowledge from the local knowledge database. Finally, a generator fuses these knowledge sources to produce a contextualized completion.
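The three phases named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Document` type, the token-overlap `retrieve` function, the template-based `generate` function, and the sample data are all hypothetical stand-ins. A real system would presumably use a proper retriever and an LLM as the generator, and conditioning the local lookup on the retrieved global reports is an assumption made here for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def retrieve(query: str, corpus: list[Document], top_k: int = 1) -> list[Document]:
    """Rank documents by token overlap with the query.

    Hypothetical stand-in for a real retriever (for example, dense embeddings).
    Documents with no overlapping tokens are dropped.
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc.text)), doc) for doc in corpus]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def generate(prompt: str, global_docs: list[Document], local_docs: list[Document]) -> str:
    """Template stand-in for the generator; an LLM call would go here."""
    lines = [f"Prompt: {prompt}", "Global threat intelligence:"]
    lines += [f"  [{doc.doc_id}] {doc.text}" for doc in global_docs]
    lines.append("Local organizational context:")
    lines += [f"  [{doc.doc_id}] {doc.text}" for doc in local_docs]
    return "\n".join(lines)


def localintel_sketch(prompt: str, global_repo: list[Document], local_kb: list[Document]) -> str:
    # Phase 1: global threat intelligence retrieval.
    global_docs = retrieve(prompt, global_repo)
    # Phase 2: local knowledge retrieval (assumption: the query is the prompt
    # plus the text of the retrieved global reports).
    local_query = " ".join([prompt] + [doc.text for doc in global_docs])
    local_docs = retrieve(local_query, local_kb)
    # Phase 3: contextualized completion generation.
    return generate(prompt, global_docs, local_docs)


# Made-up sample data, for illustration only.
GLOBAL_REPO = [
    Document("G1", "Ransomware campaign exploits unpatched VPN appliances"),
    Document("G2", "Phishing kit targets cloud email logins"),
]
LOCAL_KB = [
    Document("L1", "Our VPN appliances are patched monthly by the network team"),
    Document("L2", "Cafeteria menu rotates weekly"),
]
prompt = "Is our organization exposed to the VPN ransomware campaign?"

if __name__ == "__main__":
    print(localintel_sketch(prompt, GLOBAL_REPO, LOCAL_KB))
```

In this toy setup the global step selects the ransomware report and the local step selects the patching note, so the final output pairs the external threat with the organization-specific context that an analyst would otherwise look up by hand.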

