KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection

Published 4 Mar 2024 in cs.CR, cs.AI, cs.CL, and cs.LG | (2403.02253v2)

Abstract: Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that they rely on a manually constructed brand knowledge base, making it infeasible to scale to a large number of brands, which results in false negative errors due to the insufficient brand coverage of the knowledge base. To address this issue, we propose an automated knowledge collection pipeline, using which we collect a large-scale multimodal brand knowledge base, KnowPhish, containing 20k brands with rich information about each brand. KnowPhish can be used to boost the performance of existing RBPDs in a plug-and-play manner. A second limitation of existing RBPDs is that they solely rely on the image modality, ignoring useful textual information present in the webpage HTML. To utilize this textual information, we propose an LLM-based approach to extract brand information of webpages from text. Our resulting multimodal phishing detection approach, KnowPhish Detector (KPD), can detect phishing webpages with or without logos. We evaluate KnowPhish and KPD on a manually validated dataset, and a field study under Singapore's local context, showing substantial improvements in effectiveness and efficiency compared to state-of-the-art baselines.

Summary

  • The paper presents an automated multimodal approach that combines LLMs with a large-scale brand knowledge graph to improve reference-based phishing detection.
  • It describes the KnowPhish Detector (KPD), which analyzes both the visual and the textual content of a webpage, improving recall, precision, and runtime efficiency.
  • Evaluations on a manually validated dataset and in a field study in Singapore show substantial gains over state-of-the-art baselines.

KnowPhish: Enhancing Phishing Detection through Multimodal Knowledge Graphs

Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, are the state of the art in automated phishing detection. Existing methods have two significant limitations: they depend on manually curated brand knowledge bases (BKBs), which restricts brand coverage, and they rely solely on the image modality. The paper "KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection" addresses both issues. KnowPhish is a large-scale multimodal BKB with rich information about each brand, and it expands the range of brands that RBPDs can recognize when detecting phishing webpages.

Automated Pipeline for Brand Knowledge Collection

KnowPhish is built by an automated knowledge collection pipeline that compiles a BKB of approximately 20,000 brands. The pipeline is motivated by the empirical observation that phishing targets persistently belong to high-value industries, a trend supported by historical data analysis (Figure 1), so the set of likely targets stays relatively stable over time. KnowPhish draws on resources such as Wikidata, using categorical relationships such as the instance_of attribute to enumerate potential phishing targets: all brands in narrow high-value categories, and only popular brands in general categories (Figure 2).

Figure 2: An overview of our automated pipeline for constructing our large scale multimodal BKB, KnowPhish. We first collect (a) all brands from certain high-value industries, and (b) only popular brands from general categories. Then, the knowledge acquisition and augmentation steps collect logos, domains, and aliases for these brands.
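The two selection rules in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Brand` record, the category sets, the `popularity` score, and the threshold are all hypothetical stand-ins, and the knowledge acquisition step that fills in logos, domains, and aliases is omitted.

```python
from dataclasses import dataclass, field

# Hypothetical category sets; the paper's actual category lists are not given here.
HIGH_VALUE_CATEGORIES = {"bank", "payment service"}
GENERAL_CATEGORIES = {"business", "website"}


@dataclass
class Brand:
    name: str
    category: str        # stand-in for a Wikidata instance_of value
    popularity: int      # hypothetical score, e.g. derived from a site ranking
    aliases: list = field(default_factory=list)
    domains: list = field(default_factory=list)
    logos: list = field(default_factory=list)


def select_brands(candidates, min_popularity):
    """Keep (a) every brand in a high-value category and
    (b) only sufficiently popular brands in a general category."""
    selected = []
    for brand in candidates:
        if brand.category in HIGH_VALUE_CATEGORIES:
            selected.append(brand)
        elif (brand.category in GENERAL_CATEGORIES
              and brand.popularity >= min_popularity):
            selected.append(brand)
    return selected


if __name__ == "__main__":
    # Made-up candidates for illustration only.
    candidates = [
        Brand("ExampleBank", "bank", popularity=3),
        Brand("TinyShop", "business", popularity=5),
        Brand("BigShop", "business", popularity=900),
    ]
    for brand in select_brands(candidates, min_popularity=100):
        print(brand.name)
```

Keeping every brand in a narrow category while thresholding the broad ones is what lets such a knowledge base stay large without filling up with brands that are unlikely to be targeted.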

KnowPhish Detector: A Multimodal Approach

The KnowPhish Detector (KPD) extends phishing detection beyond the visual domain. It combines visual and textual modalities, using an LLM-based approach to extract brand information from webpage text. This substantially improves detection, in particular for phishing webpages that contain no logos (Figure 3).

Figure 4: An overview of our phishing detector KPD.

KnowPhish integrates with existing RBPDs in a plug-and-play manner, and KPD benefits from the alias mappings and logo variants that KnowPhish provides. The detector's multi-stage analysis combines visual logo extraction with textual brand inference, which widens coverage and improves the precision of phishing detection.
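The general reference-based decision rule can be sketched in Python as follows. This is a simplified illustration rather than KPD itself: the logo matcher and the LLM are replaced by already-extracted brand mentions passed in as strings, and `BrandEntry`, `resolve_brand`, and `is_phishing` are hypothetical names. The sketch flags a page when it presents a known brand but is hosted on a domain that the knowledge base does not list for that brand.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class BrandEntry:
    name: str
    aliases: set   # alternative names for the brand
    domains: set   # legitimate domains registered to the brand


def resolve_brand(mention, knowledge_base):
    """Map a brand mention (from a logo match or from text) to a KB entry via name or alias."""
    key = mention.strip().lower()
    for entry in knowledge_base:
        names = {entry.name.lower()} | {a.lower() for a in entry.aliases}
        if key in names:
            return entry
    return None


def is_phishing(url, logo_brand, text_brand, knowledge_base):
    """Prefer the logo-derived brand; fall back to the text-derived one (logo-less pages)."""
    mention = logo_brand or text_brand
    if mention is None:
        return False  # no brand identified, nothing to compare against
    entry = resolve_brand(mention, knowledge_base)
    if entry is None:
        return False  # brand not covered by the knowledge base
    host = (urlparse(url).hostname or "").lower()
    legitimate = any(host == d or host.endswith("." + d) for d in entry.domains)
    return not legitimate


if __name__ == "__main__":
    # Fictitious brand and domains, for illustration only.
    kb = [BrandEntry("ExampleBank", {"Example Bank Ltd"}, {"examplebank.test"})]
    print(is_phishing("http://login.attacker.test/bank", None, "Example Bank Ltd", kb))
    print(is_phishing("https://www.examplebank.test/signin", "ExampleBank", None, kb))
```

The `return False` branch for brands missing from the knowledge base is where insufficient brand coverage turns into false negatives, which is the limitation a larger BKB is meant to reduce.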

Evaluation of Performance

The empirical evaluations show that KnowPhish and KPD outperform existing RBPDs. Accuracy, F1-score, and recall improve substantially, and KnowPhish also improves runtime efficiency because brand knowledge is compiled ahead of time. KPD in particular achieves the highest recall and precision among all tested configurations and detects a larger number of phishing pages (Figure 5).

Figure 5: Top 20 phishing targets detected by KPD+KnowPhish and Phishpedia+DynaPhish on SG-SCAN.
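For reference, the precision, recall, and F1 metrics named in the evaluation follow from confusion-matrix counts as in the short Python sketch below. The function name and the counts in the usage example are made up for illustration and are not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true positive, false positive,
    and false negative counts, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative counts only: 8 phishing pages caught, 2 benign pages
    # wrongly flagged, 8 phishing pages missed.
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

A missed phishing page counts as a false negative, so widening the brand coverage of the knowledge base mainly acts on recall.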

Practical Implications and Future Directions

KnowPhish contributes a scalable, multimodal knowledge graph that can be dynamically updated, which helps RBPDs remain adaptable and robust as phishing threats evolve. Future work could integrate additional brand databases and knowledge-augmented LLMs to further improve KnowPhish's effectiveness and broaden its applicability across detection systems.

In conclusion, integrating KnowPhish with existing RBPDs improves detection performance and addresses the limitations of manually built knowledge bases and image-only detection. Adding textual brand extraction alongside logo matching extends both the coverage and the precision of phishing detection systems.
