Mapping LLM Security Landscapes: A Comprehensive Stakeholder Risk Assessment Proposal (2403.13309v1)
Abstract: The rapid integration of LLMs across diverse sectors has marked a transformative era, showcasing remarkable capabilities in text generation and problem-solving tasks. However, this technological advancement is accompanied by significant risks and vulnerabilities. Despite ongoing security enhancements, attackers persistently exploit these weaknesses, casting doubt on the overall trustworthiness of LLMs. Compounding the issue, organisations are deploying LLM-integrated systems without understanding the severity of the potential consequences. Existing studies by OWASP and MITRE offer a general overview of threats and vulnerabilities but lack a method for directly and succinctly analysing the risks facing the security practitioners, developers, and key decision-makers who work with this novel technology. To address this gap, we propose a risk assessment process built on established tools such as the OWASP risk rating methodology, which is widely used for traditional systems. We conduct scenario analysis to identify potential threat agents and map the dependent system components against vulnerability factors; from this analysis we assess the likelihood of a cyberattack. We then conduct a thorough impact analysis to derive a comprehensive threat matrix, and we map threats against three key stakeholder groups: developers engaged in model fine-tuning, application developers utilising third-party APIs, and end users. The resulting threat matrix provides a holistic evaluation of LLM-related risks, enabling stakeholders to make informed decisions about effective mitigation strategies. The outlined process serves as an actionable and comprehensive tool for security practitioners, offering insights for resource management and enhancing overall system security.
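As a concrete illustration of the scoring step the abstract describes, the sketch below applies the public OWASP Risk Rating Methodology (the standard 0-9 factor scale and 3x3 severity matrix) to a hypothetical prompt-injection scenario against an LLM-integrated application. The factor names and band thresholds follow OWASP's published scheme; the scenario, the individual factor values, and the stakeholder exposure mapping are illustrative assumptions, not figures from the paper.

```python
"""Minimal sketch of an OWASP-style risk rating for one LLM threat scenario.

Factor names and the 0-9 scale follow the OWASP Risk Rating Methodology;
the example values below are illustrative assumptions, not the paper's data.
"""
from statistics import mean

# Likelihood: four threat-agent factors + four vulnerability factors (0-9).
LIKELIHOOD_FACTORS = {
    "skill_level": 6,          # threat agent: some technical skill required
    "motive": 7,               # threat agent: high reward (e.g. data theft)
    "opportunity": 9,          # threat agent: public chat interface
    "size": 9,                 # threat agent: anonymous internet users
    "ease_of_discovery": 7,    # vulnerability: prompt injection is well known
    "ease_of_exploit": 8,      # vulnerability: crafted prompts, no tooling
    "awareness": 6,            # vulnerability: listed in the OWASP LLM Top 10
    "intrusion_detection": 8,  # vulnerability: prompts rarely logged/reviewed
}

# Impact: four technical factors + four business factors (0-9).
IMPACT_FACTORS = {
    "loss_of_confidentiality": 7,
    "loss_of_integrity": 5,
    "loss_of_availability": 3,
    "loss_of_accountability": 7,
    "financial_damage": 5,
    "reputation_damage": 7,
    "non_compliance": 5,
    "privacy_violation": 7,
}


def level(score: float) -> str:
    """Map a 0-9 factor average onto the OWASP LOW/MEDIUM/HIGH bands."""
    if score < 3:
        return "LOW"
    if score < 6:
        return "MEDIUM"
    return "HIGH"


def overall_severity(likelihood: str, impact: str) -> str:
    """Combine likelihood and impact bands via the OWASP 3x3 severity matrix."""
    matrix = {
        ("LOW", "LOW"): "NOTE",         ("LOW", "MEDIUM"): "LOW",
        ("LOW", "HIGH"): "MEDIUM",      ("MEDIUM", "LOW"): "LOW",
        ("MEDIUM", "MEDIUM"): "MEDIUM", ("MEDIUM", "HIGH"): "HIGH",
        ("HIGH", "LOW"): "MEDIUM",      ("HIGH", "MEDIUM"): "HIGH",
        ("HIGH", "HIGH"): "CRITICAL",
    }
    return matrix[(likelihood, impact)]


likelihood = level(mean(LIKELIHOOD_FACTORS.values()))  # 7.5 -> HIGH
impact = level(mean(IMPACT_FACTORS.values()))          # 5.75 -> MEDIUM
print(f"Likelihood: {likelihood}, Impact: {impact}, "
      f"Severity: {overall_severity(likelihood, impact)}")

# Hypothetical exposure of this one threat for the paper's three stakeholder
# groups (illustrative labels only, not the paper's threat matrix).
STAKEHOLDER_EXPOSURE = {
    "model developers (fine-tuning)": "MEDIUM",
    "application developers (third-party APIs)": "HIGH",
    "end users": "HIGH",
}
for group, exposure in STAKEHOLDER_EXPOSURE.items():
    print(f"{group}: {exposure}")
```

In the OWASP scheme, likelihood is the average of the eight threat-agent and vulnerability factors, impact is the average of the eight technical and business factors, and the two resulting bands are combined in a fixed 3x3 matrix; the paper's threat matrix additionally breaks such scores out per threat and per stakeholder group.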
- Communications Security Establishment and Royal Canadian Mounted Police “Harmonized Threat and Risk Assessment (TRA) Methodology”, 2007 URL: https://publications.gc.ca/site/eng/9.845156/publication.html
- NIST “NIST SP 800-30 Rev. 1: Guide for Conducting Risk Assessments”, 2012 URL: https://csrc.nist.gov/pubs/sp/800/30/r1/final
- NIST “NIST Risk Management Framework (RMF)”, 2016 URL: https://csrc.nist.gov/projects/risk-management/about-rmf
- Paul F. Christiano et al. “Deep reinforcement learning from human preferences” In Advances in Neural Information Processing Systems 30, 2017
- Ashish Vaswani et al. “Attention is all you need” In Advances in Neural Information Processing Systems 30, 2017
- Gary McGraw et al. “An architectural risk analysis of machine learning systems: Toward more secure machine learning” Berryville Institute of Machine Learning, Clarke County, VA, 2020
- Carl Wilhjelm and Awad A Younis “A threat analysis methodology for security requirements elicitation in machine learning based systems” In 2020 IEEE 20th International Conference on Software Quality, Reliability and Security Companion (QRS-C), 2020, pp. 426–433 IEEE
- Edward J. Hu et al. “LoRA: Low-rank adaptation of large language models” In arXiv preprint arXiv:2106.09685, 2021
- Yuntao Bai et al. “Training a helpful and harmless assistant with reinforcement learning from human feedback” In arXiv preprint arXiv:2204.05862, 2022
- “Modeling threats to AI-ML systems using STRIDE” In Sensors 22.17, MDPI, 2022, pp. 6662
- Long Ouyang et al. “Training language models to follow instructions with human feedback” In Advances in Neural Information Processing Systems 35, 2022, pp. 27730–27744
- “Attacks on ML Systems: From Security Analysis to Attack Mitigation” In International Conference on Information Systems Security, 2022, pp. 119–138, Springer
- “Jailbreaker: Automated jailbreak across multiple large language model chatbots” In arXiv preprint arXiv:2307.08715, 2023
- “Bias and fairness in large language models: A survey” In arXiv preprint arXiv:2309.00770, 2023
- Kai Greshake et al. “More than you’ve asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models” In arXiv preprint arXiv:2302.12173, 2023
- “Catastrophic jailbreak of open-source LLMs via exploiting generation” In arXiv preprint arXiv:2310.06987, 2023
- Albert Q. Jiang et al. “Mistral 7B” In arXiv preprint arXiv:2310.06825, 2023
- “Multi-step jailbreaking privacy attacks on ChatGPT” In arXiv preprint arXiv:2304.05197, 2023
- MITRE “ATLAS Machine Learning Threat Matrix”, 2023 URL: https://atlas.mitre.org/matrices/ATLAS/
- OWASP “OWASP Top 10 for Large Language Model Applications”, 2023 URL: https://owasp.org/www-project-top-10-for-large-language-model-applications/
- Siladitya Ray “Samsung Bans ChatGPT Among Employees After Sensitive Code Leak” In Forbes, 2023 URL: https://www.forbes.com/sites/siladityaray/2023/05/02/samsung-bans-chatgpt-and-other-chatbots-for-employees-after-sensitive-code-leak/?sh=3a06ed686078
- Schiffer “Amazon’s Q has ‘severe hallucinations’ and leaks confidential data in public preview, employees warn” In Platformer, 2023 URL: https://www.platformer.news/amazons-q-has-severe-hallucinations/
- Ludwig-Ferdinand Stumpp “Achieving Code Execution in MathGPT via Prompt Injection” In MITRE ATLAS, 2023 URL: https://atlas.mitre.org/studies/AML.CS0016/
- Rohan Taori et al. “Alpaca: A strong, replicable instruction-following model” In Stanford Center for Research on Foundation Models, 2023 URL: https://crfm.stanford.edu/2023/03/13/alpaca.html
- Hugo Touvron et al. “Llama 2: Open foundation and fine-tuned chat models” In arXiv preprint arXiv:2307.09288, 2023
- “Fundamental limitations of alignment in large language models” In arXiv preprint arXiv:2304.11082, 2023
- “Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models” In arXiv preprint arXiv:2305.14710, 2023
- “Siren’s song in the AI ocean: A survey on hallucination in large language models” In arXiv preprint arXiv:2309.01219, 2023
- “Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems” In arXiv preprint arXiv:2401.05778, 2024
- Sayash Kapoor, Rishi Bommasani, et al. “On the Societal Impact of Open Foundation Models” In Stanford CRFM, 2024 URL: https://crfm.stanford.edu/open-fms/
- ENISA “ENISA Risk Assessment” URL: https://www.enisa.europa.eu/topics/risk-management/current-risk/risk-management-inventory/rm-process/risk-assessment
- ISO “ISO/IEC 27001:2022 Information security management systems” URL: https://www.iso.org/standard/27001
- OWASP “OWASP Risk Rating Methodology” URL: https://owasp.org/www-community/OWASP_Risk_Rating_Methodology
Authors: Rahul Pankajakshan, Sumitra Biswal, Yuvaraj Govindarajulu, Gilad Gressel