
Large Language Models for Blockchain Security: A Systematic Literature Review (2403.14280v4)

Published 21 Mar 2024 in cs.CR

Abstract: LLMs have emerged as powerful tools across various domains within cyber security. Notably, recent studies are increasingly exploring LLMs applied to the context of blockchain security (BS). However, there remains a gap in a comprehensive understanding regarding the full scope of applications, impacts, and potential constraints of LLMs on blockchain security. To fill this gap, we undertake a literature review focusing on the studies that apply LLMs in blockchain security (LLM4BS). Our study aims to comprehensively analyze and understand existing research, and elucidate how LLMs contribute to enhancing the security of blockchain systems. Through a thorough examination of existing literature, we delve into the integration of LLMs into various aspects of blockchain security. We explore the mechanisms through which LLMs can bolster blockchain security, including their applications in smart contract auditing, transaction anomaly detection, vulnerability repair, program analysis of smart contracts, and serving as participants in the cryptocurrency community. Furthermore, we assess the challenges and limitations associated with leveraging LLMs for enhancing blockchain security, considering factors such as scalability, privacy concerns, and ethical concerns. Our thorough review sheds light on the opportunities and potential risks of tasks on LLM4BS, providing valuable insights for researchers, practitioners, and policymakers alike.

LLMs for Enhancing Blockchain Security: A Comprehensive Survey

Introduction

The confluence of artificial intelligence, specifically LLMs, with blockchain technology has unveiled a new frontier in enhancing blockchain security. The systematic literature review conducted in this domain marks a significant stride in understanding the current state of research and the implications of deploying LLMs to fortify blockchain systems against an array of cyber threats. This post examines the salient points of the survey, explores the applications of LLMs in blockchain security, discusses inherent challenges and limitations, and speculates on future research directions.

Overview of LLMs in Blockchain Security (LLM4BS)

Applications of LLMs in Blockchain Security

The implementation of LLMs in blockchain security spans several critical areas, including but not limited to:

  • Smart Contract Auditing: LLMs aid in identifying vulnerabilities within smart contracts by understanding code context and logic beyond conventional pattern recognition, offering a nuanced security analysis that traditional tools may overlook.
  • Anomaly Detection in Transactions: Through real-time monitoring and the analysis of transaction data, LLMs provide dynamic capabilities to identify and flag suspicious activities, adapting to new patterns of fraudulent transactions.
  • Fuzzing for Smart Contract Vulnerabilities: Leveraging LLMs to guide fuzzing processes enables a more focused and efficient search for vulnerabilities within smart contracts, significantly enhancing the depth and accuracy of security audits.

These applications demonstrate LLMs' capacity to process complex data sets and understand intricate patterns, making them invaluable tools in addressing the multifaceted aspects of blockchain security.
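To make the auditing application above concrete, the sketch below assembles contract source into a structured audit prompt for a general-purpose LLM. The prompt template, vulnerability checklist, and `build_audit_prompt` helper are illustrative assumptions for this post, not the prompts or APIs used by any of the surveyed systems.

```python
# Sketch: assembling a smart-contract audit prompt for an LLM.
# The checklist and template are illustrative assumptions, not the
# prompt used by any surveyed tool.

AUDIT_CHECKLIST = [
    "reentrancy",
    "integer overflow/underflow",
    "unchecked external calls",
    "access-control flaws",
]

def build_audit_prompt(contract_source: str) -> str:
    """Wrap contract source in an instruction asking the model to audit it."""
    checklist = "\n".join(f"- {item}" for item in AUDIT_CHECKLIST)
    return (
        "You are a smart contract security auditor.\n"
        "Review the Solidity contract below for these vulnerability classes:\n"
        f"{checklist}\n"
        "Report each finding with the affected function and a severity.\n\n"
        f"```solidity\n{contract_source}\n```"
    )

if __name__ == "__main__":
    toy_contract = """
    contract Vault {
        mapping(address => uint) balances;
        function withdraw() public {
            (bool ok,) = msg.sender.call{value: balances[msg.sender]}("");
            require(ok);
            balances[msg.sender] = 0;  // state update after external call
        }
    }
    """
    print(build_audit_prompt(toy_contract))
```

The point of the structured checklist is that, as several surveyed works note, LLM audits benefit from being steered toward named vulnerability classes rather than asked open-endedly whether code "looks safe."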

Addressing Common Blockchain Security Threats

Integrating LLMs can strengthen key components of blockchain security, such as cryptography, consensus algorithms, and decentralization principles. This integration aids in:

  • Mitigating Consensus-Based Attacks: LLMs can identify and address vulnerabilities in consensus mechanisms, offering solutions to prevent 51% attacks and other consensus disruptions.
  • Preventing Smart Contract Exploits: By analyzing smart contract code in depth, LLMs play a critical role in detecting and mitigating nuanced vulnerabilities, thus safeguarding against potential exploits.
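One way such exploit screening might be staged is to run a cheap pattern pre-filter first and forward only flagged contracts for the more expensive LLM analysis. The pattern list below is a deliberately simplified illustration, not a complete detector and not the filtering used by any surveyed system.

```python
import re

# Illustrative pre-filter: flag Solidity source containing constructs
# often associated with exploitable contracts, so only flagged contracts
# are sent on for (costlier) LLM-based review.
# The pattern list is a simplified assumption, not a complete detector.
RISKY_PATTERNS = {
    "low-level call": re.compile(r"\.call\{?"),
    "delegatecall": re.compile(r"\.delegatecall\("),
    "tx.origin auth": re.compile(r"tx\.origin"),
    "selfdestruct": re.compile(r"selfdestruct\("),
}

def prefilter(source: str) -> list[str]:
    """Return the names of risky patterns found in the contract source."""
    return [name for name, pat in RISKY_PATTERNS.items() if pat.search(source)]

if __name__ == "__main__":
    src = "function kill() public { selfdestruct(payable(msg.sender)); }"
    print(prefilter(src))  # -> ['selfdestruct']
```

A two-stage design like this trades recall for cost: the regex stage is fast but shallow, while the LLM stage can reason about code context and logic, which is exactly the complementarity the surveyed auditing work emphasizes.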

Taxonomy of LLM4BS Tasks

The survey categorizes LLM applications into distinct tasks, including code auditing for smart contracts, analyzing abnormal transactions, and assisting in the smart contract development lifecycle. This taxonomy facilitates a structured approach to understanding how LLMs contribute to various facets of blockchain security.
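The taxonomy can be rendered as a small lookup structure. The category names below follow the survey's summary; filing the highlighted case-study systems under particular categories is this sketch's own assumption.

```python
# Illustrative rendering of the survey's LLM4BS task taxonomy.
# Category names follow the survey; placing the highlighted case-study
# systems under specific categories is this sketch's assumption.
LLM4BS_TAXONOMY = {
    "smart contract code auditing": ["SMARTINV", "LLM4Fuzz"],
    "abnormal transaction analysis": ["BLOCKGPT"],
    "development lifecycle assistance": [],  # no case study named in this summary
}

def tasks_for(system: str) -> list[str]:
    """Look up which taxonomy categories a given system is filed under."""
    return [task for task, systems in LLM4BS_TAXONOMY.items() if system in systems]

if __name__ == "__main__":
    print(tasks_for("BLOCKGPT"))  # -> ['abnormal transaction analysis']
```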

Case Studies

Highlighted case studies such as LLM4Fuzz, SMARTINV, and BLOCKGPT offer insights into practical implementations of LLMs in enhancing blockchain security. These examples underscore the advanced capabilities of LLMs in detecting vulnerabilities, guiding security audits, and ensuring the integrity of blockchain transactions.

Future Directions and Challenges

Future research in LLM4BS is poised to tackle several critical areas, including:

  • Interdisciplinary Research: Encouraging collaboration across the fields of AI, cybersecurity, and blockchain technology to develop comprehensive security solutions.
  • Regulatory Compliance: Navigating the evolving landscape of regulatory requirements and ensuring LLM applications adhere to legal and ethical standards.
  • Adapting to Dynamic Security Threats: Enhancing the adaptability of LLMs to counteract novel and sophisticated cyber threats effectively.
  • Energy and Sustainability: Addressing the environmental impact of deploying energy-intensive LLMs in blockchain security operations.

The roadmap for LLM4BS emphasizes the importance of addressing these challenges to leverage the full potential of LLMs in securing blockchain systems.

Conclusion

The integration of LLMs into blockchain security represents a pivotal advancement in protecting digital infrastructures. While promising, this integration necessitates ongoing research, ethical considerations, and a commitment to innovation and collaboration. By navigating the outlined challenges and harnessing the power of LLMs, the future of blockchain security appears robust, adaptive, and capable of withstanding the complexities of modern cyber threats. The journey of exploring LLM4BS is just beginning, with much ground to cover in understanding, refining, and implementing these models to create secure, efficient, and trustworthy blockchain systems.

References (97)
  1. Chatgpt for good? on opportunities and challenges of large language models for education, Learning and individual differences 103 (2023) 102274.
  2. Not what you’ve signed up for: Compromising real-world llm-integrated applications with indirect prompt injection, in: Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security, 2023, pp. 79–90.
  3. Hugginggpt: Solving ai tasks with chatgpt and its friends in hugging face, Advances in Neural Information Processing Systems 36 (2024).
  4. A survey on large language model (llm) security and privacy: The good, the bad, and the ugly, High-Confidence Computing (2024) 100211.
  5. A survey on evaluation of large language models, ACM Transactions on Intelligent Systems and Technology (2023).
  6. Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation, Advances in Neural Information Processing Systems 36 (2024).
  7. A brief overview of chatgpt: The history, status quo and potential future development, IEEE/CAA Journal of Automatica Sinica 10 (2023) 1122–1136.
  8. The scope of chatgpt in software engineering: A thorough investigation, arXiv preprint arXiv:2305.12138 (2023).
  9. Large language model-powered smart contract vulnerability detection: New perspectives, arXiv preprint arXiv:2310.01152 (2023).
  10. Summary of chatgpt-related research and perspective towards the future of large language models, Meta-Radiology (2023) 100017.
  11. A survey on the security of blockchain systems, Future generation computer systems 107 (2020) 841–853.
  12. Large language models for software engineering: A systematic literature review, ArXiv abs/2308.10620 (2023). URL: https://api.semanticscholar.org/CorpusID:261048648.
  13. A survey of large language models, ArXiv abs/2303.18223 (2023). URL: https://api.semanticscholar.org/CorpusID:257900969.
  14. Large language models for software engineering: A systematic literature review, arXiv preprint arXiv:2308.10620 (2023).
  15. Llm-planner: Few-shot grounded planning for embodied agents with large language models, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 2998–3009.
  16. Why johnny can’t prompt: how non-ai experts try (and fail) to design llm prompts, in: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023, pp. 1–21.
  17. Large language models are few-shot testers: Exploring llm-based general bug reproduction, in: 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), IEEE, 2023, pp. 2312–2323.
  18. J. Howard, S. Ruder, Universal language model fine-tuning for text classification, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 328–339.
  19. Tabert: Pretraining for joint understanding of textual and tabular data, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 8413–8426.
  20. Transformer-xl: Attentive language models beyond a fixed-length context, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2019.
  21. Compost: Characterizing and evaluating caricature in llm simulations, in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 10853–10875.
  22. Llm-in-the-loop: Leveraging large language model for thematic analysis, arXiv preprint arXiv:2310.15100 (2023).
  23. Visually-situated natural language understanding with contrastive reading model and frozen large language models, in: The 2023 Conference on Empirical Methods in Natural Language Processing, 2023.
  24. Lever: Learning to verify language-to-code generation with execution, in: International Conference on Machine Learning, PMLR, 2023, pp. 26106–26128.
  25. Llm aided semi-supervision for efficient extractive dialog summarization, in: The 2023 Conference on Empirical Methods in Natural Language Processing, 2023.
  26. Deplot: One-shot visual language reasoning by plot-to-table translation, arXiv preprint arXiv:2212.10505 (2022).
  27. Towards next-generation intelligent assistants leveraging llm techniques, in: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023, pp. 5792–5793.
  28. Q. Gu, Llm-based code generation method for golang compiler testing, in: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023, pp. 2201–2203.
  29. Expectation vs. experience: Evaluating the usability of code generation tools powered by large language models, in: Chi conference on human factors in computing systems extended abstracts, 2022, pp. 1–7.
  30. Llm-based interaction for content generation: A case study on the perception of employees in an it department, in: Proceedings of the 2023 ACM International Conference on Interactive Media Experiences, 2023, pp. 237–241.
  31. N. Sultanum, A. Srinivasan, Datatales: Investigating the use of large language models for authoring data-driven articles, in: 2023 IEEE Visualization and Visual Analytics (VIS), IEEE, 2023, pp. 231–235.
  32. Hawk: The blockchain model of cryptography and privacy-preserving smart contracts, in: 2016 IEEE symposium on security and privacy (SP), IEEE, 2016, pp. 839–858.
  33. A blockchain-based shamir’s threshold cryptography for data protection in industrial internet of things of smart city, in: Proceedings of the 1st Workshop on Artificial Intelligence and Blockchain Technologies for Smart Cities with 6G, 2021, pp. 13–18.
  34. Untangling blockchain: A data processing view of blockchain systems, IEEE transactions on knowledge and data engineering 30 (2018) 1366–1385.
  35. Performance modeling of pbft consensus process for permissioned blockchain network (hyperledger fabric), in: 2017 IEEE 36th symposium on reliable distributed systems (SRDS), IEEE, 2017, pp. 253–255.
  36. A scalable multi-layer pbft consensus for blockchain, IEEE Transactions on Parallel and Distributed Systems 32 (2020) 1146–1160.
  37. On the security and performance of proof of work blockchains, in: Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, 2016, pp. 3–16.
  38. Proof-of-stake sidechains, in: 2019 IEEE Symposium on Security and Privacy (SP), IEEE, 2019, pp. 139–156.
  39. Securing proof-of-stake blockchain protocols, in: Data Privacy Management, Cryptocurrencies and Blockchain Technology: ESORICS 2017 International Workshops, DPM 2017 and CBT 2017, Oslo, Norway, September 14-15, 2017, Proceedings, Springer, 2017, pp. 297–315.
  40. Peer to peer for privacy and decentralization in the internet of things, in: 2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C), IEEE, 2017, pp. 288–290.
  41. Scaling blockchains without giving up decentralization and security: A solution to the blockchain scalability trilemma, in: Proceedings of the 3rd Workshop on Cryptocurrencies and Blockchains for Distributed Systems, 2020, pp. 71–76.
  42. Decentralizing privacy: Using blockchain to protect personal data, in: 2015 IEEE security and privacy workshops, IEEE, 2015, pp. 180–184.
  43. Crowdbc: A blockchain-based decentralized framework for crowdsourcing, IEEE transactions on parallel and distributed systems 30 (2018) 1251–1266.
  44. Smart contract development: Challenges and opportunities, IEEE Transactions on Software Engineering 47 (2019) 2084–2106.
  45. Smart contract engineering, Electronics 9 (2020) 2042.
  46. Sok: Decentralized finance (defi) attacks, in: 2023 IEEE Symposium on Security and Privacy (SP), IEEE, 2023, pp. 2444–2461.
  47. Blockchain security: A survey of techniques and research directions, IEEE Transactions on Services Computing 15 (2020) 2490–2510.
  48. A survey on blockchain for information systems management and security, Information Processing & Management 58 (2021) 102397.
  49. Towards multiple-mix-attack detection via consensus-based trust management in iot networks, Computers & Security 96 (2020) 101898.
  50. Sg-pbft: A secure and highly efficient distributed blockchain pbft consensus algorithm for intelligent internet of vehicles, Journal of Parallel and Distributed Computing 164 (2022) 1–11.
  51. Modeling the impact of network connectivity on consensus security of proof-of-work blockchain, in: IEEE INFOCOM 2020-IEEE Conference on Computer Communications, IEEE, 2020, pp. 1648–1657.
  52. Ethainter: a smart contract security analyzer for composite vulnerabilities, in: Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, 2020, pp. 454–469.
  53. Smart contract security: A practitioners’ perspective, in: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), IEEE, 2021, pp. 1410–1422.
  54. A mixed-methods study of security practices of smart contract developers, in: 32nd USENIX Security Symposium (USENIX Security 23), 2023, pp. 2545–2562.
  55. Smarter smart contract development tools, in: 2019 IEEE/ACM 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB), IEEE, 2019, pp. 48–51.
  56. Smart contract and defi security tools: Do they meet the needs of practitioners?, in: Proceedings of the 46th IEEE/ACM International Conference on Software Engineering, 2024, pp. 1–13.
  57. On the just-in-time discovery of profit-generating transactions in defi protocols, in: 2021 IEEE Symposium on Security and Privacy (SP), IEEE, 2021, pp. 919–936.
  58. Impact and user perception of sandwich attacks in the defi ecosystem, in: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 2022, pp. 1–15.
  59. Defitainter: Detecting price manipulation vulnerabilities in defi protocols, in: Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023, pp. 1144–1156.
  60. As strong as its weakest link: How to break blockchain dapps at rpc service., in: NDSS, 2021.
  61. S. Kim, S. Hwang, Etherdiffer: Differential testing on rpc services of ethereum nodes, in: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023, pp. 1333–1344.
  62. Deter: Denial of ethereum txpool services, in: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 2021, pp. 1645–1667.
  63. A survey of blockchain technology on security, privacy, and trust in crowdsourcing services, World Wide Web 23 (2020) 393–419.
  64. Smartinv: Multimodal learning for smart contract invariant inference, in: 2024 IEEE Symposium on Security and Privacy (SP), IEEE Computer Society, 2024, pp. 126–126.
  65. Gptscan: Detecting logic vulnerabilities in smart contracts by combining gpt with program analysis, Proc. IEEE/ACM ICSE (2024).
  66. Do you still need a manual smart contract audit?, arXiv preprint arXiv:2306.12338 (2023).
  67. Who is smarter? an empirical study of ai-based smart contract creation, in: 2023 5th Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS), IEEE, 2023, pp. 1–8.
  68. F. Ö. Sönmez, W. J. Knottenbelt, Contractarmor: Attack surface generator for smart contracts, Procedia Computer Science 231 (2024) 8–15.
  69. Identifying and fixing vulnerable patterns in ethereum smart contracts: A comparative study of fine-tuning and prompt engineering using large language models, Available at SSRN 4530467 (????).
  70. Assbert: Active and semi-supervised bert for smart contract vulnerability detection, Journal of Information Security and Applications 73 (2023) 103423.
  71. Pscvfinder: A prompt-tuning based framework for smart contract vulnerability detection, in: 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), IEEE, 2023, pp. 556–567.
  72. Llm4vuln: A unified evaluation framework for decoupling and enhancing llms’ vulnerability reasoning, arXiv preprint arXiv:2401.16185 (2024).
  73. Blockchain large language models, arXiv preprint arXiv:2304.12749 (2023).
  74. Enhancing illicit activity detection using xai: A multimodal graph-llm framework, arXiv preprint arXiv:2310.13787 (2023).
  75. Llm4fuzz: Guided fuzzing of smart contracts with large language models, arXiv preprint arXiv:2401.11108 (2024).
  76. Acfix: Guiding llms with mined common rbac practices for context-aware repair of access control vulnerabilities in smart contracts, arXiv preprint arXiv:2403.06838 (2024).
  77. Efficient avoidance of vulnerabilities in auto-completed smart contract code using vulnerability-constrained decoding, in: 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), IEEE, 2023, pp. 683–693.
  78. Optimizing large language models to expedite the development of smart contracts, arXiv preprint arXiv:2310.05178 (2023).
  79. Y. Du, X. Tang, Evaluation of chatgpt’s smart contract auditing capabilities based on chain of thought, arXiv preprint arXiv:2402.12023 (2024).
  80. Gptutor: an open-source ai pair programming tool alternative to copilot, arXiv preprint arXiv:2310.13896 (2023).
  81. Large language models in cryptocurrency securities cases: Can chatgpt replace lawyers?, arXiv preprint arXiv:2308.06032 (2023).
  82. Scaling culture in blockchain gaming: Generative ai and pseudonymous engagement, arXiv preprint arXiv:2312.07693 (2023).
  83. Decentralised governance for foundation model based ai systems: Exploring the role of blockchain in responsible ai, IEEE Software (2024).
  84. Classifying proposals of decentralized autonomous organizations using large language models, arXiv preprint arXiv:2401.07059 (2024).
  85. Teaching machines to code: Smart contract translation with llms, arXiv preprint arXiv:2403.09740 (2024).
  86. S. Wellington, Basedai: A decentralized p2p network for zero knowledge large language models (zk-llms), arXiv preprint arXiv:2403.01008 (2024).
  87. Bc4llm: Trusted artificial intelligence when blockchain meets large language models, arXiv preprint arXiv:2310.06278 (2023).
  88. Learning profitable nft image diffusions via multiple visual-policy guided reinforcement learning, in: Proceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 6831–6840.
  89. Chatgpt and large language model (llm) chatbots: The current state of acceptability and a proposal for guidelines on utilization in academic medicine, Journal of Pediatric Urology (2023).
  90. Chatgpt and large language models in academia: opportunities and challenges, BioData Mining 16 (2023) 20.
  91. Welcome to the era of chatgpt et al. the prospects of large language models, Business & Information Systems Engineering 65 (2023) 95–101.
  92. Can chatgpt replace traditional kbqa models? an in-depth analysis of the question answering performance of the gpt llm family, in: International Semantic Web Conference, Springer, 2023, pp. 348–367.
  93. Ö. Aydin, E. Karaarslan, Is chatgpt leading generative ai? what is beyond expectations?, Academic Platform Journal of Engineering and Smart Systems 11 (2023) 118–134.
  94. P. P. Ray, Chatgpt: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope, Internet of Things and Cyber-Physical Systems (2023).
  95. K. I. Roumeliotis, N. D. Tselikas, Chatgpt and open-ai models: A preliminary review, Future Internet 15 (2023) 192.
  96. Toolllm: Facilitating large language models to master 16000+ real-world apis, arXiv preprint arXiv:2307.16789 (2023).
  97. Dao to hanoi via desci: Ai paradigm shifts from alphago to chatgpt, IEEE/CAA Journal of Automatica Sinica 10 (2023) 877–897.
Authors (7)
  1. Zheyuan He
  2. Zihao Li
  3. Sen Yang
  4. Ao Qiao
  5. Xiaosong Zhang
  6. Xiapu Luo
  7. Ting Chen