Improving Smart Contract Security with Contrastive Learning-based Vulnerability Detection (2404.17839v1)
Abstract: Smart contract vulnerabilities (SCVs) have emerged as a major threat to the transaction security of blockchain. Existing state-of-the-art methods rely on deep learning to mitigate this threat. They treat each input contract as an independent entity and feed it into a deep learning model that learns vulnerability patterns by fitting vulnerability labels. However, they disregard the correlation between contracts: they consider neither the commonalities among contracts of the same type nor the differences among contracts of different types. As a result, their performance falls short of the desired level. To tackle this problem, we propose a novel Contrastive Learning Enhanced Automated Recognition Approach for Smart Contract Vulnerabilities, named Clear. In particular, Clear employs a contrastive learning (CL) model to capture fine-grained correlation information among contracts, and it generates correlation labels from the relationships between contracts to guide the training of the CL model. Finally, it combines the correlation information with the semantic information of each contract to detect SCVs. Through an empirical evaluation on a large-scale real-world dataset of over 40K smart contracts and a comparison with 13 state-of-the-art baseline methods, we show that Clear (1) outperforms all baseline methods and (2) achieves a 9.73%-39.99% higher F1-score than existing deep learning methods.
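To make the idea of correlation labels concrete, the following is a minimal sketch and not the formulation used by Clear. It assumes that two contracts are "correlated" when they share the same vulnerability label, and it substitutes a generic margin-based pairwise contrastive loss for the paper's training objective. The function names (`correlation_label`, `pairwise_contrastive_loss`, `batch_loss`), the Euclidean distance, and the `margin` parameter are all illustrative assumptions.

```python
import math
from itertools import combinations


def correlation_label(label_a, label_b):
    """Assumed rule: 1 if both contracts share a vulnerability label, else 0."""
    return 1 if label_a == label_b else 0


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_contrastive_loss(emb_a, emb_b, corr, margin=1.0):
    """Generic margin-based contrastive loss for one pair of embeddings.

    Correlated pairs (corr == 1) are pulled together; uncorrelated pairs
    are pushed apart until their distance reaches `margin`.
    """
    d = euclidean(emb_a, emb_b)
    if corr == 1:
        return d ** 2
    return max(0.0, margin - d) ** 2


def batch_loss(embeddings, labels, margin=1.0):
    """Average the pairwise loss over every unordered pair in a batch."""
    if len(labels) < 2:
        raise ValueError("need at least two contracts to form a pair")
    pairs = list(combinations(range(len(labels)), 2))
    total = sum(
        pairwise_contrastive_loss(
            embeddings[i],
            embeddings[j],
            correlation_label(labels[i], labels[j]),
            margin,
        )
        for i, j in pairs
    )
    return total / len(pairs)
```

In this sketch the embeddings would come from whatever encoder represents a contract. Minimizing `batch_loss` draws contracts of the same type together and separates contracts of different types, which is the intuition the abstract describes.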