Evaluating LLaMA 3.2 for Software Vulnerability Detection (2503.07770v1)

Published 10 Mar 2025 in cs.LG, cs.AI, cs.CR, and cs.SE

Abstract: Deep Learning (DL) has emerged as a powerful tool for vulnerability detection, often outperforming traditional solutions. However, developing effective DL models requires large amounts of real-world data, which can be difficult to obtain in sufficient quantities. To address this challenge, the DiverseVul dataset has been curated as the largest dataset of vulnerable and non-vulnerable C/C++ functions extracted exclusively from real-world projects. Its goal is to provide high-quality, large-scale samples for training DL models. However, while applying pre-processing techniques during our study, we identified several inconsistencies in the raw dataset, highlighting the need for a refined version. In this work, we present a refined version of the DiverseVul dataset, which is used to fine-tune an LLM, LLaMA 3.2, for vulnerability detection. Experimental results show that the use of pre-processing techniques led to an improvement in performance, with the model achieving an F1-Score of 66%, a competitive result when compared to our baseline, which achieved a 47% F1-Score in software vulnerability detection.
