
Automating SBOM Generation with Zero-Shot Semantic Similarity (2403.08799v1)

Published 3 Feb 2024 in cs.SE and cs.CR

Abstract: Inventorying the software used on their systems is becoming increasingly important for manufacturers, especially given the growing complexity of software ecosystems and the emphasis on security and compliance. A Software-Bill-of-Materials (SBOM) is a comprehensive inventory detailing a software application's components and dependencies. Current approaches rely on case-based reasoning, which identifies the software components embedded in binary files inconsistently. We propose a different route: an automated method for generating SBOMs to help prevent disastrous supply-chain attacks. Staying within static code analysis, we frame the problem as a semantic similarity task in which a transformer model is trained to relate a product name to its corresponding version strings. In our tests, the model performs strongly on the zero-shot classification task, which demonstrates its potential for use in a real-world cybersecurity context.
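
To make the framing in the abstract concrete, the sketch below scores a version string found in a binary against a list of candidate product names and returns the closest match. It is only an illustration of similarity-based matching, not the paper's method: the paper trains a transformer model, whereas this stand-in uses character trigram counts with cosine similarity so that it runs with the standard library alone. The function names, the sample version strings, and the candidate product list are all hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams of the lowercased, space-padded text.

    Assumption: trigram counts stand in for the learned embeddings a
    transformer model would produce.
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_product(version_string, product_names):
    """Return (score, name) for the product name most similar to the string."""
    query = char_ngrams(version_string)
    scored = [(cosine(query, char_ngrams(name)), name) for name in product_names]
    return max(scored)


# Hypothetical candidates and version strings, for illustration only.
candidates = ["openssl", "busybox", "zlib"]
score, name = best_product("OpenSSL 1.1.1k  25 Mar 2021", candidates)
print(name)
```

Because no candidate name is seen during any training step here, the lookup is zero-shot only in a loose sense; a learned embedding model would replace `char_ngrams` while the ranking step stays the same.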
