Software Bill of Materials (SBOM)

Updated 3 December 2025

An SBOM is a machine-readable inventory that lists all software components, their dependencies, and associated metadata, providing a clear view of the software supply chain.
SBOM standards like SPDX and CycloneDX prescribe mandatory fields and serialization formats to enable effective vulnerability management and regulatory compliance.
SBOMs are crucial for supply chain risk management, enhancing artifact integrity through lock-file–based generation, digital signing, and systematic vulnerability analysis.

A Software Bill of Materials (SBOM) is a formal, machine-readable inventory that enumerates all components (including direct and transitive dependencies), their versions, provenance, and associated metadata used in the construction or distribution of a software artifact. By making the software supply chain explicit, SBOMs enable organizations to trace, manage, and assess vulnerabilities, compliance obligations, and the integrity of delivered code. They increasingly underpin both technical security architectures and regulatory regimes worldwide (Zhou et al., 25 Nov 2025).

1. Formal Definition, Standards, and Representation

An SBOM is typically modeled as a tuple:

$\mathrm{SBOM} = (C, R, M)$

where:

$C$ is a set of components (libraries, binaries, packages),
$R \subseteq C \times C$ encodes dependency relationships (direct and transitive),
$M$ maps each $c \in C$ to its metadata (version, supplier, license, cryptographic hash, etc.) (Xia et al., 2023, Mirakhorli et al., 17 Feb 2024).
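The triple can be made concrete with a small data structure. The following is a minimal sketch, not a standard representation: the class name `SBOM`, its methods, and the component names are all hypothetical, and it assumes that $R$ stores only direct edges while transitive dependencies are derived by graph traversal.

```python
from dataclasses import dataclass, field


@dataclass
class SBOM:
    """Toy model of SBOM = (C, R, M); all names here are illustrative."""
    components: set = field(default_factory=set)    # C: component identifiers
    depends_on: set = field(default_factory=set)    # R: (parent, child) direct edges
    metadata: dict = field(default_factory=dict)    # M: component -> metadata dict

    def add(self, name, **meta):
        """Register a component in C together with its metadata in M."""
        self.components.add(name)
        self.metadata[name] = dict(meta)

    def link(self, parent, child):
        """Record a direct dependency edge; both endpoints must be in C."""
        if parent not in self.components or child not in self.components:
            raise ValueError("R must be a subset of C x C")
        self.depends_on.add((parent, child))

    def transitive_dependencies(self, root):
        """Every component reachable from root by following edges in R."""
        seen, stack = set(), [root]
        while stack:
            node = stack.pop()
            for parent, child in self.depends_on:
                if parent == node and child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


# Hypothetical three-component project.
sbom = SBOM()
sbom.add("app", version="1.0.0")
sbom.add("http-client", version="2.4.1", license="Apache-2.0")
sbom.add("tls-lib", version="0.9.3", license="MIT")
sbom.link("app", "http-client")
sbom.link("http-client", "tls-lib")
print(sorted(sbom.transitive_dependencies("app")))  # direct and transitive deps of "app"
```

Storing only direct edges keeps $R$ small; a real SBOM format may instead list the full dependency graph explicitly.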

SBOM standards prescribe data models, mandatory/optional fields, and serializations. The two dominant standards are:

SPDX (Software Package Data Exchange): Emphasizes license metadata, provenance, and rich extensibility. Component records encode names, versions, SPDX IDs, download locations, and license fields (concluded/declared) (Kishimoto et al., 9 Apr 2025).
CycloneDX: Focuses on security contexts, compact encoding, and support for dependency graphs/extensions for VEX (Vulnerability Exploitability Exchange) integration.

Both aim to satisfy the NTIA “Minimum Elements” for an SBOM, published pursuant to Executive Order 14028 (supplier name, component name, version, unique identifier such as purl or CPE, explicit dependency relationships, SBOM author, and timestamp) (Mirakhorli et al., 17 Feb 2024, Kishimoto et al., 9 Apr 2025).
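As an illustration of how such minimum fields might be checked, the sketch below builds a small CycloneDX-style JSON document and reports which of the listed elements are absent. The field names (`bomFormat`, `metadata.authors`, `bom-ref`, `dependsOn`, and so on) follow the CycloneDX JSON layout as understood here and should be checked against the specification; the function `missing_minimum_elements`, the package names, and all values are hypothetical.

```python
# Hypothetical CycloneDX-style document; the second component is deliberately sparse.
bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "metadata": {
        "timestamp": "2025-01-01T00:00:00Z",
        "authors": [{"name": "Example Build Bot"}],
    },
    "components": [
        {
            "type": "library",
            "bom-ref": "pkg:pypi/libfoo@1.2.3",
            "name": "libfoo",
            "version": "1.2.3",
            "purl": "pkg:pypi/libfoo@1.2.3",
            "supplier": {"name": "Example Supplier"},
        },
        {"type": "library", "bom-ref": "libbar", "name": "libbar"},
    ],
    "dependencies": [{"ref": "pkg:pypi/libfoo@1.2.3", "dependsOn": []}],
}


def missing_minimum_elements(doc):
    """Map '<document>' or a component name to the minimum elements it lacks."""
    problems = {}
    meta = doc.get("metadata", {})
    doc_gaps = []
    if not meta.get("timestamp"):
        doc_gaps.append("timestamp")
    if not meta.get("authors"):
        doc_gaps.append("author")
    if doc_gaps:
        problems["<document>"] = doc_gaps

    # A component counts as having explicit dependencies if it has a
    # dependency entry, even an empty one.
    refs_with_entry = {d.get("ref") for d in doc.get("dependencies", [])}
    for comp in doc.get("components", []):
        gaps = []
        if not comp.get("name"):
            gaps.append("component name")
        if not comp.get("version"):
            gaps.append("version")
        if not comp.get("supplier", {}).get("name"):
            gaps.append("supplier name")
        if not (comp.get("purl") or comp.get("cpe")):
            gaps.append("unique ID")
        if comp.get("bom-ref") not in refs_with_entry:
            gaps.append("dependencies")
        if gaps:
            problems[comp.get("name", "<unnamed>")] = gaps
    return problems


print(missing_minimum_elements(bom))
```

This only tests for field presence; it does not validate the document against a schema or judge whether the values are accurate.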

2. Applications in Supply Chain Security and Risk Management

SBOMs serve as foundational artifacts in five principal security and assurance use cases (O'Donoghue et al., 4 Jun 2025):

Vulnerability Management: Rapid matching of component/version pairs against public vulnerability databases (NVD, GitHub Advisories). SBOMs enable the computation of component risk exposure and facilitate CVE triage (Zhou et al., 25 Nov 2025).
Transparency and Traceability: SBOMs disclose third-party and open-source dependencies, reducing information asymmetry in procurement, allowing due diligence in critical infrastructure, and satisfying regulatory disclosure requirements (O'Donoghue et al., 4 Jun 2025).
Component Assessment: SBOM records support quality metrics (maintainer activity, update latency), as well as build integrity checks (e.g., SLSA compliance via checksums) (O'Donoghue et al., 4 Jun 2025).
Risk Quantification: By combining SBOMs, CVSS/CWE annotations, and dependency graphs, organizations compute risk scores and surface components with excessive vulnerability surface (O'Donoghue et al., 4 Jun 2025).
Artifact Integrity: SBOMs with cryptographically signed entries or on-chain hash anchors enable verification that binaries and dependencies have not been tampered with during build or distribution (Ozkan et al., 6 Dec 2024, Xia et al., 2023).

These core functions make SBOMs central to proactive defense against supply-chain attacks and to meeting regulatory and procurement compliance mandates.
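The vulnerability-management and risk-quantification use cases can be sketched as a lookup of component/version pairs followed by a crude score. Everything below is an assumption for illustration: the advisory records are invented (they are not NVD or GitHub Advisory entries), and taking the maximum CVSS value per component is only one of many possible scoring rules.

```python
# Invented advisory feed; IDs, packages, and scores are placeholders.
ADVISORIES = [
    {"id": "EXAMPLE-0001", "package": "libfoo", "affected": {"1.0.0", "1.0.1"}, "cvss": 9.8},
    {"id": "EXAMPLE-0002", "package": "libfoo", "affected": {"1.0.1"}, "cvss": 4.3},
    {"id": "EXAMPLE-0003", "package": "libbar", "affected": {"2.3.0"}, "cvss": 5.3},
]


def match_vulnerabilities(components, advisories):
    """Return {(name, version): [advisories]} for exact name/version matches."""
    findings = {}
    for name, version in components:
        hits = [a for a in advisories
                if a["package"] == name and version in a["affected"]]
        if hits:
            findings[(name, version)] = hits
    return findings


def risk_scores(findings):
    """Crude per-component risk: the highest CVSS score among its advisories."""
    return {comp: max(a["cvss"] for a in hits) for comp, hits in findings.items()}


# Component/version pairs as they might be read from an SBOM.
sbom_components = [("libfoo", "1.0.1"), ("libbar", "2.4.0"), ("libbaz", "0.1.0")]
findings = match_vulnerabilities(sbom_components, ADVISORIES)
print(risk_scores(findings))
```

Exact string matching on versions is a simplification; real scanners resolve version ranges and identifier schemes such as CPE or purl.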

3. Technical Foundations: Generation, Completeness, Correctness

Accurate, complete, and reproducible SBOMs require precise enumeration of both direct and transitive dependencies, including versions resolved post-installation. High-fidelity SBOM generation is contingent on the following (Zhou et al., 25 Nov 2025, Cofano et al., 2 Sep 2024):

Lock-file–centric workflows: SBOMs generated exclusively from lock files (e.g., poetry.lock, Cargo.lock, Gemfile.lock) in “strong” package manager ecosystems provide exact, reproducible dependency snapshots. Both Trivy and Syft achieve perfect recall and Jaccard similarity (1.0) against lock-file ground truth over thousands of repositories (Zhou et al., 25 Nov 2025).
Project-file–based generation: Many tools parse manifests (requirements.txt, package.json) which often omit transitive dependencies, encode weak version constraints, or are susceptible to manual tampering (Cofano et al., 2 Sep 2024, Ozkan et al., 6 Dec 2024).
Binary and non-package-managed ecosystems: SBOM extraction from binaries, filesystem images, or C/C++ source requires hybrid static analysis, code clone detection, or runtime inspection (e.g., UniBOM uses Binwalk, Syft, and a custom CCScanner for these domains) (Safronov et al., 27 Nov 2025, Song et al., 29 Aug 2024).
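To show why lock files yield exact snapshots, the sketch below derives a component list from a lock file using only the standard library. It assumes npm's `package-lock.json` layout (lockfile version 2 or 3, with a `packages` map keyed by install path). That format is not one of the lock files named above; it is chosen only because it is plain JSON. The function name, package names, and integrity strings are hypothetical placeholders.

```python
import json
from urllib.parse import quote

# Hypothetical lock file content; integrity values are placeholders, not real digests.
LOCK_TEXT = json.dumps({
    "name": "demo-app",
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "demo-app", "version": "1.0.0"},
        "node_modules/left-util": {"version": "1.3.0", "integrity": "sha512-PLACEHOLDER-A"},
        "node_modules/@scope/core": {"version": "7.2.0", "integrity": "sha512-PLACEHOLDER-B"},
        "node_modules/left-util/node_modules/tiny-dep": {"version": "0.4.2"},
    },
})


def components_from_package_lock(lock_text):
    """List every resolved package (direct and transitive) with an exact version."""
    lock = json.loads(lock_text)
    components = []
    for path, entry in lock.get("packages", {}).items():
        if path == "" or "version" not in entry:
            continue  # the empty key is the root project itself
        # Nested installs look like node_modules/a/node_modules/b; keep the last name.
        name = path.split("node_modules/")[-1]
        components.append({
            "name": name,
            "version": entry["version"],
            # purl percent-encodes the '@' of scoped npm packages.
            "purl": f"pkg:npm/{quote(name, safe='/')}@{entry['version']}",
            "integrity": entry.get("integrity"),
        })
    return sorted(components, key=lambda c: (c["name"], c["version"]))


for component in components_from_package_lock(LOCK_TEXT):
    print(component["purl"])
```

Because the lock file already records the resolved version of each transitive dependency, no constraint solving is needed. A manifest such as `package.json` would list only the direct dependencies, with version ranges rather than pinned versions.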

Critical accuracy metrics:

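$\begin{aligned} \text{Recall} &= \frac{|R \cap G|}{|G|} \\ \text{Precision} &= \frac{|R \cap G|}{|R|} \\ F_1 &= 2 \cdot \frac{\text{Precision} \times \text{Recall}}{\text{Precision}+\text{Recall}} \end{aligned}$

where, in this context, $R$ is the set of components reported by the tool (not the dependency relation of Section 1) and $G$ is the ground-truth component set.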

Mean recall for Python SBOM tools ranged from 48% to 75% and mean precision from 88% to 94%, with static-file–only workflows yielding lower completeness (Cofano et al., 2 Sep 2024).
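The accuracy metrics reduce to a few set operations. This is a minimal sketch: the function name `sbom_accuracy`, the component identifiers, and the convention of returning 0.0 for empty denominators are assumptions. Jaccard similarity, mentioned earlier for lock-file comparisons, is included as $|R \cap G| / |R \cup G|$.

```python
def sbom_accuracy(reported, ground_truth):
    """Recall, precision, F1, and Jaccard similarity of a reported component set."""
    true_positives = len(reported & ground_truth)
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    precision = true_positives / len(reported) if reported else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    union = reported | ground_truth
    jaccard = true_positives / len(union) if union else 1.0
    return {"recall": recall, "precision": precision, "f1": f1, "jaccard": jaccard}


# Hypothetical example: the tool misses two real components and reports one spurious one.
ground_truth = {"a@1.0", "b@2.1", "c@0.3", "d@4.0"}
reported = {"a@1.0", "b@2.1", "e@9.9"}
metrics = sbom_accuracy(reported, ground_truth)
print(metrics)
```

Here two of four true components are found (recall 0.5) and two of three reported components are correct (precision 2/3), giving $F_1 = 4/7$ and a Jaccard similarity of 2/5.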

The internal trustworthiness of an SBOM depends on both the techniques applied (hash verification, canonicalization, digital signatures) and the resulting resistance to tampering. Most SBOMs lack hash validation, which leaves them open to malicious insiders who manipulate manifest files to suppress or falsify dependency versions (Ozkan et al., 6 Dec 2024). Only Maven, with its strict POM centrality and hash validation, reaches high trust scores (Ozkan et al., 6 Dec 2024).
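Hash validation amounts to recomputing a digest over the delivered artifact and comparing it with the value recorded in the SBOM. The sketch below assumes SHA-256 digests stored as hex strings; the function name `verify_component` and the artifact bytes are hypothetical.

```python
import hashlib
import hmac


def verify_component(artifact, recorded_sha256):
    """True if the artifact's SHA-256 digest matches the digest recorded in the SBOM."""
    actual = hashlib.sha256(artifact).hexdigest()
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(actual, recorded_sha256.lower())


# Hypothetical artifact and the digest an SBOM generator would have recorded for it.
original = b"example package contents"
recorded = hashlib.sha256(original).hexdigest()

tampered = original + b" with an injected payload"
print(verify_component(original, recorded))   # untampered artifact
print(verify_component(tampered, recorded))   # modified after SBOM generation
```

The check is only as trustworthy as the recorded digest itself, which is why the SBOM should also be signed or anchored somewhere tamper-evident.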

4. Automated Consumption and Vulnerability Analysis: Limits and Advances

Consumption tools—vulnerability and license scanners, analytics dashboards—operate by ingesting SBOMs and querying vulnerability databases using name/version or unique component identifiers (CPE, PURL). However, several systematic issues impair their effectiveness:

False positive flood: Even against perfectly accurate SBOMs, package-level scanners produce false positive rates of up to 97.5%, primarily because they report vulnerabilities in code paths that the application never invokes (Zhou et al., 25 Nov 2025).
Reachability analysis: Function call graph overlaying can prune non-exploitable vulnerabilities; static analysis can reduce false alerts by over 60% in empirical studies (Zhou et al., 25 Nov 2025).
Contextuality and VEX: Vulnerability Exploitability eXchange (VEX) documents capture and annotate not_affected/fixed/under_investigation status with rationales per dependency, providing a machine-consumable mechanism for downstream triage (Fucci et al., 18 Mar 2025).
Alert fatigue: Without semantic enrichment (reachability, context), SBOM scans become compliance artifacts rather than actionable security tools, contributing to developer alert fatigue (Zhou et al., 25 Nov 2025).
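How VEX statements can reduce scanner noise is illustrated below with a deliberately simplified statement shape. The dictionary keys, the `triage` function, and the advisory IDs are hypothetical and do not follow any specific VEX schema; only the status values mirror those named above.

```python
# Scanner findings as (vulnerability ID, component) pairs; IDs are placeholders.
findings = [
    ("EXAMPLE-0001", "libfoo"),
    ("EXAMPLE-0002", "libbar"),
    ("EXAMPLE-0003", "libbaz"),
]

# Simplified VEX-like statements supplied by the vendor.
vex_statements = [
    {"vulnerability": "EXAMPLE-0001", "product": "libfoo", "status": "not_affected",
     "justification": "vulnerable_code_not_in_execute_path"},
    {"vulnerability": "EXAMPLE-0002", "product": "libbar", "status": "under_investigation"},
]

# Statuses that mean no action is needed downstream.
RESOLVED = {"not_affected", "fixed"}


def triage(findings, statements):
    """Split findings into (actionable, suppressed) using VEX statuses."""
    status = {(s["vulnerability"], s["product"]): s["status"] for s in statements}
    actionable, suppressed = [], []
    for finding in findings:
        # Findings without a statement stay actionable by default.
        if status.get(finding) in RESOLVED:
            suppressed.append(finding)
        else:
            actionable.append(finding)
    return actionable, suppressed


actionable, suppressed = triage(findings, vex_statements)
print("actionable:", actionable)
print("suppressed:", suppressed)
```

Keeping findings without a statement actionable is the conservative default: a missing statement means nobody has assessed the finding yet.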

Best practice pipelines for actionable SBOM-based security are thus two-stage:

Generate ground-truth SBOMs strictly from lock files via strong PMs.
Enrich with reachability analysis before feeding to scanners, ensuring reports are low noise and cover only actual risk (Zhou et al., 25 Nov 2025).
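The enrichment stage can be sketched as a graph search over a function call graph. This toy version assumes the call graph is already available as an adjacency mapping and that each finding names a single vulnerable function. Real reachability analysis has to construct the graph and deal with dynamic dispatch, which is not attempted here. All function and package names are hypothetical.

```python
from collections import deque


def reachable_functions(call_graph, entry_points):
    """Breadth-first search: all functions callable from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def prune_unreachable(findings, call_graph, entry_points):
    """Keep only findings whose vulnerable function is reachable."""
    live = reachable_functions(call_graph, entry_points)
    return [f for f in findings if f["vulnerable_function"] in live]


# Hypothetical call graph: the application uses libfoo.parse but never libbar.render.
call_graph = {
    "app.main": ["app.load", "libfoo.parse"],
    "app.load": ["libfoo.open"],
    "libfoo.parse": ["libfoo.tokenize"],
    "libbar.render": ["libbar.escape"],
}
findings = [
    {"id": "EXAMPLE-0001", "vulnerable_function": "libfoo.tokenize"},
    {"id": "EXAMPLE-0002", "vulnerable_function": "libbar.escape"},
]
kept = prune_unreachable(findings, call_graph, ["app.main"])
print([f["id"] for f in kept])
```

Only the finding in `libfoo.tokenize` survives, because nothing reachable from `app.main` calls into `libbar`.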

Tool accuracy and ecosystem support remain variable: empirical tool studies in firmware and IoT OS codebases reveal that only advanced pipelines (e.g., UniBOM) reach 100% recall and precision; common tools often achieve far lower coverage, especially in non-package-managed or binary domains (Safronov et al., 27 Nov 2025).

5. Quality, Completeness, and Real-World Adoption

SBOM quality is measured by completeness of component enumeration, accuracy of metadata, coverage of version/license fields, and conformance to profiles such as SPDX Lite (Kishimoto et al., 9 Apr 2025, Soeiro et al., 19 Mar 2025). The sbomqs tool evaluates SBOM syntactic and semantic validity via five criteria (parseability, root metadata, component entries, version, license), producing a [0,10] score. In a real-world sample, 56.5% of deduplicated SBOMs achieved maximum quality (Soeiro et al., 19 Mar 2025).
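A score on a [0,10] scale built from five criteria could look like the following sketch. This is not the sbomqs algorithm: the equal two-point weighting, the coverage fractions, the function name `quality_score`, and the CycloneDX-style field names are all assumptions made for illustration.

```python
import json


def quality_score(sbom_text):
    """Toy [0, 10] score: five criteria worth up to two points each."""
    try:
        doc = json.loads(sbom_text)
    except json.JSONDecodeError:
        return 0.0  # not parseable: no further checks are possible
    if not isinstance(doc, dict):
        return 0.0

    score = 2.0  # criterion 1: parseable
    if doc.get("metadata"):
        score += 2.0  # criterion 2: root metadata present
    components = doc.get("components") or []
    if components:
        score += 2.0  # criterion 3: at least one component entry
        with_version = sum(1 for c in components if c.get("version"))
        with_license = sum(1 for c in components if c.get("licenses"))
        score += 2.0 * with_version / len(components)  # criterion 4: version coverage
        score += 2.0 * with_license / len(components)  # criterion 5: license coverage
    return round(score, 2)


# Hypothetical document: both components have versions, only one has a license.
sample = json.dumps({
    "metadata": {"timestamp": "2025-01-01T00:00:00Z"},
    "components": [
        {"name": "libfoo", "version": "1.2.3", "licenses": [{"license": {"id": "MIT"}}]},
        {"name": "libbar", "version": "0.4.0"},
    ],
})
print(quality_score(sample))
print(quality_score("{ not valid json"))
```

The sample scores 9.0 (full marks except half the license points), while the unparseable input scores 0.0.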

Real-world adoption remains low:

Only 0.56% of the most popular GitHub repositories include policy-driven SBOMs.
In Maven Central, SBOM publication is observed in just 0.5% of sampled releases (Gamage et al., 23 Jan 2025, Novikov et al., 1 Sep 2025).
SBOM drift (change or staleness between releases) is a significant threat; most SBOMs are not automatically regenerated nor versioned, exacerbating risk of outdated metadata (Stalnaker et al., 2023).
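Drift between releases can be detected by diffing the component/version maps of two SBOMs, for example as a CI step that fails when a committed SBOM no longer matches a freshly generated one. The sketch below is a minimal illustration; the function name `sbom_drift` and the package names and versions are hypothetical.

```python
def sbom_drift(old, new):
    """Compare two {component: version} maps and report what changed."""
    added = {name: new[name] for name in new if name not in old}
    removed = {name: old[name] for name in old if name not in new}
    changed = {name: (old[name], new[name])
               for name in old if name in new and old[name] != new[name]}
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical SBOM published with the previous release ...
published = {"libfoo": "1.2.3", "libbar": "0.4.0", "libold": "3.1.0"}
# ... and the component set regenerated from the current build.
regenerated = {"libfoo": "1.3.0", "libbar": "0.4.0", "libnew": "2.0.0"}

drift = sbom_drift(published, regenerated)
is_stale = any(drift.values())
print(drift)
print("stale SBOM:", is_stale)
```

An empty result in all three categories indicates that the published SBOM still describes the build; anything else means it should be regenerated and re-versioned.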

Security and compliance gaps: 22% of policy-driven SBOMs lack any license metadata for dependencies, creating both legal and audit risk. 61% of dependencies in such SBOMs carry known vulnerabilities, affirming the criticality of continuous SBOM updates and integrated scanning (Novikov et al., 1 Sep 2025).

6. Confidentiality, Integrity, and Selective Disclosure

SBOM confidentiality and integrity are increasingly important as SBOMs become regulatory deliverables and may expose sensitive internal architecture. Solutions deployed or proposed include:

Selective encryption and redaction: Attribute-based encryption (ABE) allows vendors to redact fields in SBOMs, enabling only authorized parties to decrypt fields under specific policy constraints (e.g., Petra system) (Ishgair et al., 16 Sep 2025).
Cryptographic signing and blockchain anchoring: Advanced solutions employ append-only repositories or blockchain ledgers to store hashes or metadata traces of SBOM entries, tying trust back to developer identities (IR/SR repositories, Merkle proofs) (Ozkan et al., 6 Dec 2024, Xia et al., 2023).
Verifiable credentials and partial disclosure: Using W3C Verifiable Credentials, vendors can disclose SBOM subsets or zero-knowledge proofs to verifiers, maintaining confidentiality without losing trust or compliance capability (e.g., selective attribute/Merkle inclusion proofs, ZKP for “need-to-know”) (Xia et al., 2023).
Runtime enforcement: SBOM-derived allow-lists can be used for active integrity checking of modules/classes at runtime, blocking unknown or tampered code (e.g., SBOM.EXE) (Sharma et al., 28 Jun 2024).
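Merkle inclusion proofs, one of the partial-disclosure mechanisms mentioned above, let a vendor publish a single root hash and later prove that one SBOM entry belongs to it without revealing the others. The following is a toy construction, not the scheme of any cited system. It uses SHA-256, duplicates the last node on odd-sized levels (a simplification that production designs avoid), and encodes entries as hypothetical `name@version` strings.

```python
import hashlib


def _leaf(entry):
    # Domain-separate leaves from inner nodes with a prefix byte.
    return hashlib.sha256(b"\x00" + entry.encode()).digest()


def _node(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


def _levels(entries):
    """All tree levels, from leaf hashes up to the single root hash."""
    if not entries:
        raise ValueError("at least one entry is required")
    level = [_leaf(e) for e in entries]
    levels = [level]
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [_node(padded[i], padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def merkle_root(entries):
    return _levels(entries)[-1][0]


def inclusion_proof(entries, index):
    """Sibling hashes from leaf to root, each flagged if it sits on the left."""
    proof = []
    for level in _levels(entries)[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd level: the node was paired with itself
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(entry, proof, root):
    acc = _leaf(entry)
    for sibling, sibling_is_left in proof:
        acc = _node(sibling, acc) if sibling_is_left else _node(acc, sibling)
    return acc == root


# The vendor commits to the full SBOM but discloses only one entry plus its proof.
entries = ["libfoo@1.2.3", "libbar@0.4.0", "libsecret@9.9.9"]
root = merkle_root(entries)
proof = inclusion_proof(entries, 1)
print(verify_inclusion("libbar@0.4.0", proof, root))
print(verify_inclusion("libbar@0.5.0", proof, root))
```

The verifier needs only the disclosed entry, the proof, and the published root. The undisclosed entries appear in the proof only as hashes.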

7. Limitations, Barriers, and Outlook

Despite technical and standardization progress, several barriers persist:

Tool deficiencies and standardization gaps: Insufficient support for multi-language, binary-only, and hybrid codebases; major differences between SPDX and CycloneDX profile support and field compatibility; immature verification workflows (Stalnaker et al., 2023, Safronov et al., 27 Nov 2025).
Data privacy and information overload: Pressure to publish only partial SBOMs due to fear of competitive or vulnerability information leakage (Stalnaker et al., 2023, Ishgair et al., 16 Sep 2025).
High operational overhead: Manual correction and field completion are common; automated quality control and continuous integration into CI/CD pipelines remain rare (Cofano et al., 2 Sep 2024, Mirakhorli et al., 17 Feb 2024).
Attack surface expansion: Absence of robust hash/signature checks means tampered SBOMs can suppress vulnerabilities without detection (Ozkan et al., 6 Dec 2024).
Alert fatigue and lack of actionable intelligence: Over-reporting of vulnerabilities with no exploitability context degrades trust in SBOM-based scanning, undermining security posture (Zhou et al., 25 Nov 2025).

Emerging research themes: Integration of SBOM analysis with ML-based triage, dynamic runtime telemetry, extended provenance (AIBOM/DataBOM/FirmwareBOM), and systematic public corpora for tool benchmarking are all open fields for future work (O'Donoghue et al., 4 Jun 2025, Soeiro et al., 19 Mar 2025).

Conclusion: SBOMs constitute a central mechanism for rendering the software supply chain explicit, thereby enabling a spectrum of security, compliance, and operational analyses. However, only rigorous, lock-file–based generation, semantic enrichment with reachability or quality data, continuous maintenance, and adoption of confidentiality/integrity mechanisms will realize their full security and compliance promise (Zhou et al., 25 Nov 2025, Ishgair et al., 16 Sep 2025, Ozkan et al., 6 Dec 2024).