Automatic Vulnerability Triaging
- Automatic vulnerability triaging is a process that algorithmically prioritizes, categorizes, and routes vulnerabilities to streamline risk assessment and remediation.
- It integrates techniques like supervised transformers, multi-task learning, and explainable AI to quantitatively assess severity and assign actionable risk scores.
- By incorporating human-in-the-loop feedback and continuous retraining, these systems improve efficiency and consistency in processing heterogeneous vulnerability data.
Automatic vulnerability triaging is the process of algorithmically prioritizing, categorizing, and routing software vulnerabilities to facilitate efficient risk assessment and timely remediation. As the volume of vulnerability disclosures via feeds such as the CVE program, GitHub advisories, and vendor emails has exceeded human processing capacity, automated triage systems have become critical for modern security operations. Such systems integrate supervised machine learning, natural language processing, static and dynamic program analysis, and explainable AI to reduce manual workload, increase consistency in severity assessment, and improve response times.
1. Core Principles and Objectives
The automation of vulnerability triaging aims to address three core challenges: (1) high alert volume, (2) heterogeneity of input data, and (3) the necessity to prioritize vulnerabilities based on actionable risk metrics. Automated triage systems typically address:
- Severity estimation: Assigning a quantified or discrete severity score (e.g., CVSS base score, NIST VDO class) from incomplete or initially unscored reports.
- Categorical mapping: Identifying the root cause (e.g., CWE class), attack vector, or likely impact based on textual or code features.
- Prioritization: Ordering findings for remediation, filtering high-false-positive or “untrustworthy” warnings, and routing to teams with relevant expertise.
- Human-in-the-loop feedback: Allowing manual overrides, expert corrections, and continual re-training.
These goals are operationalized through strictly defined formal metrics, workflow integration points, and continuous retraining strategies (Bonhomme et al., 4 Jul 2025, Torkamani et al., 31 Jan 2025, Tung et al., 19 Mar 2025).
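The prioritization objective can be illustrated with a minimal risk-ranking step. The fields, threshold, and weighting below are illustrative assumptions, not drawn from any of the cited systems:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    cve_id: str
    severity: float        # e.g. predicted CVSS base score, 0.0-10.0
    confidence: float      # model confidence in the prediction, 0.0-1.0
    asset_exposure: float  # illustrative: how exposed the affected asset is

def prioritize(findings, min_confidence=0.5):
    """Filter low-confidence alerts, then order by a simple risk proxy."""
    trusted = [f for f in findings if f.confidence >= min_confidence]
    return sorted(trusted, key=lambda f: f.severity * f.asset_exposure, reverse=True)

queue = prioritize([
    Finding("CVE-2024-0001", severity=9.8, confidence=0.9, asset_exposure=1.0),
    Finding("CVE-2024-0002", severity=5.0, confidence=0.3, asset_exposure=1.0),
    Finding("CVE-2024-0003", severity=7.5, confidence=0.8, asset_exposure=0.5),
])
print([f.cve_id for f in queue])  # highest-risk first
```

Real systems replace the severity-times-exposure proxy with learned risk scores, but the filter-then-rank shape is the same.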
2. Machine Learning Models and Architectures
Automated triage leverages a range of ML/NLP models, increasingly dominated by transformer-based and LLM approaches.
Fully-Supervised Transformers
Systems such as VLAI utilize RoBERTa-base encoders with a classifier head, fine-tuned on hundreds of thousands of real-world vulnerability descriptions to predict one of four severity classes directly from text (Bonhomme et al., 4 Jul 2025). The model processes input text up to 512 tokens, applies a softmax over class logits, and outputs both class and confidence score. VLAI achieves 82.8% accuracy and a macro-F1 of ≈0.82 on held-out test data, with class-specific F1 scores of 0.80–0.84; fewer than 1% of errors confuse the extreme classes (low vs. critical).
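The class-plus-confidence output of such a head can be sketched in plain Python; the logits here are illustrative, while the class names match VLAI's four severity levels:

```python
import math

SEVERITY_CLASSES = ["low", "medium", "high", "critical"]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_severity(logits):
    """Return the predicted class and its confidence, as a
    classifier head over four severity classes would."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return SEVERITY_CLASSES[idx], probs[idx]

# Illustrative logits as they might come out of the encoder.
label, conf = predict_severity([0.1, 0.3, 2.4, 1.1])
print(label, round(conf, 2))
```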
Multi-Task and Multi-Objective Learning
AIBugHunter employs a CodeBERT-based encoder with two separate heads for CWE-ID (fine-grained) and CWE-Type (coarse-grained) prediction, jointly optimized via multi-objective (Pareto-optimal) strategies. A separate regression head predicts the floating-point CVSS score (Fu et al., 2023). The architecture allows simultaneous prediction of root cause and severity, with observed CWE-ID accuracy of 0.65 (88 classes) and severity class accuracy of 0.71.
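A simplified stand-in for this joint objective is a weighted sum of two classification losses and one regression loss. AIBugHunter's actual Pareto-optimal weighting is more involved; the fixed weights, probabilities, and targets below are illustrative assumptions:

```python
import math

def cross_entropy(probs, target_idx):
    """Negative log-likelihood of the target class."""
    return -math.log(probs[target_idx])

def mse(pred, target):
    """Squared error for the CVSS regression head."""
    return (pred - target) ** 2

def joint_loss(cwe_id_probs, cwe_id_target,
               cwe_type_probs, cwe_type_target,
               cvss_pred, cvss_target,
               weights=(1.0, 1.0, 1.0)):
    """Fixed-weight stand-in for the paper's multi-objective weighting."""
    w1, w2, w3 = weights
    return (w1 * cross_entropy(cwe_id_probs, cwe_id_target)
            + w2 * cross_entropy(cwe_type_probs, cwe_type_target)
            + w3 * mse(cvss_pred, cvss_target))

# Illustrative head outputs: CWE-ID probs, CWE-Type probs, CVSS prediction.
loss = joint_loss([0.7, 0.2, 0.1], 0, [0.6, 0.4], 0, 7.2, 7.5)
```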
LLMs in Triage
CASEY and CVE-LLM exploit instruction-tuned LLMs, fine-tuned or in-context prompted on extended, ontology-enriched vulnerability records (Torkamani et al., 31 Jan 2025, Ghosh et al., 21 Feb 2025, Ghosh et al., 19 Jul 2024). These models generate multi-label, free-text, or structured outputs, and integrate enterprise-specific assets, existing mitigations, and ground-truth assessment fields. CVE-LLM demonstrates micro-F1 up to 0.96 for vector prediction and 0.94 for binary VEX category, with inference times reduced to sub-second via vLLM for high-throughput triage.
Explainable and Trustworthy Triaging
UntrustVul represents a new trend addressing model trustworthiness, detecting alerts where the model’s explanation highlights code lines that are not related to the actual root cause. It parses the code into a program dependence graph, employs deep syntax representations via transformer ensembles (CodeBERT, GraphCodeBERT, UniXcoder), and computes a trustworthiness score based on intersection over union between model-explained lines and ground-truth vulnerable lines. This filters untrustworthy alerts with F1=0.82–0.94 and supports feedback-driven retraining to improve model grounding (Tung et al., 19 Mar 2025).
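The intersection-over-union component of this trustworthiness score can be sketched as follows (the full UntrustVul score also incorporates syntactic similarity and PDG reachability, and the 0.5 threshold here is an assumption):

```python
def trust_score(explained_lines, vulnerable_lines):
    """IoU between model-explained line numbers and ground-truth
    vulnerable line numbers, both treated as sets."""
    explained, vulnerable = set(explained_lines), set(vulnerable_lines)
    union = explained | vulnerable
    if not union:
        return 0.0
    return len(explained & vulnerable) / len(union)

def is_untrustworthy(explained_lines, vulnerable_lines, threshold=0.5):
    """Flag alerts whose explanation barely overlaps the true root cause."""
    return trust_score(explained_lines, vulnerable_lines) < threshold

# Explanation highlights lines 10-12; the real flaw spans lines 11-14.
score = trust_score([10, 11, 12], [11, 12, 13, 14])
```

A low IoU means the model's stated evidence misses the actual vulnerable code, which is exactly the kind of alert the filter discards.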
3. Data Sources, Preprocessing, and Labeling
Automated triage pipelines are contingent on the ingestion and preprocessing of large, heterogeneous, and frequently updated data sources:
- Vulnerability Feeds: CVE/NVD, GHSA, PyPI advisories, CSAF-format vendor feeds (e.g., Red Hat, Cisco).
- Annotations: CVSS v2/v3/v3.1 scores, CWE labels, NIST VDO characteristics, human-curated comments, vendor-specific triage outputs (e.g., VEXCategory, VEXJustification).
- Corpus Dynamics: Systems such as VLAI and CVE-LLM rebuild labeled datasets daily in Parquet format, continuously ingesting newly published scores so that the training corpus stays up to date (Bonhomme et al., 4 Jul 2025, Ghosh et al., 21 Feb 2025).
- Preprocessing: Cleaning steps remove duplicate or malformed entries, tokenize or stem texts, standardize CVSS vectors to English statements, and augment with explicit ontology links (e.g., mapping CVEs to CWE and CAPEC via a SEPSES/CSKG knowledge graph) (Ghosh et al., 21 Feb 2025).
- Splitting and Labeling: Retaining the natural class imbalance is critical for realistic evaluation; labels are assigned via CVSS binning, expert overrides, or ontology-driven enrichment (Bonhomme et al., 4 Jul 2025, Ghosh et al., 21 Feb 2025).
A summary table on data and labeling approaches:
| System | Data Volume | Label Types | Update Frequency |
|---|---|---|---|
| VLAI | 610k+ vulnerabilities | CVSS (binned), CPEs | Daily |
| CVE-LLM | 350k–750k inputs | VEXCategory, Justification, vector, comments | Daily |
| AIBugHunter | 188k+ functions | CWE-ID, CWE-Type, CVSS | Static |
| UntrustVul | 155k predictions | Trustworthiness (IoU-based) | Static |
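The CVSS binning used for labeling (e.g., by VLAI) can be sketched using the standard CVSS v3 qualitative rating scale, treating everything below 4.0, including 0.0, as low for a four-class scheme:

```python
def cvss_to_class(score):
    """Bin a CVSS v3 base score into four severity classes following
    the standard v3 qualitative rating boundaries."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"
```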
4. Automation of Severity, Categorization, and Impact Assessment
Severity assignment and vulnerability characterization are central to triaging. Approaches include:
Severity Prediction
- Text-Based Severity Prediction: Models such as VLAI directly predict severity class (low, medium, high, critical) from unstructured description, outperforming prior BERT-based approaches (Bonhomme et al., 4 Jul 2025).
- Code-Based Severity and Classification: AIBugHunter’s CodeBERT regressor directly outputs CVSS v3.1 scores; compositional analysis frameworks extract code graph features for each function and train per-base-metric predictors for CVSS3 scoring (Ognawala et al., 2018).
- LLM-Prompted Severity and Root Cause: CASEY fine-tunes GPT-3 to annotate both CWE(s) and CVSS score/category, using prompt engineering and code-level context for increased accuracy—achieving 68% CWE and 73.6% severity identification accuracy (Torkamani et al., 31 Jan 2025).
Impact/TTP Mapping
TRIAGE (Høst et al., 25 Aug 2025) applies both rule-based (MITRE’s CVE Mapping Methodology) and in-context learning LLMs to map CVEs to ATT&CK techniques, producing ranked lists of exploitation, primary impact, and secondary impact TTPs. The hybrid approach achieves mean average precision of 0.40 (exploit) versus 0.32–0.35 for rule-based baselines.
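Mean average precision over ranked TTP lists, as reported by TRIAGE, can be computed as below; the ATT&CK technique IDs in the example are purely illustrative:

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked technique list against the
    set of ground-truth techniques for a CVE."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, precisions = 0, []
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / len(relevant)

def mean_average_precision(runs):
    """Mean of per-CVE average precisions; `runs` is a list of
    (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, g) for r, g in runs) / len(runs)

# Two of three ranked techniques are correct, at ranks 1 and 3.
ap = average_precision(["T1190", "T1059", "T1003"], {"T1190", "T1003"})
```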
Characteristic Assignment
Automated VDO classifiers label new CVEs by root-cause category, supporting prioritization and automated routing. SVM and ensemble methods attain macro-F1 ≈0.80–0.82 across 19 VDO classes (Gonzalez et al., 2019).
5. Trust, Explainability, and Human-in-the-Loop
Reliability, explainability, and human-guided correction are essential for production-grade triaging systems:
- Trustworthiness Assessment: UntrustVul formally quantifies alert trust using (1) syntactic similarity to historical non-vulnerable lines, and (2) PDG-based vulnerability reachability, allowing automated filtering of misleading predictions (Tung et al., 19 Mar 2025).
- Explainable Outputs: Plans for VLAI and similar models include integrating attention or saliency-based highlighting to expose which tokens or sentences drove severity assessment (Bonhomme et al., 4 Jul 2025).
- Expert Feedback Loops: Interactive UIs (e.g., for compositional triagers, AIBugHunter) allow override and correction at the level of severity, cause, or mitigation, with corrections (e.g., in CVE-LLM) continuously reintegrated into retraining (Ognawala et al., 2018, Fu et al., 2023, Ghosh et al., 21 Feb 2025, Ghosh et al., 19 Jul 2024).
- Human-in-the-loop Assignment: Automated systems queue potential “Affected” vulnerabilities for immediate senior review while auto-filtering lower-priority or “NotAffected” cases, thus maximizing both efficiency and safety (Ghosh et al., 19 Jul 2024).
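The routing rule described above can be sketched as a minimal queueing function; the category strings follow the VEX-style labels used in the text, and sending anything ambiguous to human review is an assumption of this sketch:

```python
from collections import deque

REVIEW_QUEUE = deque()   # awaiting senior-analyst review
AUTO_CLOSED = []         # filtered without human attention

def route(cve_id, vex_category):
    """Queue potentially affected findings for senior review and
    auto-close confirmed non-issues."""
    if vex_category == "Affected":
        REVIEW_QUEUE.append(cve_id)
        return "review"
    if vex_category == "NotAffected":
        AUTO_CLOSED.append(cve_id)
        return "closed"
    # Ambiguous cases (e.g. still under investigation) go to a human.
    REVIEW_QUEUE.append(cve_id)
    return "review"
```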
6. Deployment, Pipeline Integration, and Limitations
Modern triaging systems are designed for scalable, low-latency integration across security pipelines:
- Service Integration: Models such as VLAI are served by microservices (e.g., FastAPI), delivering predictions via REST APIs for direct integration into ticketing, workflow, or patching systems (Bonhomme et al., 4 Jul 2025).
- Continuous Retraining/Updating: High-frequency ingestion and retraining pipelines (daily in VLAI, CVE-LLM) enable adaptation to changing nomenclature and emerging attack classes (Bonhomme et al., 4 Jul 2025, Ghosh et al., 21 Feb 2025).
- Scalability: With GPU-accelerated serving frameworks (vLLM), batch triage can process hundreds of thousands of vulnerabilities per day with per-evaluation latencies of 0.04–1 s, compared to minutes for human analysts (Ghosh et al., 21 Feb 2025, Ghosh et al., 19 Jul 2024).
- Use in Domain-Specific Contexts: CVE-LLM variants support medical device portfolios, handling asset-specific configurations, legacy software, and multiple product lines. However, transfer to other domains may require retraining (Ghosh et al., 19 Jul 2024).
Known Limitations
- Data Bias and Underrepresentation: Most current datasets exhibit class imbalance (e.g., for rare categories of root cause, environmental vectors).
- Susceptibility to Adversarial Inputs: Models can be manipulated by carefully crafted, ambiguous, or intentionally misleading textual input (Bonhomme et al., 4 Jul 2025).
- Monolingual Models: Most production triagers process only English; robust multilingual coverage is an open research direction (Bonhomme et al., 4 Jul 2025).
- Incomplete Exploitability Modeling: Dynamic, context-specific factors influencing exploitability (e.g., runtime configuration, environmental defenses) are only partially modeled in automated triage; systems like Autosploit address this with active configuration search (Moscovich et al., 2020).
- Hallucination and Trust: Free-text and LLM-generated fields occasionally hallucinate product names or misattribute severity (Ghosh et al., 21 Feb 2025, Ghosh et al., 19 Jul 2024); rule-based postprocessing is currently needed.
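Such rule-based postprocessing can be as simple as validating generated fields against allowlists; the product names and the exact category set below are hypothetical:

```python
VALID_VEX = {"Affected", "NotAffected", "Fixed", "UnderInvestigation"}
# Hypothetical enterprise product inventory used as an allowlist.
KNOWN_PRODUCTS = {"Acme Imaging Workstation", "Acme Infusion Manager"}

def postprocess(record):
    """Check LLM output fields against simple validity rules, guarding
    against hallucinated product names or malformed categories."""
    errors = []
    if record.get("vex_category") not in VALID_VEX:
        errors.append("invalid VEX category")
    if record.get("product") not in KNOWN_PRODUCTS:
        errors.append("unknown product (possible hallucination)")
    return (len(errors) == 0, errors)

ok, errs = postprocess({"vex_category": "Affected",
                        "product": "Acme Imaging Workstation"})
```

Records that fail validation are held back for manual review rather than forwarded downstream.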
7. Comparative Evaluation and Future Directions
Comparative results demonstrate clear advances but also highlight open problems:
- Quantitative Benchmarks:
| System | Severity Acc. | Macro-F1 | CWE/Root-Cause | Notes |
|---|---|---|---|---|
| VLAI (Bonhomme et al., 4 Jul 2025) | 82.8% | 0.82 | — | Fast retraining, daily increments |
| CASEY (Torkamani et al., 31 Jan 2025) | 73.6% | — | 68% CWE | LLM/fine-tuned, flexible context |
| AIBugHunter (Fu et al., 2023) | 0.71 (class) | — | 0.65 CWE-ID | Integrated IDE, real-time |
| UntrustVul (Tung et al., 19 Mar 2025) | — | 0.82–0.94 (trust) | — | Trust evaluation for filtering |
| CVE-LLM (Ghosh et al., 21 Feb 2025) | 0.94 (F1) | 0.96 | vector, VEXClass | Ontology-enriched LLM, domain-tuned |
| CVE-LLM (Ghosh et al., 19 Jul 2024) | 0.93 (F1) | 0.95 | — | Med device specific, HIL |
- Research Trends:
- Expansion of multilingual and multimodal triaging.
- Integration with knowledge graphs (SEPSES, MITRE ATT&CK) for context-aware enrichment.
- Advanced explainability (attention, saliency, program graphs).
- Higher-fidelity exploitability modeling (e.g., configuration search via Autosploit) (Moscovich et al., 2020).
- Standardization of open, large-scale, multi-label triage benchmarks for consistent comparison.
- Human-in-the-loop Synergies:
- Persistent hybrid pipelines (automated triage, expert review, incremental retraining) support robust adaptation, providing safety margins against both automation errors and domain shift (Ghosh et al., 19 Jul 2024, Ghosh et al., 21 Feb 2025).
- Templates and rules for postprocessing mitigate risks of LLM hallucinations and malformed fields.
In conclusion, automatic vulnerability triaging encompasses a comprehensive suite of techniques—from transformer- and LLM-based severity and cause prediction to explainable, trustworthy alert filtering and dynamic exploitability evaluation. While current systems demonstrate strong empirical performance and enterprise-scale deployment is established in several verticals, ongoing improvements in data diversity, model grounding, and explainability remain pivotal for broader adoption and ultimate efficacy (Bonhomme et al., 4 Jul 2025, Torkamani et al., 31 Jan 2025, Tung et al., 19 Mar 2025, Ghosh et al., 21 Feb 2025, Ghosh et al., 19 Jul 2024).