Complexity and Vulnerability Metrics

Updated 14 April 2026

Complexity and vulnerability likelihood metrics are quantitative measures that evaluate risk and exploitability by integrating technical effort and real-world exploit probability.
Key methodologies include standardized frameworks like CVSS, code-structure analysis, machine learning predictors, and network entropy metrics.
These metrics enable actionable risk prioritization and effective security management across diverse domains such as software systems, smart contracts, and AI models.

Complexity and vulnerability likelihood metrics provide principled, quantitative mechanisms for evaluating the risk, exploitability, and security posture of software systems, network communities, machine learning models, spreadsheets, and smart contracts. These metrics formalize the intuition that the ease with which an adversary can exploit a flaw (complexity) and the probability or likelihood of such exploitation (vulnerability likelihood) are core drivers of prioritization in vulnerability management. The domain is characterized by standardized frameworks such as CVSS, a proliferation of code-structure-derived complexity indices, learning-based exploit predictors, and system-level composite metrics unifying structural and behavioral risk dimensions (Jiang et al., 16 Feb 2025).

1. Core Concepts: Complexity and Vulnerability in Security Metrics

Complexity, in the vulnerability context, quantifies the technical effort required by an adversary to successfully exploit a bug. This is distinct from programmatic complexity (e.g., cyclomatic complexity), focusing instead on attacker constraints, environmental prerequisites, and preconditions outside the attacker’s direct control (Jiang et al., 16 Feb 2025). In standardized systems (notably CVSS v3), Attack Complexity (AC) serves as the primary axis, operationalized as either “Low” (AC=L, 0.77) or “High” (AC=H, 0.44), reflecting the ease of exploitation given environmental assumptions.

Vulnerability likelihood metrics, or exploitability indicators, aim to estimate the probability that a vulnerability will actually be exploited in the wild. These span static mappings (as in CVSS Exploitability Sub-Score calculations), learned probabilities from data (e.g., EPSS logistic regression fits), survival-analysis, and contemporary neural or ensemble-based models (Jiang et al., 16 Feb 2025).

2. Major Families of Complexity Metrics

2.1 Security-specific Complexity Indices

The CVSS framework operationalizes complexity as AC, which is multiplied by Attack Vector (AV), Privileges Required (PR), and User Interaction (UI) to produce the Exploitability sub-score:

$E = 8.22 \times AV \times AC \times PR \times UI$

where each factor is mapped to a fixed scalar in the range [0,1]. An AC rating of Low (easier exploitation) maps to the larger scalar (0.77, versus 0.44 for High) and therefore yields a proportionally higher Exploitability sub-score and a higher CVSS Base Score (Jiang et al., 16 Feb 2025, Allodi et al., 2013, Longueira-Romero et al., 2021). This formulation is consistent with empirical studies showing that vulnerabilities with low AC dominate real-world exploit datasets, while high-AC vulnerabilities are rarely exploited (Allodi et al., 2013, Allodi et al., 2018).
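The effect of the AC rating on the sub-score can be checked numerically. The sketch below uses the CVSS v3.1 metric weights; the PR values shown assume Scope: Unchanged, the function name is illustrative, and the result is left unrounded (the specification rounds scores to one decimal).

```python
# CVSS v3.1 metric weights used by the Exploitability sub-score.
# PR values assume Scope: Unchanged.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}


def exploitability(av: str, ac: str, pr: str, ui: str) -> float:
    """Return the unrounded sub-score E = 8.22 * AV * AC * PR * UI."""
    return 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]


# Same vector, differing only in Attack Complexity.
low = exploitability("N", "L", "N", "N")
high = exploitability("N", "H", "N", "N")
print(f"AC:L -> {low:.2f}, AC:H -> {high:.2f}")  # AC:L -> 3.89, AC:H -> 2.22
```

Because AC enters as a plain multiplier, switching from High to Low scales the sub-score by 0.77/0.44 = 1.75 regardless of the other metrics.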

2.2 Code-Structure Complexity Metrics

A widely used class includes cyclomatic complexity, loop nesting, function parameters, fan-in/out, nesting depth, pointer usage, and coupling. These metrics, grounded in classic program analysis (Weissberg et al., 23 Sep 2025, Du et al., 2019, Tehrani et al., 2024, Shudrak et al., 2018), are calculated via static analysis of ASTs or binary CFGs and have been shown to correlate (though often weakly) with vulnerability occurrence.

Table: Representative Code Complexity Metrics Used in Vulnerability Assessment

Metric Family	Example Metric (C-code/Smart Contract)	Definition/Formula
Cyclomatic	McCabe: $v(G) = |E| - |V| + 2P$	# independent paths in CFG
Loop Complexity	C2: #loops, C3: #nested pair, C4: depth	Count and depth of loop constructs
Structural/Nesting	C6: nested control structures	# of control structures within others
Coupling	CBO: coupling between contracts/classes	# external modules used/called
Size	SLOC, NOS, WMC	Lines, statements, weighted methods
Attributes/Params	NA, NUMPAR	# state vars, # function parameters
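As an illustration of the cyclomatic row, the following minimal sketch computes McCabe's number from a control-flow graph given as a list of directed edges, where E is the edge count, V the node count, and P the number of connected components. The graph and helper names are hypothetical; real tools derive the CFG from source or binaries.

```python
def connected_components(nodes, edges):
    """Count weakly connected components with a small union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for src, dst in edges:
        parent[find(src)] = find(dst)
    return len({find(n) for n in nodes})


def cyclomatic_complexity(edges):
    """McCabe's v(G) = E - V + 2P for a CFG given as (src, dst) pairs."""
    nodes = {n for edge in edges for n in edge}
    return len(edges) - len(nodes) + 2 * connected_components(nodes, edges)


# Hypothetical CFG of a function containing a single if/else.
if_else_cfg = [
    ("entry", "then"),
    ("entry", "else"),
    ("then", "exit"),
    ("else", "exit"),
]
print(cyclomatic_complexity(if_else_cfg))  # 4 edges - 4 nodes + 2*1 = 2
```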

For example, LEOPARD uses a two-step process: binning functions by total (cyclomatic + loop) complexity, then ranking within bins by an 11-metric vulnerability score (parameters, pointer ops, control-flow intricacy); this yields high empirical coverage of known vulnerabilities (Du et al., 2019, Weissberg et al., 23 Sep 2025).
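The two-step idea can be sketched as follows. This is not LEOPARD's implementation: the function names, scores, and `top_k` parameter are hypothetical, and the real tool derives both scores from its own metric definitions.

```python
from collections import defaultdict

# Hypothetical per-function data: (name, complexity score, vulnerability score).
functions = [
    ("parse_header", 3, 14),
    ("copy_buf", 3, 22),
    ("init", 1, 2),
    ("log_msg", 1, 5),
    ("decode", 7, 30),
]


def bin_and_rank(funcs, top_k=1):
    """Step 1: bin by complexity score. Step 2: rank each bin by vulnerability score."""
    bins = defaultdict(list)
    for name, complexity, vuln in funcs:
        bins[complexity].append((vuln, name))
    selected = {}
    for complexity, members in bins.items():
        ranked = sorted(members, reverse=True)  # highest vulnerability score first
        selected[complexity] = [name for _, name in ranked[:top_k]]
    return selected


print(bin_and_rank(functions))
# {3: ['copy_buf'], 1: ['log_msg'], 7: ['decode']}
```

Selecting candidates from every bin, rather than from one global ranking, keeps low-complexity functions in the candidate set instead of letting the most complex functions crowd them out.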

2.3 Community and Network Complexity

Network vulnerability metrics utilize entropy-based (Tsallis structure entropy) measures reflecting the diversity of degree, betweenness, and internal connectivity within a community, augmented by external similarity (KL divergence across communities) and edge counts (Wen et al., 2019). Composite vulnerability for network components integrates these via multiplicative or weighted formulas.
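For orientation, the generic Tsallis entropy is $S_q = (1 - \sum_i p_i^q)/(q - 1)$, which recovers Shannon entropy as $q \to 1$. The sketch below applies it to a normalized degree sequence; it illustrates only the entropy building block, not the full composite measure of Wen et al., and the communities and the choice $q = 2$ are assumptions.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum(p_i^q)) / (q - 1); Shannon (nats) at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def degree_distribution(degrees):
    """Normalize node degrees so they sum to 1."""
    total = sum(degrees)
    return [d / total for d in degrees]


# Hypothetical communities: one with even degrees, one dominated by a hub.
even = degree_distribution([3, 3, 3, 3])
hub = degree_distribution([9, 1, 1, 1])
even_entropy = tsallis_entropy(even, q=2.0)
hub_entropy = tsallis_entropy(hub, q=2.0)
print(even_entropy, hub_entropy)
```

The hub-dominated community has lower entropy, reflecting less diversity in how connectivity is distributed among its nodes.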

3. Vulnerability Likelihood and Predictive Metrics

3.1 Static and Logistic-Regression-based Scores

The Exploit Prediction Scoring System (EPSS) is a dominant predictive metric, using a logistic regression over CVSS, vendor, exploit evidence, and real-time signals to output $p_{\mathrm{EPSS}} \in (0,1)$:

$p_{\mathrm{EPSS}}(x) = \frac{1}{1 + \exp(-(\beta_0 + \sum_i \beta_i x_i))}$

A higher $p_{\mathrm{EPSS}}$ indicates a greater short-term likelihood of exploitation. Empirical studies calibrate these models using large historical datasets of NVD, ExploitDB, exploit markets, and social media (Jiang et al., 16 Feb 2025).
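A minimal sketch of the logistic form above is shown below. The feature names, coefficients, and intercept are invented for illustration and are not the published EPSS model parameters.

```python
import math


def logistic_probability(features, weights, intercept):
    """p = 1 / (1 + exp(-(b0 + sum_i b_i * x_i)))."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and binary features (assumptions, not EPSS values).
weights = {"public_poc": 2.0, "remote_vector": 1.0, "vendor_is_major": 0.5}
intercept = -4.0

quiet = {"public_poc": 0, "remote_vector": 1, "vendor_is_major": 0}
noisy = {"public_poc": 1, "remote_vector": 1, "vendor_is_major": 1}

p_quiet = logistic_probability(quiet, weights, intercept)
p_noisy = logistic_probability(noisy, weights, intercept)
print(f"{p_quiet:.3f} {p_noisy:.3f}")
```

With these made-up weights, adding a public proof-of-concept and a major vendor moves the linear term from -3.0 to -0.5, which raises the output probability from roughly 0.05 to roughly 0.38.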

3.2 Machine Learning and Neural Predictors

Predictive Time-To-Exploit (PTTE) applies ML (e.g., RNNs and time-series models) to forecast when functional exploits will appear (Jiang et al., 16 Feb 2025). Neural approaches such as V-REx use multilayer neural networks and genetic algorithms to regress the probability of imminent exploitation.

3.3 Empirical Impact and Composite Likelihood

Case-control studies matching vulnerabilities by impact show that adding exploit-availability signals (exploit in black-market kit, public proof-of-concept) to complexity metrics sharply increases risk-reduction specificity (Allodi et al., 2013). In “Attack Potential in Impact and Complexity” (Allodi et al., 2018), a practical estimator is:

$E[pA(v)] = \log_{10}(\operatorname{ImpactScore}(v)) \times \operatorname{ComplexityScore}(v)$

with thresholds yielding HIGH/MEDIUM/LOW prioritization closely matching real-world exploitation distribution.
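The estimator and a threshold mapping can be sketched as below. The input scales (an impact score up to 10 and a complexity multiplier below 1) and the HIGH/MEDIUM cut-offs are assumptions chosen for illustration; the source does not state the thresholds.

```python
import math


def attack_potential(impact_score: float, complexity_score: float) -> float:
    """log10(ImpactScore(v)) * ComplexityScore(v); requires impact_score > 0."""
    return math.log10(impact_score) * complexity_score


def priority(score: float, high: float = 0.6, medium: float = 0.3) -> str:
    """Map a score to a label; both cut-offs are hypothetical placeholders."""
    if score >= high:
        return "HIGH"
    if score >= medium:
        return "MEDIUM"
    return "LOW"


# Hypothetical vulnerabilities: (impact score, complexity score).
severe_easy = attack_potential(10.0, 0.71)
mild_hard = attack_potential(2.9, 0.35)
print(priority(severe_easy), priority(mild_hard))  # HIGH LOW
```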

4. Domain-Specific Complexity and Vulnerability-Likelihood Metrics

4.1 Smart Contracts

In Solidity, 21 complexity metrics (size, control-flow, inheritance, coupling) drawn from Solmet quantify risk factors for code that is hard to verify and maintain. Cluster analysis reveals metric redundancy; combined ensemble classifiers (Random Forest, AdaBoost) achieve strong AUC/F1, outperforming any single-metric threshold for distinguishing vulnerable from neutral contracts (Tehrani et al., 2024).

4.2 Spreadsheets

Spreadsheet complexity metrics encompass formula size, depth/nesting, reference dispersion, fan-in/out, and cell-cascade reachability. Higher complexity per cell correlates with increased adjusted cell-error rates; Bregar proposes a per-cell error adjustment $e_i = e_{\mathrm{base}} \cdot g(C_i)$, directly linking measured complexity to error (vulnerability) likelihood (0802.3895).
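A small sketch of the per-cell adjustment follows. The form of $g$ is not specified above, so a linear $g(C) = 1 + \text{slope} \cdot C$ is assumed purely for illustration, along with the base rate, the slope, and the cell complexity scores.

```python
def adjusted_error_rate(base_rate, complexity, slope=0.1):
    """e_i = e_base * g(C_i), with a hypothetical g(C) = 1 + slope * C.

    The result is capped at 1.0 so that it remains a valid probability.
    """
    return min(1.0, base_rate * (1.0 + slope * complexity))


# Hypothetical per-cell complexity scores (e.g. nesting depth plus reference count).
cell_complexity = {"A1": 0, "B7": 5, "D12": 20}
rates = {cell: adjusted_error_rate(0.02, c) for cell, c in cell_complexity.items()}
for cell, rate in rates.items():
    print(f"{cell}: {rate:.3f}")
```

Under these assumptions a constant cell keeps the 2% base rate, while the most complex cell triples it to 6%.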

4.3 AI and Deep Learning

For AI system security, three core metrics are advanced: System Complexity Index (SCI, proxying Kolmogorov complexity via compression), Lyapunov Exponent for AI Stability (LEAIS, quantifies divergence under perturbation), and Nash Equilibrium Robustness (NER, assesses resistance to strategic adversarial actions). A composite score,

$V(A) = \alpha SCI(A) + \beta LEAIS(A) + \gamma [1 - NER(A)]$

operationalizes multi-factor vulnerability likelihood for AI models, validated empirically against adversarial robustness and formal verification baselines (Kereopa-Yorke, 2024).
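The composite can be sketched as a weighted sum, with a compression ratio standing in for the SCI term. Everything concrete here is an assumption: the use of `zlib`, the normalization of all three inputs to [0,1], the equal default weights, and the example LEAIS and NER values.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size / raw size, a rough stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9)) / len(data)


def composite_vulnerability(sci, leais, ner, alpha=1 / 3, beta=1 / 3, gamma=1 / 3):
    """V(A) = alpha * SCI + beta * LEAIS + gamma * (1 - NER)."""
    return alpha * sci + beta * leais + gamma * (1.0 - ner)


# Hypothetical inputs: a highly regular byte string as the "model description",
# plus made-up stability and robustness scores in [0, 1].
sci = compression_ratio(b"a" * 1000)
score = composite_vulnerability(sci, leais=0.2, ner=0.9)
print(f"SCI proxy = {sci:.3f}, V(A) = {score:.3f}")
```

Note that NER enters as $1 - NER$, so higher robustness lowers the composite score while higher complexity and instability raise it.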

5. Composite and Aggregation Strategies

Most mature vulnerability assessment methodologies combine multiple static and dynamic factors. Taxonomy tables—see Table 2 in (Jiang et al., 16 Feb 2025)—categorize metrics by exploitability and predictive indicator type, and specify associated parameter ranges.

For aggregation, workflows commonly:

Compute code complexity metrics and/or exploitability sub-scores for all functions/vulnerabilities.
Use ML or ensemble meta-models (as in SCM-based pipelines for LLM vulnerability prediction (Weissberg et al., 23 Sep 2025)) to learn optimal weightings.
Augment with dynamic or contextual signals (exploit availability, user base, vendor response time).
Produce a normalized or multi-dimensional vulnerability likelihood/probability ranking to guide triage, patching, fuzzing, or audit prioritization (Du et al., 2019, Shudrak et al., 2018, Tehrani et al., 2024).
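The steps above can be sketched as a small ranking routine. Fixed, hand-picked weights are substituted here for the learned meta-model, and the finding identifiers, field names, and values are hypothetical.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_findings(findings, w_static=0.5, w_exploit=0.3, w_exposure=0.2):
    """Combine a static score with contextual signals and sort, riskiest first."""
    static = min_max([f["exploitability"] for f in findings])
    exposure = min_max([f["user_base"] for f in findings])
    scored = []
    for f, s, x in zip(findings, static, exposure):
        score = w_static * s + w_exploit * float(f["exploit_public"]) + w_exposure * x
        scored.append((score, f["id"]))
    return [fid for _, fid in sorted(scored, reverse=True)]


# Hypothetical findings with a static sub-score and two contextual signals.
findings = [
    {"id": "VULN-A", "exploitability": 3.9, "exploit_public": True, "user_base": 1000},
    {"id": "VULN-B", "exploitability": 2.2, "exploit_public": False, "user_base": 50000},
    {"id": "VULN-C", "exploitability": 3.9, "exploit_public": False, "user_base": 200},
]
print(rank_findings(findings))  # ['VULN-A', 'VULN-C', 'VULN-B']
```

In a learned pipeline, the three weights would be fitted against observed exploitation outcomes rather than set by hand.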

6. Empirical Performance, Limitations, and Recommendations

Empirical literature consistently shows that low Attack Complexity and broad exposure (large attack surface, high coupling, deep dependency chains) are strong but not exclusive predictors of exploitation (Allodi et al., 2013, Allodi et al., 2018, Hathaway et al., 2017). Likelihood models based solely on static code metrics tend to plateau; state-of-the-art classification F1 for LLMs and human-derived code metrics is ≈20% on large benchmarks, indicating a limitation of shallow, syntactic measures (Weissberg et al., 23 Sep 2025). Ensemble models that incorporate dynamic vulnerability factors, behavioral metrics, and contextual observables outperform single-dimensional or purely code-structure-based predictors.

Best practices include:

Using standardized weights (e.g., CVSS multipliers) for cross-tool comparability (Jiang et al., 16 Feb 2025, Longueira-Romero et al., 2021).
Extending static metrics with exploit availability signals and temporal features (Allodi et al., 2013, Jiang et al., 16 Feb 2025).
Pruning redundant complexity metrics via inter-metric clustering (Tehrani et al., 2024).
Calibrating composite scores against empirically observed exploitation data.
Utilizing module-specific risk aggregation (e.g., per-layer vulnerability in neural nets (Ahmadilivani et al., 2023), per-community entropy in networks (Wen et al., 2019)).
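Pruning redundant metrics can be approximated with a simple correlation filter, sketched below. This swaps in a greedy Pearson-correlation threshold for the clustering used in the cited work; the metric values and the 0.9 cut-off are hypothetical, and constant-valued metrics (zero variance) are not handled.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def prune_redundant(metrics, threshold=0.9):
    """Keep a metric only if it is not strongly correlated with one already kept."""
    kept = []
    for name, values in metrics.items():
        if all(abs(pearson(values, metrics[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical metric values measured over four contracts.
metrics = {
    "SLOC": [10, 20, 30, 40],
    "NOS": [12, 22, 31, 43],  # tracks SLOC almost exactly
    "CBO": [5, 1, 4, 2],
}
print(prune_redundant(metrics))  # ['SLOC', 'CBO']
```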

7. Outlook and Research Challenges

Key open challenges include robust cross-domain calibration, increased specificity of likelihood metrics, and dynamic, context-aware assessment frameworks (i.e., metrics sensitive to deployment context, vulnerability lifecycle, and attacker incentives) (Jiang et al., 16 Feb 2025). Integration of game-theoretic and dynamical systems analysis for complex systems (as in SCI/LEAIS/NER for AI) is a frontier approach (Kereopa-Yorke, 2024). Finer-grained integration of behavioral, software-structural, and contextual metrics offers the main avenue for advancing actionable vulnerability likelihood estimation.