Victim Score: A Cross-Domain Metric

Updated 4 July 2026

Victim Score is a family of metrics that quantifies victim exposure, damage, and risk using both operational state tracking and statistical probabilities.
It is applied across domains such as cybersecurity, system safety, public policy, and trauma, with tailored measures like counters, likelihood estimates, and vulnerability indices.
Effective victim scoring requires aligning the score's semantics with the actual harm mechanism, ensuring accurate calibration, fairness, and actionable decision support.

“Victim Score” does not denote a single standardized quantity across the research literature surveyed here. Instead, it names or implies a family of victim-oriented measurements that quantify exposure, damage, targeting likelihood, credibility, triage priority, or direct physical risk from the standpoint of the threatened or affected entity. In some works the score is explicit and operational, such as a per-victim-row hammered counter in RowHammer mitigation, a per-destination cardinality estimate for DDoS victim identification, or a classifier posterior for the victim role in harmful memes. In other works, the score is an interpretive layer imposed on existing metrics, such as CVSS base attributes viewed from the target’s perspective or ransomware targeting confidence mapped to a 10-point adversary-specific risk scale (Kim et al., 22 Apr 2026, Jain et al., 27 Apr 2026, Ding et al., 2021, Sharma et al., 2023, Gueye et al., 2021, Massengale et al., 6 Feb 2025).

1. Conceptual scope and recurrent forms

Across domains, victim-oriented scoring appears in several distinct forms.

Domain	Victim-oriented quantity	Reference
Software vulnerability assessment	Victim-facing interpretation of CVSS base metrics	(Gueye et al., 2021)
Malware targeting	Profile-conditioned infection likelihood	(Labrèche et al., 2022)
RowHammer mitigation	Per-victim-row hammered count $cnt[v]$	(Kim et al., 22 Apr 2026)
RowHammer mitigation	Rowhammer Vulnerability Count $counter[v]$	(Jain et al., 27 Apr 2026)
DDoS defense	Per-destination cardinality estimate $\hat{E}_{dst}$	(Ding et al., 2021)
Harmful-meme role labeling	Per-entity victim posterior	(Sharma et al., 2023)
Gun-violence prevention	VIPAR vulnerability index	(Ozer et al., 2020)
Trauma triage	Injury Severity Score	(Dehouche, 2021)
Speaker verification attack	Target-system similarity score $s_T(v)$	(Hwang et al., 3 Mar 2026)
Ransomware prioritization	Entity–adversary risk on a 0–9 scale	(Massengale et al., 6 Feb 2025)

The surveyed literature exhibits three recurrent patterns. First, some victim scores are state variables inside operational systems, where the score is updated online and directly triggers mitigation. Second, some are statistical or learned outputs, such as probabilities, logits, Q-values, or confidence scores that rank entities by expected victimhood or victim framing. Third, some are decision-support aggregates, where heterogeneous inputs are collapsed into a severity or vulnerability index for triage, intervention, or prioritization.

A second axis of variation concerns semantics. In engineering systems, the score often measures a physically meaningful accumulation process on the victim itself. In classifier-based systems, it usually measures conditional likelihood or posterior belief. In public policy and medical settings, it often functions as a ranking device even when its cardinal interpretation is contested. This suggests that “Victim Score” is best understood as a cross-domain design pattern: a mapping from victim-relevant evidence to a scalar or ordered quantity used for ranking, thresholding, or resource allocation.

2. Cybersecurity risk and attack-target assessment

In software security, a victim-oriented score is often constructed from exploitability and impact descriptors rather than from explicit victim outcomes. The historical CVSS study over National Vulnerability Database records from 2005 to 2019 does not introduce a new “Victim Score,” but it makes clear which base metrics dominate victim exposure: vulnerabilities are overwhelmingly reachable over the network, are dominantly low in complexity, require little or no authentication or privileges, and usually require limited user interaction; in CVSS v3, Scope is predominantly Unchanged and Confidentiality, Integrity, and Availability impacts are predominantly High (Gueye et al., 2021). The same study also states that CVSS comprises base, temporal, and environmental metric groups, but analyzes only base metrics because extensive temporal datasets do not exist and environmental scores are organization-specific. A victim-facing severity notion derived from this evidence therefore emphasizes remote reachability, low attack preconditions, and the distinction between severe component-level damage and cross-boundary propagation.

Malware downloader analysis makes the victim orientation more explicit by conditioning delivered payloads on machine profile. The downloader study executes 151,189 runs over 12 months, varies operating system, keyboard layout, display language, browser history/session, and VPN location, and measures per-profile infection ratios. It directly supports a probability-style victim score,

$S(v,t,f,p)=P(\text{payload}=p\mid \text{profile}=v,\text{family}=f,t),$

estimated from infection ratios by feature value, time, downloader family, and payload family (Labrèche et al., 2022). The empirical signal is strongly profile dependent: Windows XP yields fewer than half the infections of Windows 7 or 10; Tovkater’s Adware delivery is elevated for the News browser profile, with more than twice the mean infection ratio of Business and Health in spring 2018; Banload’s Banload payloads are elevated for Portuguese keyboard layouts, again with more than twice the ratio of Chinese and Russian; and Tovkater records zero Adware infections for the Chinese keyboard layout (Labrèche et al., 2022). The paper also states that combining these single-feature effects into a multi-attribute score is an extrapolation rather than a tested result, because the experimental design varies one feature at a time against a default profile.
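The per-profile infection ratio behind $S(v,t,f,p)$ can be sketched as a simple count-based estimate. The record layout (`profile`, `family`, `payload`) and the toy runs below are invented for illustration and are not the study's data; the time index $t$ is omitted for brevity.

```python
from collections import defaultdict

def infection_ratios(runs):
    """Estimate P(payload | profile, family) as infections / runs.

    Each run is a dict with hypothetical keys 'profile', 'family' and
    'payload' (None when the run delivered no payload).
    """
    totals = defaultdict(int)   # runs per (profile, family)
    hits = defaultdict(int)     # infections per (profile, family, payload)
    for r in runs:
        totals[(r["profile"], r["family"])] += 1
        if r["payload"] is not None:
            hits[(r["profile"], r["family"], r["payload"])] += 1
    return {k: n / totals[(k[0], k[1])] for k, n in hits.items()}

# Toy runs (invented): one feature varied, everything else held at a default.
runs = (
    [{"profile": "kbd=pt", "family": "Banload", "payload": "Banload"}] * 2
    + [{"profile": "kbd=pt", "family": "Banload", "payload": None}] * 2
    + [{"profile": "kbd=zh", "family": "Banload", "payload": "Banload"}] * 1
    + [{"profile": "kbd=zh", "family": "Banload", "payload": None}] * 3
)
ratios = infection_ratios(runs)
```

Because each ratio conditions on a single varied feature, multiplying or otherwise combining ratios across features would go beyond what this kind of estimate supports.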

Ransomware-targeting work defines perhaps the most direct operational victim score in this domain. For a given organization and ransomware group, a Random Forest classifier estimates the likelihood that the entity is an “unsafe” target from organization features, adversary SKRAM-aligned attributes, MITRE ATT&CK capability counts, and a time-sensitive activity feature computed as

$V_t=\lambda V_{t-1}+(1-\lambda)x_t,$

with $\lambda=0.2$ in the reported experiments (Massengale et al., 6 Feb 2025). The model’s confidence is then converted into a 10-point risk scale from None $(0)$ to Extremely High $(9)$ (Massengale et al., 6 Feb 2025). The reported test performance on the balanced synthetic-augmented dataset is 99% precision, 99% recall, and 99% F1-score, with a confusion matrix showing 807 true negatives, 820 true positives, 0 false positives, and 13 false negatives (Massengale et al., 6 Feb 2025). The paper simultaneously cautions that these results depend on public victim disclosures, synthetic data heuristics, and a random rather than temporal split.
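The recurrence for $V_t$ is an exponentially weighted moving average, and a minimal sketch makes its behavior concrete. The initial value `v0`, the function names, and the equal-width binning in `risk_level` are assumptions made here for illustration; the source does not specify how classifier confidence is mapped onto the 0 to 9 scale.

```python
def ewma(xs, lam=0.2, v0=0.0):
    """Apply V_t = lam * V_{t-1} + (1 - lam) * x_t over a sequence."""
    v = v0
    out = []
    for x in xs:
        v = lam * v + (1.0 - lam) * x
        out.append(v)
    return out

def risk_level(confidence, levels=10):
    """Hypothetical mapping of a confidence in [0, 1] to an integer 0..levels-1.

    Equal-width bins are an assumption, not the paper's stated rule.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return min(int(confidence * levels), levels - 1)

# With lam = 0.2 the newest observation carries weight 0.8,
# so the feature follows recent activity closely.
activity = ewma([1.0, 1.0, 1.0])
```

With $\lambda=0.2$, three consecutive observations of 1.0 starting from zero already bring the feature to 0.992, which shows how little memory of older activity this setting retains.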

Speaker-verification attacks use the phrase in yet another sense: the “victim score” is the target system’s similarity assigned to an adversarial audio sample when the attacker claims a fixed enrolled identity. The paper defines this score as

$s_T(v)=\langle F_T(v), t_T\rangle,$

where $F_T$ is the target speaker encoder and $t_T$ is the victim template (Hwang et al., 3 Mar 2026). The attack objective is to maximize this black-box score under a query budget. The central finding is that latent spaces of generic text-to-speech systems are poorly aligned with speaker-discriminative geometry, whereas a feature-aligned inverse model makes victim-score optimization tractable. The reported gains are substantial: the proposed method achieves competitive attack success with about 10 times fewer queries on average, and the subspace-projection-based attack reaches up to 91.65% success using only 50 queries (Hwang et al., 3 Mar 2026). Here the victim score is not a welfare or risk metric at all; it is a similarity score whose increase corresponds to a more successful impersonation of the victim.

3. Victim-centric counting in systems and networks

In RowHammer research, victim-centric scoring is literal: the tracked quantity is the cumulative disturbance imposed on a victim row. PVAC defines an 8-bit hammered counter $cnt[v]$ for each row and updates counters according to victim semantics rather than aggressor semantics. On activation of a row, PVAC resets the counter of that row and increments $cnt[v]$ for each victim row $v$ of the activated row; if any victim counter reaches the threshold, the row is enqueued and mitigation is triggered (Kim et al., 22 Apr 2026). The key claim is that the score maintained for the endangered row directly matches the physical disturbance mechanism of RowHammer. Because refresh implicitly activates each row once per refresh window, counters are naturally bounded under benign conditions, avoiding the idle-bank saturation and spurious Alerts that arise in PRAC (Kim et al., 22 Apr 2026). Under the same maximum hammered-count safety constraint, PVAC supports larger back-off thresholds than aggressor-counting PRAC, which translates into fewer Alerts, higher hammering tolerance, and lower energy.

RVC develops the same victim-centric principle independently and with a different threshold formulation. It defines the Rowhammer Vulnerability Count $counter[v]$ as a per-row counter that measures how close a row is to a bit flip. On each access to a row, the mechanism resets the counter of that row, increments the counters of all rows within the blast radius, and refreshes only those rows whose counts reach the threshold (Jain et al., 27 Apr 2026). The paper contrasts an aggressor-centric safety condition with a victim-centric one, arguing that direct victim tracking eliminates the need to divide the RowHammer threshold across shared victims (Jain et al., 27 Apr 2026). The reported effect is large: compared with Graphene, RVC achieves 95–99.99% improvement in mitigation-induced refreshes, improves energy efficiency, and reduces average LLC latency by up to 76.91% (Jain et al., 27 Apr 2026).
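The victim-centric update rule shared by PVAC and RVC (reset the activated row, increment its neighbors, mitigate at a threshold) can be sketched as follows. This is a behavioral toy, not either paper's hardware design: the class name, the queue handling, the default blast radius of 1, and the choice to ignore disturbance caused by the mitigating refresh itself are all assumptions made here.

```python
class VictimCounters:
    """Toy per-row victim counters for one DRAM bank."""

    def __init__(self, num_rows, threshold, blast_radius=1):
        self.cnt = [0] * num_rows        # disturbance accumulated by each row
        self.threshold = threshold       # mitigation threshold (assumed < 256 for 8-bit counters)
        self.blast_radius = blast_radius
        self.pending = []                # victim rows awaiting mitigation

    def activate(self, row):
        # Activating a row restores its charge, so its own count is cleared.
        self.cnt[row] = 0
        if row in self.pending:
            self.pending.remove(row)
        # Every row within the blast radius accumulates one unit of disturbance.
        for d in range(1, self.blast_radius + 1):
            for v in (row - d, row + d):
                if 0 <= v < len(self.cnt):
                    self.cnt[v] += 1
                    if self.cnt[v] >= self.threshold and v not in self.pending:
                        self.pending.append(v)

    def mitigate(self):
        # Refresh every pending victim and clear its count.
        # Simplification: the refresh's own effect on neighbors is not modeled.
        rows, self.pending = self.pending, []
        for v in rows:
            self.cnt[v] = 0
        return rows

# Double-sided hammering of row 5 via its neighbors 4 and 6.
bank = VictimCounters(num_rows=16, threshold=4)
for aggressor in [4, 6, 4, 6]:
    bank.activate(aggressor)
```

In this toy run the shared victim (row 5) reaches the threshold after four aggressor activations in total, while the outer neighbors (rows 3 and 7) sit at 2. The counter therefore reflects what the victim has absorbed from both sides without any per-aggressor threshold splitting.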

In programmable-network defense, the victim score becomes a cardinality estimate. BACON and INDDoS estimate, at line rate, how many distinct source flows contact each destination during a time window. For a destination key $dst$, the data plane computes a per-row Bitmap occupancy estimate and combines the per-row estimates into $\hat{E}_{dst}$, which serves as the victim-identification signal (Ding et al., 2021). A destination is reported as a DDoS victim when $\hat{E}_{dst}$ exceeds a threshold defined relative to the number of distinct source IPs in the interval, and a digest is emitted at the first exceedance (Ding et al., 2021). The paper explicitly notes that $\hat{E}_{dst}$ can be interpreted as a continuous victim score and that normalized forms of it are natural (Ding et al., 2021). With the default configuration on CAIDA traces, the system is evaluated on Recall, Precision, and F1; on the Booter DNS amplification traces and the mixed four-victim case, it achieves perfect identification (Ding et al., 2021).
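A software sketch can illustrate how a per-destination cardinality estimate of this kind might be formed. Everything specific below is an assumption rather than the paper's data-plane design: the hash construction, the linear-counting estimator for bitmap occupancy, taking the minimum across rows (borrowed from Count-Min-style sketches), and the arbitrary default sizes.

```python
import hashlib
import math

def _hash(value, tag):
    """Deterministic 64-bit hash of `value`, varied by `tag`."""
    digest = hashlib.blake2b(f"{tag}:{value}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "little")

class BitmapSketch:
    """Rows x cols grid of bitmaps, keyed by destination."""

    def __init__(self, rows=3, cols=1024, bits=1024):
        self.rows, self.cols, self.bits = rows, cols, bits
        # Each cell is a Python int used as a `bits`-wide bitmap.
        self.cells = [[0] * cols for _ in range(rows)]

    def update(self, src, dst):
        bit = 1 << (_hash(src, "src") % self.bits)
        for i in range(self.rows):
            self.cells[i][_hash(dst, i) % self.cols] |= bit

    def estimate(self, dst):
        """Distinct-source estimate for dst: min over rows of linear counting."""
        per_row = []
        for i in range(self.rows):
            bitmap = self.cells[i][_hash(dst, i) % self.cols]
            zeros = self.bits - bin(bitmap).count("1")
            zeros = max(zeros, 1)  # saturated bitmap: cap instead of log(0)
            per_row.append(-self.bits * math.log(zeros / self.bits))
        return min(per_row)

sketch = BitmapSketch()
for k in range(200):                       # 200 distinct sources hit one target
    sketch.update(f"198.51.100.{k}", "10.0.0.1")
for k in range(5):                         # light traffic to another host
    sketch.update(f"203.0.113.{k}", "10.0.0.2")
heavy = sketch.estimate("10.0.0.1")
light = sketch.estimate("10.0.0.2")
```

Repeated packets from the same source set the same bit, so the estimate tracks source diversity, not packet volume. Hash collisions between destinations can only inflate a row's estimate, which is the motivation for taking the minimum in this sketch.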

A common pattern runs through these systems papers. Victim-centric tracking is most successful when the score measures the quantity that physically or operationally matters on the victim side itself: disturbance on a DRAM row, or incoming source diversity at a destination host. In both settings, moving the score from aggressor-side proxies to victim-side state improves threshold semantics and reduces unnecessary mitigation.

4. Human victims, triage, and public-safety prioritization

In public-safety prediction, VIPAR is a rule-based vulnerability index designed to rank individuals by risk of future shooting victimization. The score combines age, police-contact-derived criminal-history indicators, first-degree network exposures, a PageRank-like centrality term, and structural violence characteristics of the ego-network (Ozer et al., 2020). The weighting scheme is explicit: fixed point values are assigned for age, recent firearm-related crime, first-degree exposure to a high-PageRank individual, first-degree exposure to a CIRV list member, and several group-level violence thresholds (Ozer et al., 2020). When computed on 2010–2014 data and evaluated against 2015 outcomes, the top 3,215 ranked individuals capture 25.8% of future shooting victims and 32.2% of known shooting suspects, compared with 13% and 9.4% for the CIRV list (Ozer et al., 2020). The score is therefore explicitly victim-oriented, but it is also policy sensitive: the paper emphasizes transparency, exclusion of race and place, and the presumption-of-innocence concerns associated with proactive ranking.

Trauma research offers a more classical severity score, though the cited paper questions its measurement validity. The Injury Severity Score aggregates the three highest Abbreviated Injury Scale values from distinct body regions as

$\mathrm{ISS}=\mathrm{AIS}_1^2+\mathrm{AIS}_2^2+\mathrm{AIS}_3^2,$

with range $1$ to $75$ (Dehouche, 2021). The score is widely used to rank multiple-injury trauma victims, but the paper argues that both AIS and ISS are ordinal rather than cardinal and that ISS has problematic axiomatic properties. Among the reported findings are that 55 nontrivial AIS triplets collapse into only 44 distinct ISS values, that ISS is non-injective with respect to mortality in the examined subset, that sum of cubes yields higher normalized mutual information with mortality than ISS, and that ISS can produce rank reversals and independence violations under identical clinical changes (Dehouche, 2021). In this setting, the victim score is established and influential, but its cardinal interpretation is contested.
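The collapse of AIS triplets into fewer ISS values can be checked by brute force. The sketch below uses the standard sum-of-squares definition of ISS and assumes one reading of the "55 nontrivial triplets": unordered triplets of AIS values from 0 to 5, excluding the all-zero case. That reading is an assumption chosen here because it matches the reported counts.

```python
from itertools import combinations_with_replacement

def iss(a, b, c):
    """Injury Severity Score: sum of squares of the three highest AIS values."""
    return a * a + b * b + c * c

# Unordered AIS triplets over 0..5, dropping the trivial (0, 0, 0).
triplets = [t for t in combinations_with_replacement(range(6), 3) if any(t)]
scores = {iss(*t) for t in triplets}

# Example of two clinically different triplets sharing one score.
collision = [t for t in triplets if iss(*t) == 9]
```

Under this enumeration a single severe injury (AIS 3) and three milder injuries (AIS 1, 2, 2) receive the same ISS of 9, which is the kind of non-injectivity the critique highlights.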

In emergency-response planning, multi-agent victim tagging defines a different kind of victim-oriented priority. The mass-casualty paper does not define an explicit scalar “victim score,” and victims are largely homogeneous except for location, tag state, and, in one heuristic, a severity variable with “critical” defined by a condition on its value (Cardei et al., 2 Mar 2025). The nearest-victim and local-critical-victim heuristics therefore induce rankings rather than compute standalone victim utilities, while FDQN learns implicit per-victim scores through each agent’s Q-values over “select victim” actions (Cardei et al., 2 Mar 2025). The paper reports that FDQN outperforms heuristics in smaller-scale scenarios, whereas heuristics perform better in more complex settings (Cardei et al., 2 Mar 2025). The victim score here is thus latent in the action-value function, not explicit in the task definition.

The NCVS study provides yet another cautionary case. Its logistic models classify whether already-victimized respondents experienced violent rather than property crime, using income, education, employment, age, gender, race, and marital status on balanced 50:50 violent/property subsets (Anuyah, 26 May 2025). The paper reports Accuracy and F1 for the best suburban model and the best urban full model; rural models are materially weaker (Anuyah, 26 May 2025). Because the analysis conditions on victimization, down-samples the majority class, and does not use NCVS weights, the resulting score is a conditional victim-type score rather than a general victimization-risk score (Anuyah, 26 May 2025). This distinction is fundamental: a victim score can rank among victims without estimating who becomes a victim in the first place.

5. Learned and multimodal victim scoring

In harmful-meme analysis, victim scoring is implemented as per-entity role classification. VECTOR predicts one of four mutually exclusive labels—hero, villain, victim, or other—for each entity referenced in a meme, with the victim score naturally identified as

$P(\text{role}=\text{victim}\mid \text{entity},\text{meme}).$

The model combines DeBERTa text encoding, ViT visual features, an entity-centric commonsense graph built from ConceptNet, and OTKE-based multimodal fusion (Sharma et al., 2023). On the test set, the paper reports victim-class precision, recall, and F1 for VECTOR, along with the victim F1 of its best ablation checkpoint (Sharma et al., 2023). The paper also notes that a strong text-only DeBERTa-large baseline remains competitive, indicating that victimhood in memes is often recoverable from textual connotation but benefits from entity-specific multimodal grounding.

Judicial-attitude analysis treats victim scoring as an ordinal credibility assessment rather than as a severity or exposure metric. The Hebrew court study extracts sentences that convey the judge’s stance toward the complainant and classifies them into eight granular labels nested within four high-level categories: Unequivocal credibility, Credible because, Credible but, and Not credible (Habba et al., 2023). The paper does not assign numeric scores, but its ontology is explicitly ordinal and supports a binary recoding into “credible” versus “not-credible” (Habba et al., 2023). On high-level categories, fine-tuned AlephBERT reports 76.8% accuracy, versus approximately 72.8% human inter-annotator agreement; on granular labels it reports 63.7% accuracy versus approximately 54.4% human agreement (Habba et al., 2023). A document-level victim score is therefore a plausible aggregation over extracted sentence labels, but the paper treats the result as a descriptive measure of judicial language rather than ground truth about the complainant.

Victim count extraction from text supplies a different kind of scoring substrate. The crisis-information paper frames death and injury count extraction as question answering with generation, regression, or classification heads and emphasizes calibration through Expected Calibration Error and Quantile Calibration Error (Zhong et al., 2023). NT5-Gen is the strongest exact extractor in the reported comparisons, for example outperforming the SRL baseline on Exact-Match/F1 for WAD injuries (Zhong et al., 2023). These are not yet victim scores, but they provide calibrated numeric inputs for downstream severity scoring over events, especially when exact counts, bins, and uncertainty must be combined.
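Expected Calibration Error is commonly computed by binning predictions by confidence and averaging the gap between confidence and accuracy in each bin. The sketch below follows that common equal-width formulation; the bin count and the function name are choices made here, and the paper's exact variant (and its Quantile Calibration Error for regression heads) may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins.

    confidences: predicted probabilities in [0, 1] for the chosen answer.
    correct:     1 if the prediction was right, else 0.
    """
    bins = [[] for _ in range(n_bins)]
    for p, c in zip(confidences, correct):
        idx = min(int(p * n_bins), n_bins - 1)   # p == 1.0 goes to the last bin
        bins[idx].append((p, c))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(p for p, _ in b) / len(b)
            accuracy = sum(c for _, c in b) / len(b)
            ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece

# Four extractions at 90% confidence, three of them correct.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

Here the model claims 90% confidence but is right 75% of the time, giving an ECE of 0.15. A downstream severity score that consumed these confidences at face value would inherit that overstatement.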

Sky-Ear, an unmanned-aerial-vehicle system for search and rescue, exposes a set of victim-detection signals from which a score can be formed, although the paper does not define one scalar explicitly. The Sentinel stage computes an anomaly score from Top-$k$ patchwise MAE reconstruction error on Mel-spectrograms and triggers the Responder when that score crosses a scene-specific threshold, with separate thresholds for desert and forest (Hong et al., 14 Apr 2026). The Responder then estimates TDoA, DoA, and a multi-observation localization weighted by GCC-PHAT peak magnitudes (Hong et al., 14 Apr 2026). The paper reports that the best detection accuracy occurs at a low masking ratio and that localization error decreases as the UAV approaches the victim and accumulates multiple observations (Hong et al., 14 Apr 2026). A plausible implication is that a composite victim score for SAR would combine anomaly magnitude, temporal persistence, GCC-PHAT confidence, directional consistency, and localization residual.

LLM audits show that victim scoring can also emerge implicitly from allocation behavior. The Identifiable Victim Effect study operationalizes prioritization through hypothetical donations from a fixed dollar budget; reported effect sizes include $d=0.223$, $d=0.15$, $d=0.41$, and $d=-0.05$, and the study also reports psychophysical numbing, perfect quantity neglect, and small but significant cultural-distance effects (Raiyan, 13 Apr 2026). In this setting the victim score is behaviorally revealed rather than explicitly output: higher donation implies higher prioritization.

6. Methodological patterns, limitations, and controversies

A central methodological divide concerns whether the score’s semantics align with the mechanism that produces harm. The strongest engineering examples align closely: PVAC and RVC track disturbance on victim rows rather than aggressor activity, and INDDoS tracks source diversity at the destination rather than relying on centralized post hoc analysis (Kim et al., 22 Apr 2026, Jain et al., 27 Apr 2026, Ding et al., 2021). By contrast, several decision-support scores depend on proxies whose relationship to victim outcomes is indirect. CVSS base metrics omit temporal and environmental dimensions; downloader experiments vary one feature at a time and therefore do not estimate interactions; and ransomware targeting uses public disclosures and synthetic augmentation that may distort the true target distribution (Gueye et al., 2021, Labrèche et al., 2022, Massengale et al., 6 Feb 2025).

Another recurring issue is calibration and measurement level. In the text-extraction study, calibration metrics are treated as first-class evaluation targets because victim counts will be consumed in high-stakes pipelines (Zhong et al., 2023). The NCVS study, however, reports F1 and accuracy on balanced subsets without probability calibration or survey weighting, so any apparent “risk score” is misaligned with real-world base rates (Anuyah, 26 May 2025). The ISS critique is even sharper: it argues that ISS is ordinal, not cardinal, and that standard deviation, covariance, and Pearson correlation are not conceptually valid summaries for that scale (Dehouche, 2021). These cases show that a victim score can be operationally useful while still being statistically misinterpreted.

Fairness and semantic drift are equally important. Judicial credibility labels can reify courtroom biases and rape-myth-inflected language rather than recover a victim’s actual credibility; harmful-meme classifiers inherit annotation and class-imbalance constraints; and LLM-based victim prioritization is highly sensitive to alignment regime and prompt scaffold (Habba et al., 2023, Sharma et al., 2023, Raiyan, 13 Apr 2026). The LLM study is especially instructive because it separates declarative knowledge of bias from allocative behavior: models can explain the Identifiable Victim Effect while still exhibiting it, and standard Chain-of-Thought can amplify rather than correct the bias (Raiyan, 13 Apr 2026). This suggests that victim scoring systems built on generative models require adversarial prompting audits, not only benchmark accuracy.

A final cross-domain lesson is that victim scores are not intrinsically comparable across domains, even when they share a name. A hammered count, a posterior probability, a triage heuristic, a vulnerability index, and a donation amount all order victims, but they do so with different reference classes, thresholds, and normative meanings. The literature therefore supports a restrained generalization: victim-oriented scores are most reliable when their semantics are explicit, their update rules or predictors are tied to the mechanism of harm, their calibration is audited, and their limitations are treated as part of the score definition rather than as peripheral caveats.