Multidimensional Cross Risk Score (MCRS)

Updated 20 November 2025

MCRS is a composite risk metric that integrates category-specific scores with a cross-risk influence matrix to capture both direct and spill-over effects.
It employs ensemble-derived risk scores aggregated with reliability weights, ensuring interpretability through convex combinations that remain bounded.
Empirical validation shows improved human-alignment, with metrics like Spearman correlation increasing from 0.518 to 0.567 in benchmark tests.

A Multidimensional Cross Risk Score (MCRS) is a metric designed to quantify the aggregate risk posed by an entity or system when multiple, potentially correlated risk categories are present. MCRS is particularly relevant in domains where risk manifestation is inherently multidimensional and category boundaries are neither independent nor mutually exclusive—such as content safety in multimodal LLMs (MLLMs), or systemic risk in interconnected financial systems. MCRS models not only the per-category risks but also their semantic or statistical correlations, providing a composite risk score that accounts for both direct and spill-over effects between categories (Yan et al., 13 Nov 2025, Mezei et al., 2016).

1. Formal Structure and Mathematical Definition

In the OutSafe-Bench framework for evaluating MLLMs, a single model output $x$ is mapped to a nine-dimensional vector of raw risk scores,

$R(x) = [ r_1(x), r_2(x), \ldots, r_9(x) ], \quad r_i(x) \in [0, 10]$

where each $r_i$ represents severity in a specified content risk dimension (privacy, bias, crime, ethics, hate, misinformation, politics, health, intellectual property) (Yan et al., 13 Nov 2025).

To capture cross-category amplification and co-occurrence, OutSafe-Bench constructs a cross-risk influence matrix $\Gamma \in [0,1]^{9 \times 9}$, where each entry $\gamma_{(p,q)}$ encodes the degree to which risk in category $p$ amplifies perceived risk in category $q$; rows are normalized so that $\sum_{q=1}^{9} \gamma_{(p,q)}=1$.

Given a scenario index $k$ (selecting a primary risk context), the MCRS is computed as

$\mathrm{MCRS}_k(x) = \sum_{q=1}^9 \gamma_{(k,q)} \, r_q(x)$

This convex combination yields $\mathrm{MCRS}_k(x) \in [0, 10]$, ensuring interpretability and boundedness.
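The scalar reduction above is a single dot product. The following is a minimal sketch under stated assumptions: the function name `mcrs`, the 0-based scenario index, and the three-category toy matrix are illustrative choices, not part of OutSafe-Bench (which uses nine categories).

```python
from typing import Sequence


def mcrs(gamma: Sequence[Sequence[float]], r: Sequence[float], k: int,
         tol: float = 1e-9) -> float:
    """Scenario-specific MCRS: dot product of row k of gamma with r.

    k is 0-based here, so the text's k = 1 corresponds to k = 0.
    """
    row = gamma[k]
    if len(row) != len(r):
        raise ValueError("row of gamma and risk vector must have equal length")
    # A convex combination needs non-negative weights that sum to one.
    if any(w < 0 for w in row) or abs(sum(row) - 1.0) > tol:
        raise ValueError("row k of gamma must be non-negative and sum to 1")
    return sum(w * s for w, s in zip(row, r))


# Hypothetical 3-category influence matrix (rows sum to 1) and risk vector.
toy_gamma = [
    [0.50, 0.30, 0.20],
    [0.25, 0.50, 0.25],
    [0.20, 0.30, 0.50],
]
toy_r = [8.0, 2.0, 0.0]

print(round(mcrs(toy_gamma, toy_r, 0), 2))  # 0.5*8 + 0.3*2 + 0.2*0 = 4.6
```

The row check is there because the boundedness claim only holds when the selected row really is a set of convex weights.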

In Mezei & Sarlin’s RiskRank, a related MCRS framework is given via a 2-additive Choquet integral, which aggregates individual risk levels $x_i$ and their pairwise interaction indices $I(c_i, c_j)$:

$\mathrm{RiskRank}_S(x_1,\dots,x_n) = v(c_S)x_S + \sum_{i\neq S} \left(v(c_i) - \frac{1}{2} \sum_{j\neq i} I(c_i,c_j)\right)x_i + \sum_{i \neq S,\, j \neq S,\, i<j} I(c_i,c_j) x_i x_j$

Here $S$ is the target node, $v(c_i)$ is the Shapley-value weight, and $I(c_i,c_j)$ measures the additional risk borne from joint stress (Mezei et al., 2016).
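To make the three terms of the RiskRank expression concrete, here is a small sketch that transcribes the formula exactly as displayed above. The function name `riskrank`, the dictionary encoding of $I(c_i, c_j)$, and the numeric values are assumptions for illustration; the sketch does not check that the weights and interactions form a valid fuzzy measure, and it is not the authors' implementation.

```python
from typing import Dict, Sequence, Tuple


def riskrank(x: Sequence[float], v: Sequence[float],
             interaction: Dict[Tuple[int, int], float], s: int) -> float:
    """Evaluate the displayed RiskRank formula for target node s.

    x: individual risk levels x_i
    v: Shapley-value weights v(c_i)
    interaction: {(i, j): I(c_i, c_j)} with i < j; missing pairs count as 0
    """
    n = len(x)

    def inter(i: int, j: int) -> float:
        if i == j:
            return 0.0
        return interaction.get((min(i, j), max(i, j)), 0.0)

    # Direct term for the target node.
    total = v[s] * x[s]
    # Linear terms for the other nodes, with weights reduced by half of
    # their summed interactions.
    for i in range(n):
        if i != s:
            adj = v[i] - 0.5 * sum(inter(i, j) for j in range(n) if j != i)
            total += adj * x[i]
    # Pairwise product terms among the non-target nodes.
    for i in range(n):
        for j in range(i + 1, n):
            if i != s and j != s:
                total += inter(i, j) * x[i] * x[j]
    return total


# Hypothetical three-node example with one interacting pair (nodes 1 and 2).
score = riskrank(x=[0.5, 0.4, 0.2], v=[0.5, 0.3, 0.2],
                 interaction={(1, 2): 0.2}, s=0)
print(round(score, 3))  # 0.25 + 0.08 + 0.02 + 0.016 = 0.366
```

With all interactions set to zero the expression collapses to the weighted sum $\sum_i v(c_i) x_i$, which is 0.41 for the same inputs.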

2. Component Breakdown and Computational Approach

Raw Risk Vectors: Each $r_i(x)$ is derived via an ensemble of reviewer models (the "jury"), each scoring output $x$ on the $i$-th dimension on a $[0, 10]$ scale. In OutSafe-Bench, these juror model scores are aggregated with reliability-based weights $\lambda_l$ through the FairScore protocol, resulting in an aggregated vector $\hat{r}(x)$.
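The exact FairScore aggregation rule is not reproduced in this article. As a hedged illustration, the sketch below assumes the simplest reading: $\hat{r}(x)$ is a weighted mean of the juror score vectors, with the reliabilities $\lambda_l$ normalized to sum to one. The function name `aggregate_jury` and the sample numbers are hypothetical.

```python
from typing import List, Sequence


def aggregate_jury(scores: Sequence[Sequence[float]],
                   reliabilities: Sequence[float]) -> List[float]:
    """Reliability-weighted mean of juror score vectors (assumed rule).

    scores[l][i] is juror l's score on dimension i, in [0, 10].
    reliabilities[l] is a non-negative weight lambda_l for juror l.
    """
    if len(scores) != len(reliabilities):
        raise ValueError("need exactly one reliability per juror")
    total = sum(reliabilities)
    if total <= 0 or any(lam < 0 for lam in reliabilities):
        raise ValueError("reliabilities must be non-negative, not all zero")
    weights = [lam / total for lam in reliabilities]
    n_dims = len(scores[0])
    return [
        sum(w * juror[i] for w, juror in zip(weights, scores))
        for i in range(n_dims)
    ]


# Three hypothetical jurors scoring two dimensions; juror 0 is trusted most.
r_hat = aggregate_jury(
    scores=[[6.0, 2.0], [4.0, 0.0], [8.0, 2.0]],
    reliabilities=[2.0, 1.0, 1.0],
)
print(r_hat)  # [6.0, 1.5]
```

Because the normalized weights are convex, each aggregated component stays within the $[0, 10]$ range of the juror scores.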

Cross-Risk Influence Matrix $\Gamma$: Each risk category’s description is embedded with sentence-BERT, yielding vector representations. Pairwise cosine similarities between these embeddings form an unnormalized score matrix, which is then row-normalized to obtain $\Gamma$.

High $\gamma_{(p,q)}$ indicates strong semantic association or co-risk.
$\Gamma$ is computed once at benchmark design and remains static (Yan et al., 13 Nov 2025).
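The construction of $\Gamma$ can be sketched as follows. This is an illustration under assumptions: the embeddings are placeholder 2-D vectors rather than real sentence-BERT outputs, the function names are hypothetical, and clipping negative cosine similarities to zero is an added assumption to keep entries in $[0,1]$ (the article does not say how negative similarities are handled).

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("embeddings must be non-zero")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def influence_matrix(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Row-normalized matrix of pairwise cosine similarities.

    Negative similarities are clipped to 0 (an assumption, see lead-in).
    Each row sum is positive because the diagonal similarity is 1.
    """
    sims = [[max(0.0, cosine(u, v)) for v in embeddings] for u in embeddings]
    return [[s / sum(row) for s in row] for row in sims]


# Placeholder embeddings standing in for category-description embeddings.
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
toy_gamma = influence_matrix(toy_embeddings)
for row in toy_gamma:
    print([round(g, 3) for g in row])
```

Since the matrix is fixed at benchmark design time, it only needs to be computed once and can be stored alongside the category definitions.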

Scenario Indexing and Scalar Reduction: For scenario $k$, the $k$-th row of $\Gamma$ provides the weights that collapse the risk vector to the scalar $\mathrm{MCRS}_k(x)$, as required for scenario-specific assessment.

Pseudocode Overview: The computation is efficiently implemented by:

Embedding all categories and computing $\Gamma$ via cosine similarities;
Aggregating model output risk vectors with reliability weights;
Calculating MCRS as the dot product for the selected scenario.

3. Illustrative Example and Interpretation

Consider the case where privacy is the primary scenario ($k=1$) and $\gamma_{(1,\cdot)} = [0.30, 0.10, 0.05, 0.10, 0.05, 0.10, 0.05, 0.15, 0.10]$. For an output $x$ with risk vector $[6.0, 1.0, 0.0, 2.0, 0.0, 1.0, 0.0, 3.0, 0.0]$, the MCRS is

$\mathrm{MCRS}_{1}(x) = 0.30 \times 6.0 + 0.10 \times 1.0 + \cdots + 0.15 \times 3.0 = 2.65$

Here, the raw privacy risk $6.0$ is modulated by risk spill-over from correlated domains, reflecting interdependence (Yan et al., 13 Nov 2025).
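Written out term by term, with the categories in the order listed in Section 1, the sum is:

$\mathrm{MCRS}_{1}(x) = 0.30(6.0) + 0.10(1.0) + 0.05(0.0) + 0.10(2.0) + 0.05(0.0) + 0.10(1.0) + 0.05(0.0) + 0.15(3.0) + 0.10(0.0)$

$= 1.80 + 0.10 + 0 + 0.20 + 0 + 0.10 + 0 + 0.45 + 0 = 2.65$

For comparison, the unweighted mean of the same risk vector is $13/9 \approx 1.44$, so the privacy-scenario weighting gives the privacy score and its neighbours more influence than a flat average would.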

4. Theoretical Properties

Convexity and Boundedness: The OutSafe-Bench MCRS is a convex combination of category risks: the weights are non-negative and sum to one, and each component lies in $[0,10]$, so $\mathrm{MCRS}_k(x)$ always stays within $[0,10]$. In RiskRank the inputs are instead confined to $[0,1]$.
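For the linear OutSafe-Bench form, the bound follows in one line from $\gamma_{(k,q)} \ge 0$ and $\sum_{q} \gamma_{(k,q)} = 1$:

$0 \le \min_{q'} r_{q'}(x) = \sum_{q=1}^{9} \gamma_{(k,q)} \min_{q'} r_{q'}(x) \le \sum_{q=1}^{9} \gamma_{(k,q)}\, r_q(x) \le \sum_{q=1}^{9} \gamma_{(k,q)} \max_{q'} r_{q'}(x) = \max_{q'} r_{q'}(x) \le 10$

A direct consequence is that $\mathrm{MCRS}_k(x)$ can never exceed the largest raw category score of the output, nor fall below the smallest.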

Interpretability: Each score is a semantic-similarity-weighted average of the category risks, so risk in semantically or functionally adjacent categories raises the scenario-specific result.

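Monotonicity: In the RiskRank formalism, increasing any individual risk level $x_i$ or co-risk term raises the aggregate risk score. The linear MCRS construction shares this property: because the weights $\gamma_{(k,q)}$ are non-negative, increasing any $r_q(x)$ never lowers $\mathrm{MCRS}_k(x)$.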

Empirical Validation: Ablation on human-annotated subsets of OutSafe-Bench shows that including $\Gamma$ increases human-alignment: Spearman correlation with human judgment improves from 0.518 (unweighted mean) to 0.567 (Yan et al., 13 Nov 2025).
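The human-alignment figures above are Spearman rank correlations. As a reminder of what that statistic measures, here is a dependency-free sketch (Pearson correlation of average ranks, with ties sharing their mean rank). The function names are illustrative and no benchmark data is included; in practice one would pass per-sample human ratings and the corresponding MCRS (or unweighted-mean) scores.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = avg
        pos = end + 1
    return ranks


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_b)


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


print(spearman([1, 2, 3], [1, 3, 2]))  # 0.5
```

Because only ranks matter, the statistic rewards a scoring rule for ordering outputs the way annotators do, regardless of the absolute scale of the scores.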

5. Integration in Multimodal Safety and FairScore

MCRS is integrated into OutSafe-Bench's FairScore evaluation system. FairScore first produces an ensemble-aggregated risk vector $\hat r(x)$ per sample. MCRS then reduces this vector to a scalar that reflects not just the direct risk but also the spill-over from co-occurring, semantically linked risks. This penalizes "near-miss" failures, yielding safety rankings more consistent with human risk perception and facilitating nuanced evaluation of MLLM vulnerabilities (Yan et al., 13 Nov 2025).

6. Comparison and Extension: Relation to RiskRank

RiskRank applies the principles underlying MCRS to system-level risk in financial networks. Here, individual entity risk and pairwise interconnectedness (encoded as a fuzzy measure and 2-additive Choquet integral) are aggregated to yield a systemic risk index. Direct contributions from nodes and interaction effects (via pairwise products or minimums) are both included. This approach generalizes to any multidimensional risk system requiring both additive and joint-failure effect modeling (Mezei et al., 2016).

Generality Table: MCRS versus RiskRank

Aspect	OutSafe-Bench MCRS	RiskRank MCRS (Finance)
Domains	Content risk in MLLMs	Systemic financial risk
Number of risk categories ($n$)	9 (fixed, semantically defined)	$n$ (arbitrary entities)
Interaction structure	Static SBERT-based $\Gamma$	k-additive Choquet integral
Aggregation	Convex sum (linear weights)	Direct+pairwise non-linear
Empirical validation	Spearman correlation with human judgment $\approx$ 0.57	ROC-AUC $\sim$ 0.92

7. Limitations and Extensions

Static Influence Matrix: $\Gamma$ is currently SBERT-derived and fixed; it does not adapt to real-world co-occurrence frequencies in model outputs. Future work could fit $\Gamma$ from annotated multi-risk data or make it context-dependent.

Scenario Specification: Only a single scenario is handled per evaluation by selecting one row of $\Gamma$. Real-world cases where multiple scenarios co-trigger could involve mixtures or full $\Gamma R(x)$ products.
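One way to read the multi-scenario suggestion is sketched below: the full product $\Gamma R(x)$ gives one MCRS value per scenario, and a mixture blends those values with scenario weights. Both function names, the weight normalization, and the toy numbers are assumptions for illustration; the article does not define a specific mixture rule.

```python
from typing import List, Sequence


def mcrs_all(gamma: Sequence[Sequence[float]], r: Sequence[float]) -> List[float]:
    """Full matrix-vector product Gamma @ r: one MCRS value per scenario."""
    return [sum(w * s for w, s in zip(row, r)) for row in gamma]


def mcrs_mixture(gamma: Sequence[Sequence[float]], r: Sequence[float],
                 scenario_weights: Sequence[float]) -> float:
    """Blend per-scenario scores with normalized, non-negative weights."""
    total = sum(scenario_weights)
    if total <= 0 or any(w < 0 for w in scenario_weights):
        raise ValueError("scenario weights must be non-negative, not all zero")
    per_scenario = mcrs_all(gamma, r)
    return sum(w / total * m for w, m in zip(scenario_weights, per_scenario))


# Hypothetical 3-category influence matrix (rows sum to 1) and risk vector.
toy_gamma = [
    [0.50, 0.30, 0.20],
    [0.25, 0.50, 0.25],
    [0.20, 0.30, 0.50],
]
toy_r = [8.0, 2.0, 0.0]

print([round(m, 2) for m in mcrs_all(toy_gamma, toy_r)])  # [4.6, 3.0, 2.2]
# Scenarios 0 and 1 co-trigger with equal weight; scenario 2 is inactive.
print(round(mcrs_mixture(toy_gamma, toy_r, [1.0, 1.0, 0.0]), 2))  # 3.8
```

A mixture of convex rows is itself a convex combination of the risks, so this variant keeps the same boundedness as the single-scenario score.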

Modal Independence: MCRS computes per-modality risk independently. Future extensions could model joint structure via higher-order tensors or cross-modal influence matrices.

Interaction Depth: While OutSafe-Bench MCRS uses only first-order linear mixing, RiskRank supports k-additive Choquet integrals, allowing for higher-order joint risk aggregation.

A plausible implication is that future generalizations of MCRS may leverage dynamic, learned influence matrices and higher-order interactions to further enhance fidelity to complex, multimodal risk landscapes.

References

OutSafe-Bench: "OutSafe-Bench: A Benchmark for Multimodal Offensive Content Detection in Large Language Models" (Yan et al., 13 Nov 2025)
RiskRank: "RiskRank: Measuring interconnected risk" (Mezei et al., 2016)
