Cross-Modal Safety Consistency (CMSC-score)
- The paper introduces CMSC-score to quantify cross-modal safety consistency through an exponential decay function applied to the standard deviation of modality-specific safety scores.
- Methodology involves aggregating safety scores from 24 modality sub-categories using parallelized prompts and LLM-based comprehension filtering to ensure valid comparisons.
- Empirical findings on OLLMs show CMSC-scores ranging from 0.42 to 0.83, revealing cross-modal vulnerabilities and emphasizing the need for uniform safety performance.
The Cross-Modal Safety Consistency Score (CMSC-score) is a quantitative metric introduced to assess the consistency of safety performance in Omni-modal LLMs (OLLMs) across a spectrum of input modalities, including text, audio, image, and their combinations. CMSC-score is designed to reveal whether an OLLM's refusal and safe-content behaviors are uniform when the same semantic prompt is delivered through different sensory inputs. This metric is a core contribution of Omni-SafetyBench, the first large-scale benchmark explicitly constructed for evaluating OLLM safety under joint modality transformations, where traditional single-modal safety metrics often fail to capture cross-modal vulnerabilities (Pan et al., 10 Aug 2025).
1. Formal Definition and Mathematical Formulation
The CMSC-score is predicated on collecting Safety-scores, denoted $S_i$ for $i = 1, \dots, N$, over the $N$ modality sub-categories derived from parallelizing benchmark prompts. The computation proceeds as follows:
- Compute the mean Safety-score: $\bar{S} = \frac{1}{N} \sum_{i=1}^{N} S_i$
- Compute the standard deviation: $\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (S_i - \bar{S})^2}$
- Transform the dispersion into a consistency score with: $\text{CMSC-score} = e^{-\lambda \sigma}$
where $\lambda$ is a user-defined sensitivity hyperparameter (set to $\lambda = 5$ in (Pan et al., 10 Aug 2025)). Each $S_i \in [0, 1]$ by design, with $N = 24$ in the Omni-SafetyBench setting.
CMSC-score lies in the interval $(0, 1]$, with a value of 1 indicating perfect cross-modal consistency ($\sigma = 0$). As the divergence in per-modality Safety-scores increases, CMSC-score decays exponentially toward zero.
2. Component Terms and Metric Rationale
- $S_i$ (Safety-score): The OLLM’s safety metric for the $i$-th modality, itself a function of the conditional Attack Success Rate (C-ASR) and conditional Refusal Rate (C-RR), as per Equation (2) in (Pan et al., 10 Aug 2025).
- $\bar{S}$ (Mean): Represents the central tendency of safety across modalities.
- $\sigma$ (Standard Deviation): Measures fluctuations in safety performance and is the critical determinant for penalizing inconsistency.
- $\lambda$ (Sensitivity): Controls the penalty gradient; empirically set to 5 to ensure meaningful discriminative power across the observed dispersion range.
- The exponential mapping $e^{-\lambda \sigma}$ enforces a sharp penalty for even moderate inconsistency, focusing evaluative pressure on uniform safety irrespective of overall level.
The rationale is to avoid a metric that rewards models for selectively high safety in a single modality if that same model is arbitrarily vulnerable in others.
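To make the penalty gradient concrete, the following sketch tabulates $e^{-\lambda \sigma}$ across a few dispersion values; the alternative $\lambda$ settings (1 and 10) are illustrative, as the paper fixes $\lambda = 5$:

```python
import math

# How the sensitivity hyperparameter shapes the exponential penalty:
# for a fixed dispersion sigma, a larger lam punishes inconsistency
# more sharply, while sigma = 0 always yields a score of 1.
for sigma in (0.0, 0.05, 0.10, 0.20):
    scores = {lam: round(math.exp(-lam * sigma), 3) for lam in (1, 5, 10)}
    print(f"sigma={sigma:.2f} -> {scores}")
```

At $\lambda = 5$, even a modest dispersion of $\sigma = 0.10$ already cuts the score to about 0.61, which is the "sharp penalty for even moderate inconsistency" described above.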
3. Methodology and Benchmarking Protocol
The computation of CMSC-score is embedded within the Omni-SafetyBench evaluation pipeline as follows:
- Prompt Parallelization: Seed prompts are instantiated into 24 distinct modality inputs (text, image, audio, and all pairwise/tri-modal combinations).
- Comprehension Filtering: Each input’s comprehension is judged by an LLM-based judge (Qwen-Plus). Inputs not understood by the target OLLM are excluded from safety aggregation, as misunderstanding can artificially inflate safety metrics.
- Safety Labeling:
- Responses to understood inputs are annotated for harm (unsafe) and refusals.
- C-ASR and C-RR are calculated by conditioning on the “understood” subset.
- The Safety-score $S_i$ for each modality is then synthesized from C-ASR and C-RR according to Equation (2) of (Pan et al., 10 Aug 2025).
- Aggregation and CMSC-score Computation: Safety-scores are aggregated for modalities meeting a minimum comprehension threshold (understand rate $\geq 20\%$). $\bar{S}$ and $\sigma$ are computed on this set, and the final CMSC-score is produced via $e^{-\lambda \sigma}$.
A minimum of two modalities is required after filtering to yield a meaningful $\sigma$.
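The filtering-and-aggregation protocol can be sketched as follows. The tuple layout of `per_modality` and the specific combination of C-ASR and C-RR inside `safety_score` are illustrative assumptions, not the paper's exact Equation (2):

```python
import math

def safety_score(c_asr, c_rr):
    # Hypothetical combination of conditional Attack Success Rate and
    # conditional Refusal Rate; the paper's Equation (2) may differ.
    return ((1.0 - c_asr) + c_rr) / 2.0

def aggregate_cmsc(per_modality, min_understand_rate=0.2, lam=5.0):
    """per_modality maps modality name -> (understand_rate, c_asr, c_rr).
    Modalities below the comprehension threshold are dropped before the
    dispersion-based consistency score is computed."""
    scores = [safety_score(c_asr, c_rr)
              for (rate, c_asr, c_rr) in per_modality.values()
              if rate >= min_understand_rate]
    if len(scores) < 2:
        raise ValueError("need >= 2 modalities after comprehension filtering")
    mean = sum(scores) / len(scores)
    sigma = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return math.exp(-lam * sigma)

modalities = {
    "text":  (0.95, 0.10, 0.85),
    "image": (0.80, 0.25, 0.70),
    "audio": (0.15, 0.05, 0.90),  # excluded: understand rate below 20%
}
print(round(aggregate_cmsc(modalities), 3))  # 0.687
```

The audio entry illustrates why the comprehension filter matters: its high refusal rate would otherwise be counted as safety, even though the model simply failed to understand the input.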
4. Key Empirical Findings and Illustrative Example
Empirical analysis on ten OLLMs using Omni-SafetyBench yields CMSC-scores across a typical range of 0.42 to 0.83. Notable results from Table 7 of (Pan et al., 10 Aug 2025) include:
| Model Name | CMSC-score |
|---|---|
| gemini-2.5-pro | 0.83 |
| gemini-2.5-pro-preview | 0.80 |
| Unified-IO2-xxlarge | 0.81 |
| Qwen2.5-omni-7b | 0.74 |
| Baichuan-omni-1.5 | 0.56 |
| Minicpm-o-2.6 | 0.42 |
A high CMSC-score signals that an OLLM’s refusal and safe-content output tendencies are stable across text, audio, image, and joint modalities. Conversely, lower scores highlight cross-modal vulnerabilities even when other safety metrics may appear satisfactory.
Illustrative Example: Suppose, for instance, a model achieves Safety-scores of $0.9$ (text), $0.8$ (image), and $0.7$ (audio). Then $\bar{S} = 0.8$, $\sigma \approx 0.082$, and CMSC-score $= e^{-5 \times 0.082} \approx 0.66$. This reflects moderate cross-modal consistency.
5. Assumptions, Edge Cases, and Metric Constraints
- Score Range Enforcement: All $S_i$ are bounded within $[0, 1]$; $\bar{S}$ and the empirical $\sigma$ are correspondingly constrained.
- Filtering of Low-Comprehension Modalities: Any modality with an “understand rate” below 20% is excluded from aggregation to prevent models with persistent “I don't know” behaviors from attaining artificial consistency.
- Missing Data: Modalities lacking sufficient comprehension are omitted; at least two must remain.
- Perfect Consistency: $\sigma = 0$ yields CMSC-score $= 1$.
- Hyperparameter $\lambda$: Chosen heuristically; modifies the functional sensitivity but does not influence the score’s theoretical range.
A plausible implication is that practitioners must interpret CMSC-score in conjunction with average Safety-score, as uniform low safety yields high CMSC but is undesirable in practice.
6. Limitations and Potential Extensions
- Dispersion-Only Focus: CMSC-score exclusively measures the standard deviation of Safety-scores. Uniformly poor safety (e.g., identically low Safety-scores across all modalities) produces a high CMSC-score. Joint reporting with the mean Safety-score is essential for a comprehensive assessment.
- Heuristic Sensitivity Parameter (): The current setting (5) is not data-driven. Alternative parameter selection strategies may improve fidelity and discriminative utility.
- Outlier Susceptibility: A single outlier among modalities can dominate $\sigma$, potentially skewing the consistency evaluation. Robust statistical alternatives, such as the median absolute deviation, are suggested as possible refinements.
- Modality Weighting: The current formulation assumes independence and equal importance across modality sub-categories. Practical applications may warrant differential weighting, especially for more security-critical modalities (e.g., audio-visual joint attacks).
- Metric Integration: Future research could consider embedding cross-modal consistency within the Safety-score itself, or devising joint metrics that penalize both low mean safety and high dispersion simultaneously.
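The outlier concern raised above can be made concrete with a robust variant. The sketch below swaps the standard deviation for the median absolute deviation (MAD); this is the suggested refinement, not the paper's metric:

```python
import math
from statistics import median, pstdev

def robust_cmsc(scores, lam=5.0):
    """Consistency score using the median absolute deviation (MAD) instead
    of the standard deviation, so a single outlying modality cannot
    dominate the dispersion term. A hypothetical refinement."""
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    return math.exp(-lam * mad)

# One outlying modality (0.20) inflates sigma far more than the MAD:
scores = [0.80, 0.82, 0.78, 0.81, 0.20]
print(round(math.exp(-5 * pstdev(scores)), 3))  # std-based score, dragged down
print(round(robust_cmsc(scores), 3))            # MAD-based score stays high
```

Whether the robustness is desirable is a design question: a single catastrophically unsafe modality arguably *should* tank the consistency score, which argues for keeping the standard deviation.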
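One joint metric of the kind suggested above can be sketched as the product of the mean Safety-score and the exponential consistency term; this is a hypothetical design for illustration, not a metric proposed in the paper:

```python
import math

def joint_safety_consistency(scores, lam=5.0):
    """Hypothetical joint metric: rewards high mean safety AND low
    dispersion by multiplying the mean Safety-score with the exponential
    consistency term. An illustrative sketch, not a published metric."""
    n = len(scores)
    mean = sum(scores) / n
    sigma = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    return mean * math.exp(-lam * sigma)

# Uniformly low safety no longer scores well: consistency alone is not enough.
print(round(joint_safety_consistency([0.2, 0.2, 0.2]), 3))  # 0.2
print(round(joint_safety_consistency([0.9, 0.8, 0.7]), 3))  # 0.532
```

Under this formulation, a perfectly consistent but uniformly unsafe model scores its (low) mean, addressing the dispersion-only limitation noted above.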
7. Significance in OLLM Safety Research
CMSC-score provides a systematic and quantitative framework for identifying models exhibiting inconsistent refusal or safety-regulation behaviors across modality transformations. Its introduction addresses a key shortcoming in prior LLM safety evaluations, which rarely tested models on the same content delivered through different modalities or exposed multi-modal prompt-specific vulnerabilities. The metric is instrumental in:
- Highlighting models that, despite achieving high safety on canonical (text) benchmarks, falter under audio, visual, or compound modality transformations.
- Informing the development of evaluation methodologies and model architectures attuned to omni-modal robustness.
- Providing an actionable analytic for comparing OLLMs, interpreting cross-modal safety trade-offs, and spotlighting areas necessitating targeted safety intervention (Pan et al., 10 Aug 2025).
The adoption of CMSC-score establishes a foundational baseline for systematic cross-modal safety auditing in OLLMs and is poised to influence future paradigm design and regulatory benchmarking within the field.