Papers
Topics
Authors
Recent
Search
2000 character limit reached

Cross-Modal Safety Consistency (CMSC-score)

Updated 28 January 2026
  • The paper introduces CMSC-score to quantify cross-modal safety consistency through an exponential decay function applied to the standard deviation of modality-specific safety scores.
  • Methodology involves aggregating safety scores from 24 modality sub-categories using parallelized prompts and LLM-based comprehension filtering to ensure valid comparisons.
  • Empirical findings on OLLMs show CMSC-scores ranging from 0.42 to 0.83, revealing cross-modal vulnerabilities and emphasizing the need for uniform safety performance.

The Cross-Modal Safety Consistency Score (CMSC-score) is a quantitative metric introduced to assess the consistency of safety performance in Omni-modal LLMs (OLLMs) across a spectrum of input modalities, including text, audio, image, and their combinations. CMSC-score is designed to reveal whether an OLLM's refusal and safe-content behaviors are uniform when the same semantic prompt is delivered through different sensory inputs. This metric is a core contribution of Omni-SafetyBench, the first large-scale benchmark explicitly constructed for evaluating OLLM safety under joint modality transformations, where traditional single-modal safety metrics often fail to capture cross-modal vulnerabilities (Pan et al., 10 Aug 2025).

1. Formal Definition and Mathematical Formulation

The CMSC-score is predicated on collecting Safety-scores, denoted sis_i, for NN modality sub-categories derived from parallelizing benchmark prompts. The computation proceeds as follows:

  • Compute the mean Safety-score:

μ=1Ni=1Nsi\mu = \frac{1}{N}\sum_{i=1}^N s_i

  • Compute the standard deviation:

σ=1Ni=1N(siμ)2\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^N (s_i - \mu)^2}

  • Transform the dispersion into a consistency score with:

CMSC-score=eασ\text{CMSC-score} = e^{-\alpha \sigma}

where α\alpha is a user-defined sensitivity hyperparameter (set to 5 in (Pan et al., 10 Aug 2025)). Each si[0,1]s_i\in [0,1] by design, with N=24N=24 in the Omni-SafetyBench setting.

CMSC-score lies in the interval (0,1](0,1], with a value of 1 indicating perfect cross-modal consistency (σ=0\sigma=0). As the divergence in per-modality Safety-scores increases, CMSC-score decays exponentially toward zero.

2. Component Terms and Metric Rationale

  • sis_i (Safety-score): The OLLM’s safety metric for the ii-th modality, itself a conditional function of the conditional Attack Success Rate (C-ASR) and conditional Refusal Rate (C-RR), as per Equation (2) in (Pan et al., 10 Aug 2025).
  • μ\mu (Mean): Represents the central tendency of safety across modalities.
  • σ\sigma (Standard Deviation): Measures fluctuations in safety performance and is the critical determinant for penalizing inconsistency.
  • α\alpha (Sensitivity): Controls the penalty gradient; empirically set to 5 to ensure meaningful discriminative power across the observed dispersion range.
  • The exponential mapping eασe^{-\alpha \sigma} enforces a sharp penalty for even moderate inconsistency, focusing evaluative pressure on uniform safety irrespective of overall level.

The rationale is to avoid a metric that rewards models for selectively high safety in a single modality if that same model is arbitrarily vulnerable in others.

3. Methodology and Benchmarking Protocol

The computation of CMSC-score is embedded within the Omni-SafetyBench evaluation pipeline as follows:

  1. Prompt Parallelization: Seed prompts are instantiated into 24 distinct modality inputs (text, image, audio, and all pairwise/tri-modal combinations).
  2. Comprehension Filtering: Each input’s comprehension is judged by an LLM-based judge (Qwen-Plus). Inputs not understood by the target OLLM are excluded from safety aggregation, as misunderstanding can artifically inflate safety metrics.
  3. Safety Labeling:

    • Responses to understood inputs are annotated for harm (unsafe) and refusals.
    • C-ASR and C-RR are calculated by conditioning on the “understood” subset.
    • The Safety-score is then synthesized as:

    Safety-score=(1C-ASR)(1+λC-RR)1+λ,λ=0.5\text{Safety-score} = \frac{(1-\text{C-ASR})(1+\lambda\,\text{C-RR})}{1+\lambda}, \quad \lambda=0.5

  4. Aggregation and CMSC-score Computation: Safety-scores s1,,sNs_1,\ldots,s_N are aggregated for modalities meeting a minimum comprehension threshold (understand rate \geq20%). μ\mu and σ\sigma are computed on this set, and the final CMSC-score is produced via eασe^{-\alpha \sigma}.

A minimum of two modalities is required after filtering to yield a meaningful σ\sigma.

4. Key Empirical Findings and Illustrative Example

Empirical analysis on ten OLLMs using Omni-SafetyBench yields CMSC-scores across a typical range of 0.42 to 0.83. Notable results from Table 7 of (Pan et al., 10 Aug 2025) include:

Model Name CMSC-score
gemini-2.5-pro 0.83
gemini-2.5-pro-preview 0.80
Unified-IO2-xxlarge 0.81
Qwen2.5-omni-7b 0.74
Baichuan-omni-1.5 0.56
Minicpm-o-2.6 0.42

A high CMSC-score signals that an OLLM’s refusal and safe-content output tendencies are stable across text, audio, image, and joint modalities. Conversely, lower scores highlight cross-modal vulnerabilities even when other safety metrics may appear satisfactory.

Illustrative Example: If a model achieves s1=0.80s_1=0.80 (text), s2=0.60s_2=0.60 (image), and s3=0.70s_3=0.70 (audio), then

μ=0.70, σ0.082, CMSC0.66.\mu=0.70,\ \sigma\approx 0.082,\ \text{CMSC}\approx 0.66.

This reflects moderate cross-modal consistency.

5. Assumptions, Edge Cases, and Metric Constraints

  • Score Range Enforcement: All sis_i are bounded within [0,1][0,1]; μ\mu and the empirical σ\sigma are correspondingly constrained.
  • Filtering of Low-Comprehension Modalities: Any modality with an “understand rate” <20%<20\% is excluded from aggregation to prevent models with persistent “I don't know” behaviors from attaining artificial consistency.
  • Missing Data: Modalities lacking sufficient comprehension are omitted; at least two must remain.
  • Perfect Consistency: σ=0\sigma=0 yields CMSC-score=1\text{CMSC-score}=1.
  • Hyperparameter α\alpha: Chosen heuristically; modifies the functional sensitivity but does not influence the score’s theoretical range.

A plausible implication is that practitioners must interpret CMSC-score in conjunction with average Safety-score, as uniform low safety yields high CMSC but is undesirable in practice.

6. Limitations and Potential Extensions

  • Dispersion-Only Focus: CMSC-score exclusively measures standard deviation of Safety-scores. Uniformly poor safety (e.g., si0.2s_i \approx 0.2 across all modalities) produces high CMSC. Joint reporting with the mean Safety-score is essential for a comprehensive assessment.
  • Heuristic Sensitivity Parameter (α\alpha): The current setting (5) is not data-driven. Alternative parameter selection strategies may improve fidelity and discriminative utility.
  • Outlier Susceptibility: Single outliers among modalities can dominate σ\sigma, potentially skewing consistency evaluation. Robust statistical alternatives, such as median absolute deviation, are suggested as possible refinements.
  • Modality Weighting: The current formulation assumes independence and equal importance across modality sub-categories. Practical applications may warrant differential weighting, especially for more security-critical modalities (e.g., audio-visual joint attacks).
  • Metric Integration: Future research could consider embedding cross-modal consistency within the Safety-score itself, or devising joint metrics that penalize both low mean safety and high dispersion simultaneously.

7. Significance in OLLM Safety Research

CMSC-score provides a systematic and quantitative framework for identifying models exhibiting inconsistent refusal or safety-regulation behaviors across modality transformations. Its introduction addresses a key shortcoming in prior LLM safety evaluations, which rarely tested models on the same content delivered through different modalities or exposed multi-modal prompt-specific vulnerabilities. The metric is instrumental in:

  • Highlighting models that, despite achieving high safety on canonical (text) benchmarks, falter under audio, visual, or compound modality transformations.
  • Informing the development of evaluation methodologies and model architectures attuned to omni-modal robustness.
  • Providing an actionable analytic for comparing OLLMs, interpreting cross-modal safety trade-offs, and spotlighting areas necessitating targeted safety intervention (Pan et al., 10 Aug 2025).

The adoption of CMSC-score establishes a foundational baseline for systematic cross-modal safety auditing in OLLMs and is poised to influence future paradigm design and regulatory benchmarking within the field.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (1)

Topic to Video (Beta)

No one has generated a video about this topic yet.

Whiteboard

No one has generated a whiteboard explanation for this topic yet.

Follow Topic

Get notified by email when new papers are published related to Cross-Modal Safety Consistency Score (CMSC-score).