Correlation Impact Ratio (CIR)
- CIR is a correlation-aware feature attribution score that measures sign-aligned co-movement between features and model outputs after robust centering.
- It uses a single-pass, sub-sampling methodology and quantile-based centering to ensure scalability and efficiency in streaming and edge-deployment scenarios.
- BlockCIR extends the approach to groups by mitigating double-counting in highly correlated feature clusters, yielding stable and interpretable global rankings.
The Correlation Impact Ratio (CIR) is a correlation-aware feature attribution score designed for explainable AI (XAI) in complex models and large or heterogeneous datasets. CIR quantifies the sign-aligned co-movement between features and model outputs after robust centering, delivering single-pass, scalable, and computationally efficient global explainability. The ExCIR framework introduces rigorous invariance and stability properties while extending naturally to groupwise attributions via BlockCIR, which addresses double-counting in highly correlated feature clusters. CIR's design allows lightweight transfer protocols that recreate full-model rankings with a fraction of the data, making it well suited to edge, streaming, and real-world deployment scenarios (Sengupta et al., 20 Nov 2025).
1. Formal Definition and Mathematical Construction
Let $X \in \mathbb{R}^{n \times d}$ be the observed feature matrix ($n$: samples, $d$: features), with feature columns $x_j \in \mathbb{R}^n$. Let $y \in \mathbb{R}^n$ be the vector of model outputs (e.g., logits, predictions). Both inputs and outputs are robustly centered using the mid-mean operator $m(v) = \tfrac{1}{2}\big(Q_{1/4}(v) + Q_{3/4}(v)\big)$, where $Q_\alpha(v)$ denotes the empirical $\alpha$-quantile of $v$. The robustly centered values are $\tilde{x}_{ij} = x_{ij} - m(x_j)$ and $\tilde{y}_i = y_i - m(y)$.
For a feature set $S \subseteq \{1,\dots,d\}$, define the per-sample co-movements $c_i(S) = \tilde{y}_i \sum_{j \in S} \tilde{x}_{ij}$. Aggregate across samples to get the total signed co-movement $T(S) = \sum_{i=1}^{n} c_i(S)$ and the co-movement mass $M(S) = \sum_{i=1}^{n} |c_i(S)|$. The Correlation Impact Ratio for set $S$ is then $$\mathrm{CIR}(S) = \frac{1}{2}\left(1 + \frac{T(S)}{M(S)}\right).$$
For individual features, $\mathrm{CIR}_j := \mathrm{CIR}(\{j\})$ with $c_i(\{j\}) = \tilde{x}_{ij}\,\tilde{y}_i$ (Sengupta et al., 20 Nov 2025).
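The construction is straightforward to implement. Below is a minimal Python sketch (illustrative, not the authors' code), assuming the mid-mean is the average of the empirical $1/4$- and $3/4$-quantiles as defined above; all function names are ours.

```python
import numpy as np

def midmean(v: np.ndarray) -> float:
    """Robust center: average of the empirical 1/4- and 3/4-quantiles."""
    return 0.5 * (np.quantile(v, 0.25) + np.quantile(v, 0.75))

def cir(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Per-feature Correlation Impact Ratio for features X (n, d) and outputs y (n,)."""
    Xc = X - np.apply_along_axis(midmean, 0, X)  # center each feature column
    yc = y - midmean(y)                          # center the model outputs
    C = Xc * yc[:, None]                         # per-sample co-movements c_i({j})
    T = C.sum(axis=0)                            # total signed co-movement T({j})
    M = np.abs(C).sum(axis=0)                    # co-movement mass M({j})
    safe_M = np.where(M > 0, M, 1.0)             # guard against division by zero
    return np.where(M > 0, 0.5 * (1.0 + T / safe_M), 0.5)
```

A score near $1$ indicates consistently positive alignment, a score near $0$ consistently negative alignment, and $1/2$ no sign-consistent co-movement.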
2. Theoretical Properties and Interpretation
CIR measures the fraction of co-movement mass that is sign-consistent; it can equivalently be written as $$\mathrm{CIR}(S) = \frac{M^{+}(S)}{M^{+}(S) + M^{-}(S)},$$ where $M^{\pm}(S) = \sum_{i=1}^{n} \max\{\pm c_i(S), 0\}$. Key invariance properties include:
- Translation and positive-scale invariance: adding constants to $x_j$ or $y$, or rescaling them by positive factors, does not affect CIR, due to the quantile centering and self-cancellation in the ratio.
- Sign symmetry: flipping the sign of $x_j$ or $y$ complements the score ($\mathrm{CIR} \mapsto 1 - \mathrm{CIR}$), as $T \mapsto -T$ but $M$ is invariant.
- Monotonicity: Increasing the magnitude of aligned co-movement or reducing anti-aligned terms increases CIR.
CIR down-weights features with co-movement that fluctuates in sign, while rewarding consistently aligned (positive/negative) associations between features and outputs, even under feature correlation. The construction ensures post-hoc, single-pass computation, requiring only the model's outputs and feature data after training (Sengupta et al., 20 Nov 2025).
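These invariances are easy to verify numerically. The snippet below reuses the `cir` sketch above on synthetic data (coefficients and seeds are illustrative choices):

```python
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([2.0, -1.0, 0.0]) + 0.1 * rng.normal(size=500)

s = cir(X, y)
# Translation and positive-scale invariance: affine maps with positive slope leave CIR unchanged.
assert np.allclose(s, cir(3.0 * X + 7.0, 2.0 * y - 5.0))
# Sign symmetry: flipping the output complements the score.
assert np.allclose(s, 1.0 - cir(X, -y))
```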
3. Algorithmic Procedure and Complexity
CIR and its group extension are computed using a lightweight single-pass protocol, particularly advantageous when $n$ is large or in streaming/edge scenarios:
- Random sub-sampling: select a random subset of $m \ll n$ rows, keeping the model, hyperparameters, seed, and validation split fixed. With sub-sampling ratios $\rho \approx 0.2$–$0.4$, the method empirically recovers the global ranking structure at a $3\times$ or greater speed-up.
- Computation:
- Quantile-based centering: $O(n)$ with streaming quantile sketches, or $O(n \log n)$ sort-based.
- Aggregation: $O(nd)$ time and $O(d)$ space for single features, and $O\big(n \sum_{g} |S_g|\big)$ time for groupwise collections.
Pseudocode outline:
- Compute robust centers for features and outputs on the sampled rows.
- Accumulate $T_j = \sum_i \tilde{x}_{ij}\tilde{y}_i$ and $M_j = \sum_i |\tilde{x}_{ij}\tilde{y}_i|$ (and optionally $T(S)$, $M(S)$ for groups).
- Compute $\mathrm{CIR}$ as $\tfrac{1}{2}\big(1 + T/M\big)$, or as $1/2$ if $M = 0$.
This protocol permits reproducible, transferable scoring from partial data and is compatible with quantile sketches for streaming (Sengupta et al., 20 Nov 2025).
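A hedged sketch of the sub-sampling step follows, reusing `cir` from above; the ratio `rho = 0.3` and the seed are illustrative choices, with the fixed seed mirroring the requirement that model, hyperparameters, seed, and split stay fixed:

```python
def cir_subsampled(X: np.ndarray, y: np.ndarray, rho: float = 0.3, seed: int = 0) -> np.ndarray:
    """CIR computed on a random fraction rho of the rows, with a fixed seed."""
    rng = np.random.default_rng(seed)
    m = max(1, int(rho * len(y)))                  # number of sampled rows
    idx = rng.choice(len(y), size=m, replace=False)
    return cir(X[idx], y[idx])

# Compare the sub-sampled ranking against the full-data ranking.
full_order = np.argsort(-cir(X, y))
sub_order = np.argsort(-cir_subsampled(X, y))
```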
4. Groupwise Extension: BlockCIR and Double-Counting Mitigation
BlockCIR generalizes ExCIR to collective feature attribution for predefined or data-driven sets $S \subseteq \{1,\dots,d\}$: $$\mathrm{BlockCIR}(S) = \frac{1}{2}\left(1 + \frac{\sum_{i=1}^{n} c_i(S)}{\sum_{i=1}^{n} |c_i(S)|}\right), \qquad c_i(S) = \tilde{y}_i \sum_{j \in S} \tilde{x}_{ij}.$$ BlockCIR aggregates aligned co-movement over sets, mitigating the double-counting effect present in collinear or redundant groups (e.g., synonyms, duplicated sensors, highly correlated gene clusters). Group construction may be:
- Domain-driven: using prior taxonomies such as medical code groupings or sensor channels.
- Data-driven: via hierarchical clustering or correlation thresholding to extract feature clusters.
- Model-driven: based on learned embeddings, heads, or structure-specific representations.
By scoring groups as single units, BlockCIR provides more stable, interpretable global rankings in strongly correlated feature regimes (Sengupta et al., 20 Nov 2025).
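A minimal BlockCIR sketch under the same assumptions: centered features are summed within a group before being multiplied by the centered output, so collinear columns contribute as one unit instead of being double-counted. Group names and column indices are illustrative:

```python
def block_cir(X: np.ndarray, y: np.ndarray, groups: dict) -> dict:
    """BlockCIR for named feature groups, each given as a list of column indices."""
    Xc = X - np.apply_along_axis(midmean, 0, X)   # robustly center features
    yc = y - midmean(y)                           # robustly center outputs
    scores = {}
    for name, cols in groups.items():
        c = Xc[:, cols].sum(axis=1) * yc          # per-sample group co-movement c_i(S)
        T, M = c.sum(), np.abs(c).sum()
        scores[name] = 0.5 * (1.0 + T / M) if M > 0 else 0.5
    return scores

# e.g., two duplicated sensor channels scored jointly rather than separately:
# block_cir(X, y, {"sensor_pair": [0, 1], "other": [2]})
```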
5. Empirical Evaluation and Comparative Results
Across 29 benchmark tasks (text, tabular, image, signal, remote sensing, and synthetic datasets using logistic regression and XGBoost backbones), ExCIR delivers:
- High agreement with established global attribution baselines on top-$k$ feature overlap, with high retention at sub-sampling ratios $\rho \approx 0.2$–$0.4$.
- Runtime reductions: $3\times$ or greater wall-clock improvement (e.g., $18.3\,\mathrm{s} \to 4.1\,\mathrm{s}$ for har6 under sub-sampling).
- Score robustness: high Spearman correlation between mid-mean and median centering; mean centering degrades under outliers.
- BlockCIR effectiveness: preserves top-ranked features in collinear blocks, increases top-$k$ overlap, and avoids diluting importance across co-moving features.
Evaluation metrics include Jaccard overlap, Spearman's $\rho$, Kendall's $\tau$, the Orthogonal Procrustes residual, and symmetric KL divergence. Compared to SHAP, LIME, and PFI, CIR/BlockCIR forgo model perturbations, scale linearly, and respect feature-correlation structure; these properties are lacking in gradient-based or kernel-based (HSIC, MI) alternatives (Sengupta et al., 20 Nov 2025).
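Two of these agreement metrics are straightforward to reproduce. The sketch below (illustrative; assumes SciPy is available and reuses the earlier `cir` and `cir_subsampled` sketches) computes the top-$k$ Jaccard overlap and Spearman's $\rho$ between full-data and sub-sampled scores:

```python
from scipy.stats import spearmanr

def topk_jaccard(s1: np.ndarray, s2: np.ndarray, k: int = 10) -> float:
    """Jaccard overlap between the top-k feature sets of two score vectors."""
    a = set(np.argsort(-s1)[:k].tolist())
    b = set(np.argsort(-s2)[:k].tolist())
    return len(a & b) / len(a | b)

s_full, s_sub = cir(X, y), cir_subsampled(X, y)
rho_s, _ = spearmanr(s_full, s_sub)
print(topk_jaccard(s_full, s_sub, k=2), rho_s)
```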
| Method | Perturbation-Free | Correlation-Aware | Linear Scalability |
|---|---|---|---|
| ExCIR/BlockCIR | Yes | Yes | Yes |
| SHAP/LIME/PFI | No | No | No |
| Gradients | Yes | No | Yes |
| Kernel-based (HSIC) | Yes | No | No |
6. Applications, Limitations, and Open Problems
CIR and BlockCIR address global feature ranking in large tabular/text corpora, streaming scenarios with quantile sketches, and group-level attribution (multi-sensor, multi-channel) across vision and NLP via class-conditioned extensions.
Limitations:
- CIR captures correlation, not causality; latent confounding can affect interpretations.
- It is less sensitive to nonlinear, higher-order feature–output dependencies; features that interact with the output only nonlinearly (e.g., sinusoidal relationships) produce near-zero signed co-movement, leaving CIR near the uninformative value of $1/2$ (see the toy check after this list).
- In extremely heavy-tailed noise, robust centering may be insufficient, requiring further trimming or robustification.
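A toy check of the nonlinearity caveat, reusing the `cir` sketch (the symmetric input range is an illustrative choice): a purely sinusoidal dependence yields near-zero signed co-movement, so CIR stays near the uninformative value of $1/2$.

```python
x = np.linspace(-2 * np.pi, 2 * np.pi, 2001)
y_nl = np.cos(x)                 # output depends on x only through a nonlinearity
print(cir(x[:, None], y_nl))     # ~0.5: aligned and anti-aligned co-movement cancel
```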
Open Directions:
- Conditional ExCIR (cCIR): Isolating the unique effect of features after accounting for others.
- Mutual-ExCIR (mCIR): Extending to capture nonlinear or higher-order interactions by, for example, kernelizing co-movements.
- Adaptive grouping: Learning optimal feature-set partitions via graph or attention-based approaches.
- Sample complexity theory: Establishing the number of samples required to reach a target top-$k$ agreement under sub-sampling (Sengupta et al., 20 Nov 2025).
CIR and BlockCIR provide correlation-aware, efficient, and robust global explainability suitable for large-scale deployment and resource-constrained settings, complementing but not replacing methods sensitive to causal and nonlinear feature effects.