Consistency Ratio (CR) in AHP

Updated 26 December 2025

Consistency Ratio (CR) is a normalized metric that quantifies the deviation from perfect consistency in pairwise comparison matrices of the Analytic Hierarchy Process (AHP).
It establishes clear acceptance thresholds—CR < 0.05 is very good, 0.05 ≤ CR < 0.10 is acceptable, and CR ≥ 0.10 signals unacceptable inconsistency—guiding decision revisions.
Recent evaluations reveal CR’s limitations, such as false negatives and order bias, prompting the development of optimization models and alternative triadic measures.

The Consistency Ratio (CR) is the most widely adopted quantitative index for measuring the deviation from perfect consistency in pairwise comparison matrices (PCMs) of the Analytic Hierarchy Process (AHP). Introduced by Saaty, CR provides a normalized metric for inconsistency by benchmarking a matrix’s principal eigenvalue against expected values derived from random matrices, supporting acceptance/rejection thresholds in multi-criteria decision analysis. The theoretical foundation, practical mechanics, and empirical performance of CR, as well as critical limitations and recent alternatives, are central to the evaluation and improvement of decision support systems utilizing PCMs.

1. Formal Definition and Mathematical Properties

Given an $n \times n$ positive reciprocal PCM $A = (a_{ij})$, with $a_{ij} > 0$ and $a_{ji} = 1/a_{ij}$, the principal eigenvalue (Perron root) $\lambda_{\max}(A)$ is computed first. Two indices are then defined:

Consistency Index ( $\mathrm{CI}$ ):

$\mathrm{CI}(A) = \frac{\lambda_{\max}(A) - n}{n - 1}$

Consistency Ratio ( $\mathrm{CR}$ ):

$\mathrm{CR}(A) = \frac{\mathrm{CI}(A)}{\mathrm{RI}_n}$

$\mathrm{RI}_n$ is the Random Index: the mean $\mathrm{CI}$ over a large sample of random reciprocal matrices of order $n$ whose entries are drawn from the canonical AHP scale (typically $\{1/9, 1/8, \ldots, 1/2, 1, 2, \ldots, 9\}$).

For every order $n$, a perfectly consistent matrix satisfies $\lambda_{\max} = n$, so $\mathrm{CI} = 0$ and hence $\mathrm{CR} = 0$ wherever $\mathrm{RI}_n > 0$ (that is, for $n \geq 3$).
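The definitions above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the RI values are copied from the table of typical RI values in this section, and returning 0 when $\mathrm{RI}_n = 0$ is a convention assumed here for $n \leq 2$.

```python
import numpy as np

# Random Index values copied from the table of typical RI values in this section.
RI = {1: 0.00, 2: 0.00, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def consistency_index(A):
    """CI(A) = (lambda_max - n) / (n - 1) for a positive reciprocal matrix A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n < 2:
        return 0.0
    # For a positive matrix the Perron root is the eigenvalue with largest real part.
    lam_max = np.max(np.linalg.eigvals(A).real)
    return (lam_max - n) / (n - 1)

def consistency_ratio(A):
    """CR(A) = CI(A) / RI_n; returns 0 when RI_n = 0 (assumed convention for n <= 2)."""
    n = np.asarray(A).shape[0]
    ri = RI[n]
    return 0.0 if ri == 0 else consistency_index(A) / ri

# Perfectly consistent matrix built from weights w: a_ij = w_i / w_j.
w = np.array([4.0, 2.0, 1.0])
consistent = np.outer(w, 1.0 / w)

# Same matrix with one judgment (and its reciprocal) perturbed.
perturbed = consistent.copy()
perturbed[0, 2], perturbed[2, 0] = 9.0, 1.0 / 9.0

print(consistency_ratio(consistent))
print(consistency_ratio(perturbed))
```

The consistent matrix yields a CR of zero up to floating-point error, while the perturbed one yields a strictly positive value.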

Typical Random Index (RI) Values

n	RI
1	0.00
2	0.00
3	0.58
4	0.90
5	1.12
6	1.24
7	1.32
8	1.41
9	1.45
10	1.49
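The RI values are simulation estimates, so they can be approximated with a short Monte Carlo run. The sketch below assumes entries are drawn uniformly from the 17-point scale; the sample size, seed, and function names are illustrative choices, not taken from the source. Published RI tables differ slightly depending on sample size and sampling scheme, so the estimates need not match the table exactly.

```python
import numpy as np

# Canonical 17-point AHP scale: 1/9, 1/8, ..., 1/2, 1, 2, ..., 9.
SCALE = np.array([1.0 / k for k in range(9, 1, -1)] + [float(k) for k in range(1, 10)])

def random_reciprocal_matrix(n, rng):
    """Fill the upper triangle uniformly from SCALE and mirror the reciprocals."""
    A = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            A[i, j] = rng.choice(SCALE)
            A[j, i] = 1.0 / A[i, j]
    return A

def consistency_index(A):
    """CI(A) = (lambda_max - n) / (n - 1)."""
    n = A.shape[0]
    lam_max = np.max(np.linalg.eigvals(A).real)
    return (lam_max - n) / (n - 1)

def estimate_random_index(n, samples=2000, seed=0):
    """Mean CI over `samples` random reciprocal matrices of order n."""
    rng = np.random.default_rng(seed)
    values = [consistency_index(random_reciprocal_matrix(n, rng))
              for _ in range(samples)]
    return float(np.mean(values))

for n in (3, 4, 5):
    print(n, round(estimate_random_index(n), 3))
```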

2. Interpretation Benchmarks and Acceptance Thresholds

Standard practice, justified by Saaty’s original simulations and subsequent refinements, establishes specific CR acceptance thresholds:

$\mathrm{CR} < 0.05$: “very good” consistency
$0.05 \leq \mathrm{CR} < 0.10$: “acceptable” consistency
$\mathrm{CR} \geq 0.10$: “unacceptably inconsistent”; revision of judgments is recommended

The de facto threshold in practical AHP implementations is $\mathrm{CR} < 0.10$, known as the “ten-percent rule” (Bose, 7 May 2025, 1311.0748).

3. Relationships to Other Inconsistency Indices

CR is a global index, reflecting overall transitivity violation through the lens of the Perron eigenvalue. Two prominent local triad-based measures offer alternative perspectives:

Koczkodaj–Duszak’s CM (maximum triad inconsistency):

$\mathrm{CM}(A) = \max_{i<j<k} \min\left\{\left|1-\frac{a_{ik}}{a_{ij}a_{jk}}\right|,\;\left|1-\frac{a_{ij}a_{jk}}{a_{ik}}\right|\right\}$

CM reflects the single worst triad violation.

Peláez–Lamata’s CI (average determinant of triads):

$\mathrm{CI}_{PL}(A) = \frac{1}{N_3}\sum_{i<j<k}\det([a_{pq}]_{p,q\in\{i,j,k\}})$

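where $N_3 = \binom{n}{3}$ is the number of triads in the matrix. Each determinant is nonnegative and vanishes exactly when its triad is consistent, so $\mathrm{CI}_{PL}(A) = 0$ if and only if $A$ is consistent.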
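Both triad measures can be computed by direct enumeration. The sketch below is illustrative only (function names are hypothetical) and relies on the consistency condition $a_{ik} = a_{ij}a_{jk}$: writing $T = a_{ik}/(a_{ij}a_{jk})$, the triad term of CM is $\min\{|1-T|, |1-1/T|\}$ and the triad determinant equals $T + 1/T - 2$. It assumes $n \geq 3$.

```python
from itertools import combinations
import numpy as np

def triad_ratio(A, i, j, k):
    """T = a_ik / (a_ij * a_jk); T == 1 exactly when the triad is consistent."""
    return A[i, k] / (A[i, j] * A[j, k])

def koczkodaj_cm(A):
    """Maximum over all triads of min(|1 - T|, |1 - 1/T|)."""
    n = A.shape[0]
    return max(min(abs(1 - t), abs(1 - 1 / t))
               for i, j, k in combinations(range(n), 3)
               for t in [triad_ratio(A, i, j, k)])

def pelaez_lamata_ci(A):
    """Mean determinant of the 3x3 principal submatrices, one per triad."""
    n = A.shape[0]
    dets = [np.linalg.det(A[np.ix_(t, t)]) for t in combinations(range(n), 3)]
    return float(np.mean(dets))

# Consistent 3x3 matrix from weights, then one perturbed judgment (T = 9/4).
w = np.array([4.0, 2.0, 1.0])
consistent = np.outer(w, 1.0 / w)
perturbed = consistent.copy()
perturbed[0, 2], perturbed[2, 0] = 9.0, 1.0 / 9.0

print(koczkodaj_cm(consistent), pelaez_lamata_ci(consistent))
print(koczkodaj_cm(perturbed), pelaez_lamata_ci(perturbed))
```

Because CM takes a maximum, a single bad triad determines its value, whereas the Peláez–Lamata index averages that triad's contribution over all $N_3$ triads.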

Both $\mathrm{CI}(A)$ (from CR) and $\mathrm{CI}_{PL}(A)$ are convex in the log-space representation $x_{ij} = \log a_{ij}$, whereas $\mathrm{CM}(A)$ is only quasi-convex, although composing it with the increasing map $f(t) = t/(1 - t)$ yields a convex function. A plausible implication is that CR’s global nature may overlook severe local violations, motivating further triad-based analyses (1311.0748).

4. Reducing Inconsistency via Optimization

Bozóki, Fülöp, and Poesz (1311.0748) introduce two mixed-integer nonlinear programming (MINLP) models to support real-time reduction of CR in PCMs:

Model A (Minimal Correction):

Objective: minimize the number of modified entries $y_{ij}$ to achieve $\mathrm{CR}(\tilde{A}) \leq \alpha$
Constraints:
- Skew-symmetry: $x_{ji} = -x_{ij}$
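- Bound on modifications: the deviation $|x_{ij} - d_{ij}|$ from the original log-judgment $d_{ij} = \log a_{ij}$ is controlled by binary variables $y_{ij}$, so that an entry can move only when it is flagged as modified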
- Perron root constraint via variational characterization:
$\sum_{j=1}^n \exp(x_{ij} + z_j - z_i) \leq a^* = n+(n-1)\mathrm{RI}_n \cdot \alpha, \quad \forall i$
Only the smallest necessary set of entries is flagged for revision, rather than re-eliciting all $n(n-1)/2$ judgments.
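The Perron root constraint in Model A rests on two facts that are easy to check numerically: $\mathrm{CR} \leq \alpha$ is equivalent to $\lambda_{\max} \leq a^* = n + (n-1)\mathrm{RI}_n\alpha$, and for any $z$ the largest row sum $\sum_j \exp(x_{ij} + z_j - z_i)$ is an upper bound on $\lambda_{\max}$ that becomes tight at $z = \log w$, with $w$ the Perron vector. The sketch below only verifies these facts on a hypothetical example matrix; it is not the MINLP itself, and all names are illustrative.

```python
import numpy as np

def perron_root_and_vector(A):
    """Largest eigenvalue of a positive matrix and a positive eigenvector."""
    vals, vecs = np.linalg.eig(A)
    idx = int(np.argmax(vals.real))
    return float(vals[idx].real), np.abs(vecs[:, idx].real)

def constraint_row_sums(X, z):
    """Left-hand sides sum_j exp(x_ij + z_j - z_i), one value per row i."""
    return np.exp(X + z[None, :] - z[:, None]).sum(axis=1)

def lambda_bound(n, ri_n, alpha):
    """a* = n + (n - 1) * RI_n * alpha, the largest lambda_max with CR <= alpha."""
    return n + (n - 1) * ri_n * alpha

# Hypothetical 4x4 PCM with one mildly inconsistent judgment (a_23 = 3).
A = np.array([[1.0, 2.0, 4.0, 8.0],
              [1 / 2, 1.0, 3.0, 4.0],
              [1 / 4, 1 / 3, 1.0, 2.0],
              [1 / 8, 1 / 4, 1 / 2, 1.0]])
X = np.log(A)
lam, w = perron_root_and_vector(A)

# At z = log(w) every row sum equals lambda_max.
at_perron = constraint_row_sums(X, np.log(w))

# For arbitrary z the largest row sum never drops below lambda_max.
rng = np.random.default_rng(0)
worst_random = min(constraint_row_sums(X, rng.normal(size=4)).max()
                   for _ in range(100))

a_star = lambda_bound(4, 0.90, 0.10)  # RI_4 = 0.90, alpha = 0.10
print(lam, a_star, at_perron, worst_random)
```

Since the constraint must hold for every row $i$ with a common $z$, it is feasible exactly when $\lambda_{\max}(\tilde{A}) \leq a^*$.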

Model B (Budgeted Correction):

Objective: minimize $\mathrm{CR}$ given a ceiling $K$ on the number of entry modifications
Similar set of variables and constraints; enforces $\sum y_{ij} \leq K$
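For intuition about the budgeted setting, the case $K = 1$ can be explored by brute force. The sketch below is a stand-in, not the MINLP from the source: it assumes replacement values are restricted to the discrete 1/9 to 9 scale, tries every single-entry change, and keeps the one with the smallest $\lambda_{\max}$ (equivalent to the smallest CR for fixed $n$). Function names and the test matrix are hypothetical.

```python
import numpy as np

# Discrete AHP scale used as candidate replacement values (an assumption of this sketch).
SCALE = [1.0 / k for k in range(9, 1, -1)] + [float(k) for k in range(1, 10)]

def lambda_max(A):
    """Perron root of a positive matrix."""
    return float(np.max(np.linalg.eigvals(A).real))

def best_single_change(A):
    """Return (lambda_max, (i, j), value) for the best one-entry modification.

    The position and value are None if no single change improves on A.
    """
    n = A.shape[0]
    best = (lambda_max(A), None, None)
    for i in range(n):
        for j in range(i + 1, n):
            for v in SCALE:
                B = A.copy()
                B[i, j], B[j, i] = v, 1.0 / v  # keep the matrix reciprocal
                lam = lambda_max(B)
                if lam < best[0] - 1e-12:
                    best = (lam, (i, j), v)
    return best

# Consistent matrix from weights (8, 4, 2, 1) with one corrupted judgment.
w = np.array([8.0, 4.0, 2.0, 1.0])
A = np.outer(w, 1.0 / w)
A[0, 3], A[3, 0] = 0.5, 2.0

lam, position, value = best_single_change(A)
print(lam, position, value)
```

On this example the search recovers the corrupted entry, since restoring it is the only single change that makes the matrix fully consistent.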

Convexity in log-space makes both models solvable with off-the-shelf mixed-integer convex solvers (CPLEX, Gurobi), and they remain tractable in practice up to $n \approx 10$ to $15$.

5. Empirical Performance and Limitations of CR

Recent systematic evaluations (Bose, 7 May 2025) highlight notable deficiencies of CR when applied to human-elicited (“logical”) PCMs:

Low accuracy: On simulated “logical” PCMs of order 4–12, CR achieves only $\approx 50\%$ correct classification (consistent/inconsistent).
False negatives: $5.5\%$ of actually inconsistent matrices are classified as consistent.
Severe order bias: For $n \geq 8$, fewer than $5\%$ of logical PCMs pass the CR test, even though unsupervised clustering suggests that more than $50\%$ should qualify as consistent.
Extremely low specificity: CR systematically under-identifies consistent matrices, erring toward over-rejection.

Benchmarking against the triadic preference reversal (PR) method yields:

Metric	CR method	PR method
Accuracy	50%	97%
Sensitivity	~100%	96%
Specificity	~10%	97%
False Neg.	5.5%	2.6%

Share of logical PCMs classified as consistent, by matrix order:

Order	Ab-initio	CR method	PR method
4	72.9%	28.1%	74.8%
6	59.0%	12.6%	59.6%
8	56.9%	4.1%	54.0%
Overall	58.2%	8.5%	58.4%

This suggests the CR framework’s reliance on random matrix benchmarks and global eigenvalue aggregation systematically over-penalizes genuine but non-ideal human input.

6. Decision Support Implications and Contemporary Critique

The rigidity of CR thresholds and over-rejection of moderate inconsistencies have several operational consequences (Bose, 7 May 2025, 1311.0748):

Workflow disruption: Insisting on $\mathrm{CR} < 0.1$ leads to unnecessary re-elicitation, eroding user trust and distorting the priority structure toward “artificially smooth” outcomes.
Order dependence: The probability of passing the CR test decreases sharply as $n$ grows, making the metric problematic in complex decision settings.
Practical resolution: Optimization-based frameworks (e.g., Model A/B above) can target only the most influential judgments, offering pragmatic corrections and minimizing user burden.
Future alternatives: Methods based on triadic preference reversals and clustering (e.g., PR method) achieve higher empirical fidelity to “ground truth” consistency, aligning with human intuitive classifications.

A critical consensus is emerging that, while CR’s mathematical construction is elegant and directly interpretable via the eigenvector weighting method, it generates significant Type I and II errors in practice. Alternatives designed to match local, interpretable preference conflicts yield substantially improved reliability and are central to current methodological innovation in consistency assessment (Bose, 7 May 2025).