High-Dimensional Consistency Criteria

Updated 16 May 2026

High-dimensional consistency criteria are formal frameworks that ensure estimators, tests, and model selection methods converge correctly even when dimensions outpace sample size.
They balance key trade-offs among model dimension, structural sparsity, signal strength, and design geometry to maintain robust inference in varied high-dimensional regimes.
These criteria underpin advanced methodologies in genomics, machine learning, and statistics by mandating conditions like weak dependence and eigenvalue control for reliable asymptotic performance.

High-dimensional consistency criteria formalize the precise conditions under which estimators, tests, and model selection procedures exhibit desirable asymptotic properties—such as convergence to the truth or correct model—when the dimension of the parameter space grows polynomially, exponentially, or even faster with the sample size. These criteria are now central to the modern analysis of inference, variable selection, risk estimation, and goodness-of-fit in settings ranging from sparse regression and graphical models to principal components and extremes. Their mathematical form varies by context, but in all cases involves delicate tradeoffs between model dimension, structural sparsity, signal strength, moment/tail behavior, and geometric properties of the design or covariance structure.

1. Definitions of High-Dimensional Consistency Criteria

High-dimensional consistency extends classical notions of estimator or model selection consistency to regimes where key parameters scale with or outpace the sample size. With $d$, $p$, or $q$ denoting dimension, the main notions are:

Pointwise consistency: Probability or posterior mass concentrates on the true parameter as $n,p \to \infty$.
Model selection consistency: Probability that a procedure selects the true model (e.g., support, graph, component set) tends to one as $n,p \to \infty$.
Risk consistency: Error (e.g., out-of-sample prediction risk) of an estimator converges to the oracle risk in the high-dimensional regime.
Test consistency: Power of a test against all signals outside the null tends to one, uniformly over prescribed alternatives, in high dimensions.

These definitions are always relative to a specific asymptotic regime; examples include $p = o(n)$, $p \sim n$, $p \gg n$, or $p,n \to \infty$ with $p/n \to c \in (0,\infty)$.
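The choice of regime matters even for very simple estimators. As a minimal sketch (assuming NumPy and i.i.d. standard Gaussian data with identity covariance, neither of which comes from the text above), the simulation below compares the largest eigenvalue of the sample covariance matrix when $p$ is fixed and when $p/n = 0.5$. By the Bai-Yin law, in the proportional regime this eigenvalue tends to $(1+\sqrt{c})^2$ rather than to the true value 1, so the sample covariance is not consistent in operator norm there. The function name `top_sample_eigenvalue` is illustrative.

```python
import numpy as np

def top_sample_eigenvalue(n, p, rng):
    """Largest eigenvalue of the sample covariance of n draws from N(0, I_p)."""
    X = rng.standard_normal((n, p))
    S = X.T @ X / n                          # sample covariance (mean known to be zero)
    return float(np.linalg.eigvalsh(S)[-1])  # eigvalsh returns eigenvalues in ascending order

rng = np.random.default_rng(0)

# Classical regime: p fixed, n large. The top eigenvalue is close to the truth, 1.
fixed_p = top_sample_eigenvalue(n=4000, p=5, rng=rng)

# Proportional regime: p/n = 0.5. The top eigenvalue is close to (1 + sqrt(0.5))^2.
proportional = top_sample_eigenvalue(n=800, p=400, rng=rng)

print(f"p fixed:   {fixed_p:.3f}")
print(f"p/n = 0.5: {proportional:.3f}  (Bai-Yin limit {(1 + 0.5 ** 0.5) ** 2:.3f})")
```

The same estimator is consistent in one regime and not in the other, which is why each criterion has to name its regime.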

2. Representative Criteria and Sufficient/Necessary Conditions

2.1 Consistency for Heritability Estimation

For summary statistic-based heritability estimators in high-dimensional linear models—e.g., LDSC and GWASH—the two necessary and sufficient conditions are:

Weak Dependence (WD): Control on the dependence across predictors:

$p$ 0

reflecting that correlations across predictors do not accumulate excessively as $p \to \infty$.

Bounded Kurtosis Effects (BKE): Uniform boundedness on the kurtosis of genetic effects:

$p$ 2

These conditions are minimal: both must hold for the heritability estimator to be consistent in mean square and to converge in probability. If either is violated, systematic bias or excess variance remains even asymptotically (Azriel et al., 16 Feb 2025).
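To make the role of the two conditions concrete, here is a hedged sketch of a summary-statistic heritability estimate in the most favorable case. It is a plain method-of-moments estimator, not LDSC or GWASH as defined by their authors. It assumes independent Gaussian predictors (so weak dependence holds trivially) and Gaussian effects (so kurtosis is bounded). Under those assumptions the mean squared marginal z-score is approximately $1 + (n/p)h^2$, which the code inverts. The name `moment_heritability` and all simulation settings are illustrative.

```python
import numpy as np

def moment_heritability(X, y):
    """Method-of-moments h^2 estimate from marginal z-scores.

    Assumes (approximately) independent predictors; this is an
    illustrative estimator, not LDSC or GWASH themselves.
    """
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each predictor
    ys = (y - y.mean()) / y.std()               # standardize the response
    u = Xs.T @ ys / np.sqrt(n)                  # one marginal z-score per predictor
    return float(p / n * (np.mean(u ** 2) - 1.0))

rng = np.random.default_rng(0)
n, p, h2 = 3000, 1500, 0.5
X = rng.standard_normal((n, p))                       # independent predictors
beta = rng.normal(0.0, np.sqrt(h2 / p), size=p)       # Gaussian effects, total variance h2
y = X @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)

h2_hat = moment_heritability(X, y)
print(f"true h2 = {h2}, estimate = {h2_hat:.3f}")
```

With strongly correlated predictors or heavy-tailed effects, the identity behind this estimator no longer holds without correction, which is the situation the two conditions are meant to rule out.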

2.2 Laplacian-Constrained Precision Estimation

For maximum-likelihood graph Laplacian estimation under high-dimensional settings ( $p$ 3):

Sample connectivity: The sample “difference graph” (edges where the empirical distance is positive) must be connected and contain the prescribed edge set.
Sub-Gaussianity: Columns are i.i.d. sub-Gaussian with bounded moments.

The $p$ 4 (symmetrized Stein) loss then achieves

$p$ 5

with no dependence on graph sparsity—Laplacian structure alone regularizes the problem (Pavez, 2021).

2.3 Bayesian Posterior and Model Selection Consistency

For Bayesian variable selection with nonlocal priors ( $p$ potentially sub-exponential in $n$ ):

Eigenvalue control: All submodel Gram matrices have eigenvalues in $p$ 8 (with $p$ 9).
True sparsity: True model size $q$ 0 is fixed or $q$ 1. For complexity priors, $q$ 2 allowed up to $q$ 3 for $q$ 4.
Prior-complexity control: Model-size prior decays at $q$ 5, $q$ 6.
Nonlocal prior order: Order $q$ 7 for $q$ 8.

Under these conditions, the posterior mass and the posterior mode concentrate on the true model (Cao et al., 2017); deviations decay polynomially or exponentially, depending on the likelihood separation and the prior penalty.
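For readers unfamiliar with nonlocal priors, the sketch below evaluates one widely used family, the product moment (pMOM) prior, in one dimension. Treating pMOM as the relevant family, and the parameter names `r` (order), `tau`, and `sigma2`, are assumptions made for illustration; the exact prior and hyperparameter scaling in the cited work are not reproduced here. The density is $\pi(\beta) = \beta^{2r}\,\mathcal{N}(\beta; 0, \tau\sigma^2) / ((2r-1)!!\,(\tau\sigma^2)^r)$, which vanishes at $\beta = 0$ and integrates to one.

```python
import math

def double_factorial_odd(r):
    """(2r - 1)!! = 1 * 3 * ... * (2r - 1)."""
    out = 1
    for k in range(1, 2 * r, 2):
        out *= k
    return out

def pmom_density(beta, r=1, tau=1.0, sigma2=1.0):
    """One-dimensional pMOM density of order r (illustrative parametrization)."""
    s2 = tau * sigma2
    normal = math.exp(-beta * beta / (2.0 * s2)) / math.sqrt(2.0 * math.pi * s2)
    # Dividing by the 2r-th moment of N(0, s2) makes the density integrate to one.
    return beta ** (2 * r) * normal / (double_factorial_odd(r) * s2 ** r)

def integrate(f, lo, hi, steps=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

area_r1 = integrate(lambda b: pmom_density(b, r=1), -12.0, 12.0)
area_r2 = integrate(lambda b: pmom_density(b, r=2, tau=2.0), -20.0, 20.0)
print(area_r1, area_r2, pmom_density(0.0))
```

Because the density is exactly zero at the origin, coefficients near zero receive little prior mass, which is the mechanism that penalizes spurious variables more strongly than a local (e.g., Gaussian) prior does.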

2.4 Information Criteria for Model Selection

For information criteria such as AIC and BIC and their high-dimensional analogs, the conditions involve:

Penalty dominance: For BIC-type penalties, $q$ 9 or $n,p \to \infty$ 0 must outpace dimension dependence for overfitting to be penalized adequately.
Signal separation: Minimax gap conditions between true and spurious models, calibrated by covariance eigenvalue "spikes" for PCA/DA, or noncentrality matrices for multivariate regression/variable screening.

Explicitly, for the spiked covariance in high-dimensional extremes, AIC is consistent iff the signal–noise gap

$n,p \to \infty$ 1

with $n,p \to \infty$ 2 and $n,p \to \infty$ 3 the $n,p \to \infty$ 4th spike (Butsch et al., 28 May 2025).
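The penalty-dominance idea in this subsection can be illustrated with a small fixed-dimensional toy, which is deliberately not one of the high-dimensional analogs from the cited work. The sketch assumes NumPy, a Gaussian linear model, and nested candidate models built from the first $k$ columns; the helper name `select_nested` is hypothetical. Each candidate is scored by $n\log(\mathrm{RSS}_k/n) + c\,k$, with $c = 2$ for AIC and $c = \log n$ for BIC.

```python
import numpy as np

def select_nested(X, y, penalty):
    """Return the size k minimizing n*log(RSS_k/n) + penalty*k over nested models."""
    n, K = X.shape
    scores = []
    for k in range(K + 1):
        if k == 0:
            rss = float(y @ y)                       # empty model: predict zero
        else:
            coef, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
            resid = y - X[:, :k] @ coef
            rss = float(resid @ resid)
        scores.append(n * np.log(rss / n) + penalty * k)
    return int(np.argmin(scores))

rng = np.random.default_rng(0)
n, K, k0 = 500, 10, 3
X = rng.standard_normal((n, K))
beta = np.zeros(K)
beta[:k0] = 2.0                                      # true model uses the first 3 columns
y = X @ beta + rng.standard_normal(n)

k_aic = select_nested(X, y, penalty=2.0)
k_bic = select_nested(X, y, penalty=np.log(n))
print(f"true size {k0}, AIC picks {k_aic}, BIC picks {k_bic}")
```

Because the BIC penalty per parameter is larger than the AIC penalty whenever $\log n > 2$, the BIC choice can never be larger than the AIC choice on the same data. With a strong signal, neither criterion underfits here, so any disagreement comes from AIC admitting spurious columns.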

2.5 High-Dimensional Testing

In high-dimensional Gaussian testing, the consistency set for $n,p \to \infty$ 5-norm based tests is

$n,p \to \infty$ 6

with $n,p \to \infty$ 7 ( $n,p \to \infty$ 8), and no test can substantially enlarge the consistency set (relative volume) beyond that of the Euclidean norm test (Kock et al., 2021, Kock et al., 2021).
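As a rough illustration of the Euclidean norm test mentioned above, the following sketch estimates rejection rates by Monte Carlo in the Gaussian sequence model $y = \theta + \varepsilon$, $\varepsilon \sim \mathcal{N}(0, I_p)$. Several pieces are assumptions made for the example: the 5% level, the normal approximation $p + 1.6449\sqrt{2p}$ to the chi-square critical value, and the particular alternatives. It covers only the Euclidean norm, not the other norm-based tests in the cited work.

```python
import math
import random

Z_95 = 1.6449  # approximate 95% quantile of the standard normal

def norm_test_rejects(y):
    """Reject H0: theta = 0 when ||y||^2 exceeds an approximate 5% critical value."""
    p = len(y)
    stat = sum(v * v for v in y)
    crit = p + Z_95 * math.sqrt(2.0 * p)   # normal approximation to the chi-square(p) quantile
    return stat > crit

def rejection_rate(theta, reps, rng):
    """Monte Carlo estimate of the probability that the test rejects at theta."""
    hits = 0
    for _ in range(reps):
        y = [t + rng.gauss(0.0, 1.0) for t in theta]
        hits += norm_test_rejects(y)
    return hits / reps

rng = random.Random(0)
p = 400
null_rate = rejection_rate([0.0] * p, 500, rng)

# Two alternatives with the same squared norm 200 = 10 * sqrt(p):
dense = [math.sqrt(200.0 / p)] * p                 # signal spread over all coordinates
sparse = [math.sqrt(200.0)] + [0.0] * (p - 1)      # signal in a single coordinate
power_dense = rejection_rate(dense, 500, rng)
power_sparse = rejection_rate(sparse, 500, rng)
print(null_rate, power_dense, power_sparse)
```

The distribution of $\|y\|^2$ depends on $\theta$ only through $\|\theta\|_2$, so the dense and sparse alternatives have the same power for this test. What matters is how $\|\theta\|_2^2$ compares with the $\sqrt{2p}$ noise fluctuation of the statistic.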

2.6 Variable Screening and Selection

For strong screening in high dimensions:

Restricted Diagonal Dominance (RDD):

The screening matrix $n,p \to \infty$ 9 is RDD( $n,p \to \infty$ 0) if

$n,p \to \infty$ 1

for all $n,p \to \infty$ 2.

Equivalence to the Irrepresentable Condition: For $n,p \to \infty$ 3, RDD collapses to IC.

SIS and HOLP screening methods achieve strong consistency whenever RDD holds globally and signal/noise and sparsity allow $n,p \to \infty$ 4 (Wang et al., 2015).
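The two screening rules named above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: SIS ranks predictors by $|X^\top y|$, and HOLP ranks them by $|X^\top (XX^\top)^{-1} y|$, which requires $p > n$ so that $XX^\top$ is invertible. The design is i.i.d. Gaussian, the helper names are hypothetical, and the sketch does not check the RDD condition itself.

```python
import numpy as np

def sis_scores(X, y):
    """Sure independence screening: absolute marginal inner products."""
    return np.abs(X.T @ y)

def holp_scores(X, y):
    """HOLP: absolute entries of X^T (X X^T)^{-1} y (assumes p > n)."""
    return np.abs(X.T @ np.linalg.solve(X @ X.T, y))

def top_d(scores, d):
    """Indices of the d largest scores, as a set."""
    return set(np.argsort(scores)[::-1][:d].tolist())

rng = np.random.default_rng(0)
n, p, d = 200, 500, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = 3.0                                   # true support is {0, 1, 2}
y = X @ beta + rng.standard_normal(n)

kept_sis = top_d(sis_scores(X, y), d)
kept_holp = top_d(holp_scores(X, y), d)
print({0, 1, 2} <= kept_sis, {0, 1, 2} <= kept_holp)
```

Screening succeeds in the sense used here when the retained set contains the true support; a second-stage selector is then applied to the much smaller retained set.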

3. Rates, Limitations, and Structural Dependencies

The exact scaling of required sample size, penalization, or signal for consistency depends on the regime:

Setting	Criterion	Sample/Signal Scaling	Reference
Nonparametric sparse selection	$n,p \to \infty$ 5 (fixed $n,p \to \infty$ 6); $n,p \to \infty$ 7 (growing $n,p \to \infty$ 8)	$n,p \to \infty$ 9 interplay	(Comminges et al., 2011)
High-dim AIC/BIC	$p = o(n)$ 0 (AIC), $p = o(n)$ 1 ( $p = o(n)$ 2)	$p = o(n)$ 3 via explicit phase diagram	(Bai et al., 2018)
Graphical model selection	$p = o(n)$ 4 in EBIC penalty	$p = o(n)$ 5, $p = o(n)$ 6	(Barber et al., 2014)
Bayesian variable selection	$p = o(n)$ 7 for $p = o(n)$ 8, $p = o(n)$ 9	Prior tail + design eigenvalue + model complexity	(Cao et al., 2017, Bai et al., 2017)
Principal component extremes	Spiked eigenvalue separation	$p \sim n$ 0, $p \sim n$ 1, signal gap	(Butsch et al., 28 May 2025)

These results show that, in many regimes, the "phase transitions" for consistency hinge on tight relationships among the ambient, intrinsic, and sample dimensions.

4. Handling Robustness, Missing Data, and Model Misspecification

4.1 Robust Divergence Criteria

Model evaluation criteria based on robust divergence (e.g., BHHJ-divergence) with “high-dimensional” penalties $p \sim n$ 2 achieve consistency under $p \sim n$ 3, $p \sim n$ 4. Key is that the divergence down-weights outliers, giving bounded influence (robustness) and high-dimensional selection consistency (Kurata et al., 2024).
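The down-weighting mechanism can be seen in the simplest setting, a normal location model with known scale. The sketch below is not the model evaluation criterion from the cited work; it only shows the estimating equation that the BHHJ (density power) divergence produces. For fixed $\sigma$, minimizing the divergence in $\mu$ gives $\sum_i w_i (x_i - \mu) = 0$ with $w_i = \exp(-\alpha (x_i - \mu)^2 / (2\sigma^2))$, solved here by fixed-point iteration from the median. The tuning value `alpha = 0.5`, the contamination scheme, and the function name are assumptions.

```python
import math
import random
import statistics

def dpd_location(xs, alpha=0.5, sigma=1.0, iters=100):
    """Minimum density-power-divergence location estimate (known sigma)."""
    mu = statistics.median(xs)                     # robust starting value
    for _ in range(iters):
        # Observations far from the current mu get exponentially small weight.
        w = [math.exp(-alpha * (x - mu) ** 2 / (2.0 * sigma ** 2)) for x in xs]
        mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
    return mu

rng = random.Random(1)
clean = [rng.gauss(0.0, 1.0) for _ in range(180)]  # true location is 0
data = clean + [10.0] * 20                         # 10% gross outliers

mu_mean = statistics.fmean(data)
mu_dpd = dpd_location(data)
print(f"sample mean = {mu_mean:.3f}, DPD estimate = {mu_dpd:.3f}")
```

The sample mean is pulled toward the outliers, while the weighted estimate stays near the bulk of the data, because the outliers carry weight of order $e^{-25}$.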

4.2 Missing Data and Imputation-Consistency

The IC/ICC algorithms alternate an imputation step (draw the missing values under the current parameter) and a consistency step (minimize the Kullback-Leibler risk on the pseudo-complete data, i.e., maximize the expected complete-data log-likelihood). Under a uniform law of large numbers, contraction mappings, and sub-exponential moment control, the averaged estimator converges at rate $p \sim n$ 5 (Liang et al., 2018).
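The alternation of an imputation step and a consistency step can be sketched on a toy problem. The following is a low-dimensional illustration under strong assumptions that do not come from the cited work: a single covariate $x \sim \mathcal{N}(0,1)$ with known distribution, $y = \beta x + \varepsilon$ with unit noise variance, and covariate values missing completely at random. Under these assumptions $x \mid y \sim \mathcal{N}(\beta y/(1+\beta^2),\, 1/(1+\beta^2))$, and the consistency step reduces to least squares on the pseudo-complete data. The function name `ic_slope` is hypothetical.

```python
import numpy as np

def ic_slope(x, y, missing, iters=60, burn_in=20, seed=0):
    """Toy imputation/consistency iteration for the slope of y on x."""
    rng = np.random.default_rng(seed)
    x_work = x.copy()
    beta = 0.0                                   # deliberately poor starting value
    draws = []
    for t in range(iters):
        # Imputation step: draw missing x from its conditional law under the current beta.
        cond_mean = beta * y[missing] / (1.0 + beta ** 2)
        cond_sd = np.sqrt(1.0 / (1.0 + beta ** 2))
        x_work[missing] = cond_mean + cond_sd * rng.standard_normal(int(missing.sum()))
        # Consistency step: least squares on the pseudo-complete data.
        beta = float(x_work @ y / (x_work @ x_work))
        if t >= burn_in:
            draws.append(beta)
    return float(np.mean(draws))                 # average the post-burn-in iterates

rng = np.random.default_rng(1)
n, beta_true = 2000, 1.5
x_full = rng.standard_normal(n)
y = beta_true * x_full + rng.standard_normal(n)
missing = rng.random(n) < 0.3                    # about 30% of x values are missing
x_obs = x_full.copy()
x_obs[missing] = np.nan

beta_hat = ic_slope(x_obs, y, missing)
print(f"true slope {beta_true}, averaged estimate {beta_hat:.3f}")
```

Averaging the iterates after a burn-in period smooths out the randomness introduced by the imputation draws, which mirrors the "averaged estimator" in the statement above.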

4.3 Cross-Validation Consistency

Properly aligned leave- $p \sim n$ 6-out cross-validation achieves “restricted model selection consistency” in high dimensions, provided the restricted candidate set contains the true model, the minimal signal strength exceeds $p \sim n$ 7, and the path size $p \sim n$ 8 is subexponential in $p \sim n$ 9 (Feng et al., 2013).
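A hedged sketch of cross-validation over a restricted candidate set follows. It assumes NumPy, a nested path of candidate column sets that contains the true model, repeated random splits with a fixed validation size (the parameter name `n_valid` and the 50/50 split are illustrative choices, not values from the cited work), and ordinary least squares refits on each training part.

```python
import numpy as np

def cv_select(X, y, candidates, n_valid, splits, rng):
    """Pick the candidate column set with the smallest average validation error."""
    n = len(y)
    errors = np.zeros(len(candidates))
    for _ in range(splits):
        perm = rng.permutation(n)
        valid, train = perm[:n_valid], perm[n_valid:]
        for i, cols in enumerate(candidates):
            if cols:
                coef, *_ = np.linalg.lstsq(X[np.ix_(train, cols)], y[train], rcond=None)
                pred = X[np.ix_(valid, cols)] @ coef
            else:
                pred = np.zeros(n_valid)         # the empty model predicts zero
            errors[i] += np.mean((y[valid] - pred) ** 2)
    return candidates[int(np.argmin(errors))]

rng = np.random.default_rng(0)
n, p = 300, 8
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([2.0, 2.0, 2.0]) + rng.standard_normal(n)

path = [list(range(k)) for k in range(p + 1)]    # nested candidates of sizes 0..8
chosen = cv_select(X, y, path, n_valid=150, splits=50, rng=rng)
print("selected columns:", chosen)
```

With a strong signal, any candidate that omits a true variable incurs a large validation error, so underfitting is effectively ruled out. A large validation fraction is what makes overfitted candidates distinguishable from the true model.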

5. Synthesis and Broader Methodological Principles

Certain commonalities underpin nearly all high-dimensional consistency criteria:

Separation of Signal and Noise: Consistency requires that signal strength (relative to sample size and complexity) dominate any penalty induced by overfitting, multiple comparisons, or regularization.
Structural Regularity: Independence, incoherence, restricted eigenvalue, weak dependence, or moment controls are essential to avoid bias accumulation across dimensions.
Adaptivity: Effective procedures (e.g., grid-aggregated tests, robust information criteria, adaptive search restrictions) achieve uniform consistency across a spectrum of sparsity/tail regimes.
Optimality Boundaries: Several results establish phase-transition boundaries and tight thresholds—e.g., $p \gg n$ 0, $p \gg n$ 1, penalty vs. signal gap—beyond which consistency is impossible regardless of estimator (Comminges et al., 2011, Bai et al., 2018).
Impossibility Results: In testing, no procedure can enlarge the (relative volume of the) consistency set beyond the Euclidean norm test in high dimensions (Kock et al., 2021).

The increasing sophistication of high-dimensional consistency theory enables sharper, more nuanced guidance across a wide range of problems in genomics, machine learning, signal processing, and the physical sciences. As new methodologies are proposed, precise high-dimensional consistency criteria remain the gold standard for evaluating their asymptotic validity and robustness.