LOCO CRT: Conditional Randomization Test

Updated 8 March 2026

LOCO–CRT is a method for assessing conditional independence in the model-X framework by evaluating the impact of omitting individual covariates.
It computes predictive loss differences between full and leave-one-out models to yield robust, finite-sample valid p-values for each variable.
The approach reduces computational cost through model reuse and supports analytic solutions in Gaussian covariate cases, enhancing its practical applicability.

The leave-one-covariate-out conditional randomization test (LOCO–CRT) is a computationally efficient methodology for assessing conditional independence within the model-X framework. It tests the null hypothesis that a response variable $Y$ is independent of a given covariate $X_j$ conditional on all other covariates $X_{-j}$, under the assumption that the marginal distribution of the covariate vector $X$ is known or can be accurately sampled. LOCO–CRT yields valid $p$-values useful for error-rate control in variable selection, minimizing algorithmic randomness and enabling practical application even in high-dimensional settings (Katsevich et al., 2020).

1. Statistical Formulation and Model-X Framework

The setup assumes i.i.d. samples $(X_i, Y_i)$, $i=1,\ldots,n$, with $X_i = (X_{i1}, \dots, X_{ip}) \in \mathbb{R}^p$ and arbitrary $Y_i$. The model-X assumption requires that the joint distribution $P_X$ of $X$ is fully known or directly sampleable, while the conditional distribution $P_{Y \mid X}$ remains unrestricted. For each variable $j \in \{1, \dots, p\}$, the test targets

$$H_{0,j}: Y \perp\!\!\!\perp X_j \mid X_{-j},$$

where $X_{-j}$ denotes all features except $X_j$. This hypothesis asserts that, conditional on $X_{-j}$, $X_j$ carries no information regarding $Y$.

2. LOCO Test Statistic

Given a loss function $\ell(y, \hat{y})$, e.g., squared loss for regression or logistic loss for classification, define:

- $\hat{f}$: any fitted predictor trained on $(X, Y)$,
- $\hat{f}_{-j}$: the same model class fitted using $(X_{-j}, Y)$.

The observed LOCO statistic for coordinate $j$ is:

$$T_j = \frac{1}{n} \sum_{i=1}^{n} \left[ \ell\big(Y_i, \hat{f}_{-j}(X_{i,-j})\big) - \ell\big(Y_i, \hat{f}(X_i)\big) \right].$$

$T_j$ quantifies the change in predictive loss incurred by omitting $X_j$. Under $H_{0,j}$, $T_j$ is expected to be small; under the alternative, it should be large.
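As an illustration of the statistic just defined, the following is a minimal sketch using squared loss and ordinary least squares, with the loss evaluated on the same data used for fitting. The helper names `fit_ols` and `loco_statistic` and the toy data are assumptions made for this example, not part of the original method description.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns a prediction function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def loco_statistic(X, y, j, fit=fit_ols):
    """Average squared-loss increase when column j is left out."""
    X_minus_j = np.delete(X, j, axis=1)
    f_full = fit(X, y)             # predictor using all covariates
    f_loo = fit(X_minus_j, y)      # same model class without column j
    loss_full = (y - f_full(X)) ** 2
    loss_loo = (y - f_loo(X_minus_j)) ** 2
    return float(np.mean(loss_loo - loss_full))

# Toy data (hypothetical): only column 0 influences the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(size=200)
stats = [loco_statistic(X, y, j) for j in range(3)]
```

In this toy setting the statistic for column 0 should be much larger than for the two irrelevant columns, which is the behaviour the test relies on.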

3. Algorithmic Procedure and P-value Calculation

The LOCO–CRT algorithm uses null randomization to obtain a valid $p$-value for each variable. For each $j$:

Fit both $\hat{f}$ (on $(X, Y)$) and $\hat{f}_{-j}$ (on $(X_{-j}, Y)$).
Compute $T_j$ as the average loss difference.
For $b = 1, \dots, B$:
- For each $i$, sample $\tilde{X}_{ij}^{(b)} \sim P_{X_j \mid X_{-j}}(\cdot \mid X_{i,-j})$.
- Construct $\tilde{X}^{(b)} = (\tilde{X}_j^{(b)}, X_{-j})$.
- Compute $\tilde{T}_j^{(b)}$, the analogous test statistic using the null-resampled $\tilde{X}^{(b)}$.
Calculate

$$p_j = \frac{1 + \sum_{b=1}^{B} \mathbf{1}\{\tilde{T}_j^{(b)} \ge T_j\}}{B + 1}.$$

Because $T_j, \tilde{T}_j^{(1)}, \dots, \tilde{T}_j^{(B)}$ are exchangeable under $H_{0,j}$, $p_j$ is valid in finite samples. For simultaneous inference across coordinates, conventional multiplicity corrections (Bonferroni, Holm, Benjamini–Hochberg) can be applied.
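The procedure above can be sketched in a few lines of Python. This is a hedged illustration, not the reference implementation: it assumes independent standard normal covariates (so the conditional law of $X_j$ given $X_{-j}$ is simply $N(0, 1)$), uses OLS with squared loss, and evaluates the losses on a held-out half of the data so that the two fitted models can be treated as fixed functions. The sample split, the function names, and the `sample_xj` callback are assumptions introduced for the example.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns a prediction function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def loco_crt_pvalue(X_fit, y_fit, X_eval, y_eval, j, sample_xj, B=200, seed=0):
    """Monte Carlo LOCO-CRT p-value for column j; both models are fitted once."""
    rng = np.random.default_rng(seed)
    f_full = fit_ols(X_fit, y_fit)
    f_loo = fit_ols(np.delete(X_fit, j, axis=1), y_fit)
    X_eval_minus_j = np.delete(X_eval, j, axis=1)
    # The leave-one-out loss never involves X_j, so it is computed once.
    loss_loo = np.mean((y_eval - f_loo(X_eval_minus_j)) ** 2)

    def statistic(X_any):
        return loss_loo - np.mean((y_eval - f_full(X_any)) ** 2)

    t_obs = statistic(X_eval)
    exceed = 0
    for _ in range(B):
        X_tilde = X_eval.copy()
        X_tilde[:, j] = sample_xj(X_eval_minus_j, rng)  # draw from X_j | X_{-j}
        exceed += int(statistic(X_tilde) >= t_obs)
    return (1 + exceed) / (B + 1)

# Toy data (hypothetical): independent N(0, 1) covariates, signal in column 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] + rng.normal(size=400)
draw = lambda X_minus_j, g: g.normal(size=len(X_minus_j))
p_signal = loco_crt_pvalue(X[:200], y[:200], X[200:], y[200:], 0, draw)
p_null = loco_crt_pvalue(X[:200], y[:200], X[200:], y[200:], 1, draw)
```

With a strong signal the resampled statistics essentially never reach the observed one, so `p_signal` sits at the smallest attainable value $1/(B+1)$, while `p_null` is an ordinary draw from an approximately uniform distribution.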

4. Theoretical Guarantees

The principal theoretical result is finite-sample validity of LOCO–CRT $p$-values: $\mathbb{P}_{H_{0,j}}(p_j \le \alpha) \le \alpha$ for all $\alpha \in [0, 1]$, given correct randomization from $P_{X_j \mid X_{-j}}$ and i.i.d. sampling. Under the null, $p_j$ is a super-uniform $p$-value. Familywise error rate (FWER) can be controlled at level $\alpha$ by rejecting all $j$ with $p_j \le \alpha / p$. The computational efficiency is achieved by fitting $\hat{f}$ and $\hat{f}_{-j}$ only once per variable; null sampling changes only $X_j$, not the fitted models.
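To make the FWER step concrete, here is a small sketch of the Bonferroni rule alongside Holm's step-down procedure, applied to a vector of per-variable $p$-values. The function names and the example $p$-values are made up for illustration.

```python
import numpy as np

def bonferroni_reject(pvals, alpha=0.05):
    """Reject hypothesis j when p_j <= alpha / m, with m the number of tests."""
    pvals = np.asarray(pvals, dtype=float)
    return pvals <= alpha / len(pvals)

def holm_reject(pvals, alpha=0.05):
    """Holm step-down: compare sorted p-values against alpha / (m - rank)."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    reject = np.zeros(m, dtype=bool)
    for rank, idx in enumerate(np.argsort(pvals)):
        if pvals[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first p-value that fails its threshold
    return reject

pvals = [0.001, 0.013, 0.020, 0.300]   # hypothetical p-values
bonf = bonferroni_reject(pvals)        # [True, False, False, False]
holm = holm_reject(pvals)              # [True, True, True, False]
```

Both rules control the FWER for valid $p$-values; Holm never rejects fewer hypotheses than Bonferroni, as the example shows.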

5. L1ME–CRT Variant for L1-regularized M-Estimators

For L1-regularized estimators (e.g., Lasso, elastic net), refitting after each variable exclusion is computationally intensive. The L1ME–CRT modification capitalizes on the empirical observation that, for coordinates $j$ with $\hat{\beta}_j = 0$ in the full-data fit, the cross-validated penalty parameter $\hat{\lambda}$ is typically stable after exclusion, under restricted eigenvalue and Lipschitz loss conditions:

$$\hat{\lambda}_{-j} \approx \hat{\lambda}.$$

Hence, for the “inactive” set $\{j : \hat{\beta}_j = 0\}$, the same $\hat{\lambda}$ can be reused for $\hat{f}_{-j}$, obviating additional cross-validations. This reduces computational overhead to near the number of “active” variables, $|\{j : \hat{\beta}_j \neq 0\}| \ll p$ in sparse regimes.
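The reuse idea can be illustrated with a small, self-contained Lasso. The sketch below is not the authors' implementation: it uses a plain coordinate-descent solver written for this example, a fixed penalty `lam` standing in for a cross-validated value, and hypothetical toy data. It refits the leave-one-out model for an inactive coordinate at the same penalty and compares the result with the full fit.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used in Lasso coordinate updates."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_sweeps):
        for k in range(p):
            resid += X[:, k] * beta[k]        # partial residual without k
            rho = X[:, k] @ resid / n
            beta[k] = soft_threshold(rho, lam) / col_sq[k]
            resid -= X[:, k] * beta[k]
    return beta

rng = np.random.default_rng(0)
n, p = 200, 8
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)

lam = 0.5                                     # stand-in for a CV-selected penalty
beta_full = lasso_cd(X, y, lam)
active = np.flatnonzero(beta_full != 0.0)
inactive = np.flatnonzero(beta_full == 0.0)

# Reuse lam for an inactive coordinate instead of cross-validating again.
j = int(inactive[0])
beta_loo = lasso_cd(np.delete(X, j, axis=1), y, lam)
same_fit = np.allclose(beta_loo, np.delete(beta_full, j), atol=1e-8)
```

In this toy run the leave-one-out fit at the reused penalty coincides with the full fit with coordinate `j` dropped, so fresh cross-validation would be needed only for the coordinates in `active`.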

6. Closed-form Solution in the Multivariate Gaussian Covariate Case

Assuming $X \sim \mathcal{N}(\mu, \Sigma)$, the conditional law

$$X_j \mid X_{-j} \sim \mathcal{N}\!\left(\mu_j + \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}(X_{-j} - \mu_{-j}),\; \sigma_j^2\right)$$

with $\sigma_j^2 = \Sigma_{jj} - \Sigma_{j,-j}\Sigma_{-j,-j}^{-1}\Sigma_{-j,j}$, enables analytic computation. With squared error loss and ordinary least squares,

$$T_j = \frac{1}{n}\left(\|Y - \hat{Y}_{-j}\|_2^2 - \|Y - \hat{Y}\|_2^2\right),$$

where $\hat{Y}_{-j}$ and $\hat{Y}$ are fitted values excluding and including $X_j$. Under $H_{0,j}$, the normalized statistic $F_j = \nu\, n T_j / \|Y - \hat{Y}\|_2^2$, with $\nu$ the residual degrees of freedom of the full fit, follows an $F_{1,\nu}$ distribution, and the $p$-value is given by

$$p_j = \mathbb{P}\left(F_{1,\nu} \ge F_j\right),$$

recovering the classical partial $F$-test or $t$-test for a single coefficient in normal linear regression, with no need for Monte Carlo.
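The two ingredients of this section, the Gaussian conditional law and the link to the classical single-coefficient test, can be checked numerically. The sketch below assumes zero-mean covariates and an OLS fit without intercept, so the residual degrees of freedom are $n - p$; the function names and the toy covariance matrix are illustrative assumptions.

```python
import numpy as np

def gaussian_conditional(mu, Sigma, j, x_minus_j):
    """Conditional means and variance of X_j | X_{-j} for X ~ N(mu, Sigma)."""
    idx = np.delete(np.arange(len(mu)), j)
    w = np.linalg.solve(Sigma[np.ix_(idx, idx)], Sigma[idx, j])
    cond_mean = mu[j] + (x_minus_j - mu[idx]) @ w
    cond_var = Sigma[j, j] - Sigma[j, idx] @ w
    return cond_mean, cond_var

def partial_f_statistic(X, y, j):
    """F statistic comparing OLS fits with and without column j (no intercept)."""
    n, p = X.shape

    def rss(A):
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return float(np.sum((y - A @ coef) ** 2))

    rss_full = rss(X)
    rss_loo = rss(np.delete(X, j, axis=1))
    return (rss_loo - rss_full) / (rss_full / (n - p))

rng = np.random.default_rng(0)
mu = np.zeros(3)
Sigma = np.array([[1.0, 0.5, 0.25],
                  [0.5, 1.0, 0.5],
                  [0.25, 0.5, 1.0]])
X = rng.multivariate_normal(mu, Sigma, size=150)
y = X[:, 0] + rng.normal(size=150)

j = 1
cond_mean, cond_var = gaussian_conditional(mu, Sigma, j, np.delete(X, j, axis=1))
F_j = partial_f_statistic(X, y, j)

# Classical t statistic for coefficient j in the full OLS fit.
n, p = X.shape
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
s2 = np.sum((y - X @ beta) ** 2) / (n - p)
t_j = beta[j] / np.sqrt(s2 * XtX_inv[j, j])
```

Here `F_j` agrees with `t_j ** 2` up to rounding, which is the usual identity between the partial $F$-test and the $t$-test for one coefficient, and `cond_var` equals the reciprocal of the $j$-th diagonal entry of $\Sigma^{-1}$.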

7. Computational and Practical Considerations

Let $C_{\mathrm{fit}}$ denote the cost of fitting a model and $C_{\mathrm{score}}$ the cost of scoring $n$ samples:

- A single LOCO–CRT test requires $O(C_{\mathrm{fit}} + B\,C_{\mathrm{score}})$.
- Testing all $p$ features costs $O\big(p\,(C_{\mathrm{fit}} + B\,C_{\mathrm{score}})\big)$, but with L1ME–CRT, the actual model refits may be far fewer than $p$ due to reuse for inactive features.

Typically, $C_{\mathrm{fit}}$ is $O(np^2)$ for OLS or on the order of $O(np)$ per coordinate-descent sweep for Lasso; $C_{\mathrm{score}}$ is $O(np)$. Monte Carlo sample sizes $B$ ranging up to about 2000 are common, balancing $p$-value granularity and computational burden.
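The granularity trade-off can be made explicit with a little arithmetic: a Monte Carlo $p$-value of the form $(1 + \text{count})/(B + 1)$ can never fall below $1/(B+1)$, so a Bonferroni threshold $\alpha/p$ is only reachable when $B + 1 \ge p/\alpha$. The helper names below are hypothetical, and exact fractions are used to avoid floating-point rounding at the boundary.

```python
import math
from fractions import Fraction

def min_pvalue(B):
    """Smallest p-value attainable with B Monte Carlo resamples."""
    return Fraction(1, B + 1)

def min_resamples_for_bonferroni(p, alpha="0.05"):
    """Smallest B such that 1 / (B + 1) <= alpha / p."""
    threshold = Fraction(alpha) / p       # Bonferroni cutoff
    return math.ceil(1 / threshold) - 1

B_needed = min_resamples_for_bonferroni(p=100)   # 1999
reachable = min_pvalue(B_needed) <= Fraction("0.05") / 100
too_few = min_pvalue(B_needed - 1) <= Fraction("0.05") / 100
```

For 100 features at level 0.05, roughly 2000 resamples per variable are the minimum for any Bonferroni rejection to be possible in a purely Monte Carlo version of the test.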

Implementation suggestions include precomputing random seeds for reproducibility, vectorizing null-feature sampling when feasible, and using persistent model objects in languages like R or Python to avoid redundant refitting. These practicalities further enhance runtime efficiency.
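Two of these suggestions, fixed seeds and vectorized null sampling, can be sketched with NumPy as follows. The example assumes a Gaussian conditional law with known per-observation means and a common standard deviation; `spawn_seeds` and `sample_null_columns` are hypothetical helper names.

```python
import numpy as np

def spawn_seeds(base_seed, p):
    """Derive one reproducible integer seed per variable from a base seed."""
    words = np.random.SeedSequence(base_seed).generate_state(p)
    return [int(w) for w in words]

def sample_null_columns(cond_mean, cond_sd, B, seed):
    """Draw all B null copies of X_j at once; the result has shape (B, n)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((B, len(cond_mean)))
    return cond_mean + cond_sd * noise    # broadcasts the mean over the B rows

seeds = spawn_seeds(2026, p=3)            # precomputed once, stored with results
cond_mean = np.zeros(100)                 # placeholder conditional means
draws = sample_null_columns(cond_mean, 1.0, B=500, seed=seeds[0])
draws_again = sample_null_columns(cond_mean, 1.0, B=500, seed=seeds[0])
```

Because the seed for each variable is fixed in advance, rerunning the test for one variable reproduces exactly the same resamples, independently of the order in which variables are processed.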

For foundational and related methodologies, see “Panning for gold: ‘model-X’ knockoffs for high-dimensional controlled variable selection” (Candès, Fan, Janson & Lv), “Gene hunting with hidden Markov model knockoffs” (Sesia, Candès & Sabatti), and “Multiple testing with the conditional randomization test” (Li & Barber), in addition to the primary development of LOCO–CRT (Katsevich et al., 2020).


References

Katsevich et al. (2020). The leave-one-covariate-out conditional randomization test.
