Debiased Lasso Estimator

Updated 29 May 2026

Debiased Lasso is a high-dimensional method that adds a linear correction to regular Lasso estimates to achieve asymptotic normality.
It uses either a known precision matrix or a data-driven estimate of it, such as the nodewise Lasso, to enable valid confidence intervals and hypothesis tests.
Under sparsity conditions, the estimator attains near-optimal minimax rates, providing robust performance in regimes where p is much larger than n.

A debiased Lasso estimator is a high-dimensional statistical technique that transforms the biased, regularized Lasso estimate into an asymptotically normal estimator by adding a carefully designed linear correction. This correction, which can be computed using either a known covariance or a data-driven estimate, restores valid construction of confidence intervals and hypothesis tests for individual regression coefficients under scaling regimes where $p \gg n$ and the parameter vector is sparse. The methodology is supported by a rigorous Gaussian central limit theorem conditional on the design and achieves nearly optimal sample size and minimax rates under suitable model assumptions.

1. Construction of the Debiased Lasso Estimator

Consider a high-dimensional Gaussian linear regression model: $y = X\theta^* + w, \quad w \sim N(0, \sigma^2 I_n), \quad X \in \mathbb{R}^{n \times p}$, with design matrix $X$ consisting of i.i.d. rows from $N(0, \Sigma)$ and $p \gg n$. The standard Lasso estimator is given by: $\hat\theta^{\rm lasso} = \arg\min_{\theta \in \mathbb{R}^p} \left\{ \frac{1}{2n}\|y - X\theta\|_2^2 + \lambda\|\theta\|_1 \right\}.$ However, $\hat\theta^{\rm lasso}$ is biased due to the $\ell_1$ penalty. The debiased estimator augments $\hat\theta^{\rm lasso}$ with a linear correction: $\hat\theta^{\rm d} = \hat\theta^{\rm lasso} + \frac{1}{n} M X^\top (y - X\hat\theta^{\rm lasso}),$ where $M \in \mathbb{R}^{p \times p}$ is chosen to approximate (or estimate) $\Sigma^{-1}$, the precision matrix of the design (Javanmard et al., 2015, Smith, 1 Apr 2026).
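The construction can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the cited papers: the Lasso is solved with a plain proximal-gradient (ISTA) loop chosen only for self-containment, the design is taken to have $\Sigma = I_p$ so that $M = I_p$ is the exact precision matrix, and the problem sizes and the value of `lam` are arbitrary assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    """Minimize (1/2n)||y - X theta||_2^2 + lam * ||theta||_1 by proximal gradient."""
    n, p = X.shape
    # Step size 1/L, with L the Lipschitz constant of the gradient of the smooth part.
    L = np.linalg.norm(X, 2) ** 2 / n
    theta = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ (y - X @ theta) / n
        theta = soft_threshold(theta - grad / L, lam / L)
    return theta

def debias(X, y, theta_lasso, M):
    """theta_d = theta_lasso + (1/n) M X^T (y - X theta_lasso)."""
    n = X.shape[0]
    return theta_lasso + M @ (X.T @ (y - X @ theta_lasso)) / n

# Toy data: sizes, signal strength, and lam are illustrative assumptions.
rng = np.random.default_rng(0)
n, p, s0, sigma = 100, 200, 5, 1.0
theta_star = np.zeros(p)
theta_star[:s0] = 2.0
X = rng.standard_normal((n, p))   # rows i.i.d. N(0, I_p), so Sigma^{-1} = I_p
y = X @ theta_star + sigma * rng.standard_normal(n)

lam = 2.0 * sigma * np.sqrt(np.log(p) / n)
theta_lasso = lasso_ista(X, y, lam)
theta_d = debias(X, y, theta_lasso, M=np.eye(p))
```

Note that `theta_d` is generally dense even though `theta_lasso` is sparse: the correction trades sparsity for a tractable distribution of each coordinate.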

Choice of $M$:

If $\Sigma$ is known, set $M = \Sigma^{-1}$.
If $\Sigma$ is unknown but $\Sigma^{-1}$ is sparse, estimate $\Sigma^{-1}$ via nodewise Lasso regression: for each $j \in \{1, \dots, p\}$, regress the column $X_j$ on the remaining columns $X_{-j}$, and combine the coefficients and residual variances into $M$ (Javanmard et al., 2015).
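The nodewise step can be sketched as follows. This shows one common form of the construction, in which row $j$ of $M$ is built from the Lasso coefficients of column $j$ regressed on the others, scaled by a residual-variance term; the exact normalization in the cited papers may differ. The coordinate-descent solver, the AR(1) test covariance, and the single shared penalty `lam` are all assumptions made for illustration.

```python
import numpy as np

def lasso_cd(A, b, lam, n_sweeps=100):
    """Coordinate descent for (1/2n)||b - A beta||_2^2 + lam * ||beta||_1."""
    n, p = A.shape
    col_sq = (A ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    r = b.copy()                      # current residual b - A beta
    for _ in range(n_sweeps):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            rho = A[:, k] @ r / n + col_sq[k] * beta[k]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[k]
            if new != beta[k]:
                r -= A[:, k] * (new - beta[k])
                beta[k] = new
    return beta

def nodewise_lasso(X, lam):
    """Estimate the precision matrix row by row from p Lasso regressions."""
    n, p = X.shape
    M = np.zeros((p, p))
    for j in range(p):
        others = np.delete(np.arange(p), j)
        gamma = lasso_cd(X[:, others], X[:, j], lam)
        resid = X[:, j] - X[:, others] @ gamma
        # Residual-variance term used to normalize row j.
        tau2 = resid @ resid / n + lam * np.abs(gamma).sum()
        M[j, j] = 1.0 / tau2
        M[j, others] = -gamma / tau2
    return M

# Toy design with AR(1) covariance, whose inverse is tridiagonal (sparse).
rng = np.random.default_rng(1)
n, p = 200, 20
idx = np.arange(p)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
X = rng.standard_normal((n, p)) @ np.linalg.cholesky(Sigma).T
lam = np.sqrt(np.log(p) / n)          # illustrative penalty level
M = nodewise_lasso(X, lam)
```

With this normalization, the Lasso optimality conditions imply that $M\hat\Sigma$ (with $\hat\Sigma = X^\top X/n$) has unit diagonal and off-diagonal entries in row $j$ bounded by `lam * M[j, j]`, which is the sense in which $M$ approximately inverts the sample covariance.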

2. Conditions for Valid Inference and Asymptotic Normality

The debiased Lasso estimator achieves uniform Gaussian limiting distributions for each coordinate under the following model and sparsity conditions:

The vector $\theta^*$ is $s_0$-sparse, i.e., it has at most $s_0$ nonzero entries.
The rows of $X$ are i.i.d. $N(0, \Sigma)$ with bounded eigenvalues ( $0 < C_{\min} \le \lambda_{\min}(\Sigma) \le \lambda_{\max}(\Sigma) \le C_{\max} < \infty$).
The maximum row or column sparsity of $\Sigma^{-1}$ is $s_\Omega$.
Main regime for asymptotic normality: $s_0 = o(n/(\log p)^2)$ with suitable compatibility or restricted eigenvalue condition (Javanmard et al., 2015, Smith, 1 Apr 2026).
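A short derivation, using only the model and the definition of $\hat\theta^{\rm d}$ from Section 1, may help show where these conditions enter. Here $\hat\Sigma = X^\top X / n$ denotes the sample covariance, and the names $Z$ and $\Delta$ are introduced for illustration. Substituting $y = X\theta^* + w$ into the definition gives

$\hat\theta^{\rm d} = \theta^* + \frac{1}{n} M X^\top w + (I - M\hat\Sigma)(\hat\theta^{\rm lasso} - \theta^*),$

so that

$\sqrt{n}\,(\hat\theta^{\rm d} - \theta^*) = Z + \Delta, \quad Z = \frac{1}{\sqrt{n}} M X^\top w, \quad \Delta = \sqrt{n}\,(M\hat\Sigma - I)(\theta^* - \hat\theta^{\rm lasso}).$

When $M$ depends only on $X$, the noise term satisfies $Z \mid X \sim N(0, \sigma^2 M\hat\Sigma M^\top)$ exactly, while the sparsity and eigenvalue conditions serve to make the bias term $\Delta$ negligible.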

Theorems:

Known $\Sigma$: For sufficiently sparse $\theta^*$, $\sqrt{n}(\hat\theta^{\rm d}_i - \theta^*_i)$ is asymptotically $N(0, \sigma^2 (\Sigma^{-1})_{ii})$ and the maximal bias is $o(1)$ (Javanmard et al., 2015).
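Unknown $\Sigma$: If $s_0 = o(n/(\log p)^2)$ and $\min(s_0, s_\Omega) = o(\sqrt{n}/\log p)$, where $s_\Omega$ is the maximum row sparsity of $\Sigma^{-1}$, then the same Gaussian limit holds with an adaptively constructed $M$, and the remainder $\Delta = \sqrt{n}(M\hat\Sigma - I)(\theta^* - \hat\theta^{\rm lasso})$, with $\hat\Sigma = X^\top X / n$, satisfies

$\|\Delta\|_\infty = o_P(1).$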

Efficiency: For loss minimization, the estimator is minimax-optimal up to a $1 + o_n(1)$ multiplicative factor when $X$ has i.i.d. Gaussian rows and $\theta^*$ is sufficiently sparse.

3. Statistical Inference: Confidence Intervals and Hypothesis Testing

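Because the debiased estimator is nearly unbiased and asymptotically normal, it enables valid coordinatewise inference for $\theta^*_i$. If $\sigma$ is consistently estimated by $\hat\sigma$ (e.g., via the scaled Lasso), an asymptotic $1 - \alpha$ confidence interval is: $\hat\theta^{\rm d}_i \pm z_{1-\alpha/2}\, \hat\sigma \sqrt{[M \hat\Sigma M^\top]_{ii} / n},$ where $\hat\Sigma = X^\top X / n$ is the sample covariance and $z_{1-\alpha/2}$ is the standard normal quantile (Javanmard et al., 2015, Smith, 1 Apr 2026).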
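A minimal sketch of the interval computation follows. It assumes the conditional variance $\sigma^2 [M\hat\Sigma M^\top]_{ii}/n$ with $\hat\Sigma = X^\top X/n$, and takes the debiased estimate, $M$, and a noise estimate `sigma_hat` as given inputs. The function name `debiased_ci` and the toy inputs are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def debiased_ci(theta_d, X, M, sigma_hat, alpha=0.05):
    """Coordinatewise (1 - alpha) intervals theta_d_i +/- z * sigma_hat * sqrt(v_i / n)."""
    n = X.shape[0]
    Sigma_hat = X.T @ X / n
    # v_i = [M Sigma_hat M^T]_{ii}, computed without forming the full product.
    var_diag = np.sum((M @ Sigma_hat) * M, axis=1)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma_hat * np.sqrt(var_diag / n)
    return theta_d - half_width, theta_d + half_width

# Toy check: a design with X^T X / n = I and M = I gives half-width z * sigma_hat / sqrt(n).
rng = np.random.default_rng(0)
n, p = 100, 3
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
X = np.sqrt(n) * Q
theta_d = np.array([1.0, 0.0, -0.5])   # placeholder values, not a fitted estimate
lower, upper = debiased_ci(theta_d, X, M=np.eye(p), sigma_hat=1.0)
```

A coordinate whose interval excludes zero corresponds to rejecting $H_0: \theta^*_i = 0$ at level $\alpha$ in the associated two-sided test.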

4. Trade-offs: Covariance Estimation and Sparsity

The tightness of the bias control and the validity of inference depend on both the sparsity $s_0$ of $\theta^*$ and the sparsity $s_\Omega$ of the precision matrix $\Sigma^{-1}$:

Dense $\Sigma^{-1}$: The bottleneck for inference is $s_0 = o(\sqrt{n}/\log p)$, matching the restrictions of earlier work.
Sufficiently sparse $\Sigma^{-1}$: If $s_\Omega = o(\sqrt{n}/\log p)$, optimality is retained for $s_0 = o(n/(\log p)^2)$.
Intermediate regimes: The bias remainder depends jointly on $s_0$ and $s_\Omega$; the dominant term determines the sparsity/inference trade-off (Javanmard et al., 2015).
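To get a feel for the gap between the two sparsity scales that appear in this literature, $\sqrt{n}/\log p$ (the restriction of earlier work) and $n/(\log p)^2$ (the nearly optimal condition), the short calculation below evaluates both for one choice of $n$ and $p$. These are asymptotic rates with constants ignored, so the numbers indicate orders of magnitude only; the function name and the values of $n$ and $p$ are assumptions.

```python
import math

def sparsity_scales(n, p):
    """Return (sqrt(n)/log p, n/(log p)^2), ignoring all constants."""
    log_p = math.log(p)
    return math.sqrt(n) / log_p, n / log_p ** 2

earlier, improved = sparsity_scales(n=1000, p=10000)
# earlier is about 3.43 and improved is about 11.79
```

The second scale is the square of the first, so the gap between the two regimes widens as $n$ grows relative to $(\log p)^2$.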

5. Minimax-Optimal and Enhanced Estimation Procedures

For i.i.d. Gaussian designs and sufficiently sparse $\theta^*$, the following estimation procedure achieves the minimax $\ell_2$-risk (up to a negligible factor) (Javanmard et al., 2015):

Compute $\hat\theta^{\rm lasso}$ with optimal penalty.
Debias to $\hat\theta^{\rm d}$.
Soft-threshold each coordinate $\hat\theta^{\rm d}_i$ at a suitably chosen level.

The resulting estimator attains an $\ell_2$ risk matching the minimax rate up to a factor $1 + o_n(1)$ (Javanmard et al., 2015).
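The final thresholding step can be sketched as below. The input vector and the threshold `t` are hypothetical values chosen for illustration; the level that yields the minimax guarantee is specified in the cited paper and is not reproduced here.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink each coordinate toward zero by t, setting |z_i| <= t to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

theta_d = np.array([2.1, -0.05, 0.3, -1.4, 0.02])  # placeholder debiased estimate
t = 0.25                                           # hypothetical threshold level
theta_final = soft_threshold(theta_d, t)
```

Thresholding restores sparsity that the debiasing step removes: coordinates whose debiased value is within the noise level are set exactly to zero, and the rest are shrunk by `t`.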

6. Proof Techniques and Technical Ingredients

Key analytical elements that underlie debiased Lasso results include:

Restricted eigenvalue and compatibility condition analysis, following Bickel–Ritov–Tsybakov.
Concentration of sample covariance matrices, e.g., via Rudelson–Zhou bounds.
Perturbation and leave-one-out arguments for improved bias characterization, reducing reliance on coarse $\ell_1$–$\ell_\infty$ inequalities.
Nodewise Lasso estimation and its theoretical control, which enable adaptive, data-driven inverse covariance construction.
Minimax lower bounds for the attainable width of confidence intervals, using reductions to Gaussian mean testing (Javanmard et al., 2015).

7. Extensions, Efficiency, and Open Directions

The debiased Lasso framework motivates a spectrum of extensions:

Asymptotic Efficiency: In certain regimes, a carefully constructed $M$ (not simply the true $\Sigma^{-1}$) can achieve asymptotic efficiency with variance potentially smaller than the classical $\sigma^2 (\Sigma^{-1})_{ii}$, via sparse approximations to generally non-sparse precision matrix columns (Geer, 2017).
Practical Implementation: Empirical studies confirm that the debiased estimator attains reliable coverage, controls type I error, and can outperform simpler projection-based estimators, especially in correlated or non-ideal settings (Smith, 1 Apr 2026).
Robustness and Adaptivity: The nodewise Lasso approach enables application where only partial structural information about $\Sigma$ is available, trading off between feasibility, efficiency, and validity.
Algorithmic Scalability: Closed-form computation of the debiasing weights is possible under row-uncorrelated Gaussian designs, yielding major computational advantages in large-scale regimes without sacrificing inferential performance (Banerjee et al., 27 Feb 2025).

The debiased Lasso estimator thus provides a general-purpose methodology for high-dimensional linear inference with optimal minimax, asymptotic, and computational properties, subject to explicit conditions on model sparsity, covariance structure, and sample size (Javanmard et al., 2015, Banerjee et al., 27 Feb 2025, Geer, 2017, Smith, 1 Apr 2026).


References

Javanmard et al. (2015). De-biasing the Lasso: Optimal Sample Size for Gaussian Designs.

Smith (2026). Debiased Estimators in High-Dimensional Regression: A Review and Replication of Javanmard and Montanari (2014).

Geer (2017). On the efficiency of the de-biased Lasso.

Banerjee et al. (2025). Fast Debiasing of the LASSO Estimator.
