Post-Double-Selection LASSO
- The paper introduces a double-selection procedure that selects covariates for both the outcome equation and the equation of the regressor of interest, mitigating omitted variable bias.
- PDS-LASSO utilizes Neyman-orthogonal moments to achieve √n-consistent and asymptotically normal estimates in high-dimensional settings.
- Empirical evidence and simulations demonstrate its effectiveness in stochastic-frontier analysis and many-instrument scenarios relative to single-selection methods.
Post-Double-Selection LASSO (PDS-LASSO) is a variable-selection and estimation framework designed for high-dimensional linear models, particularly when inference about a low-dimensional parameter is required in the presence of potentially many nuisance covariates. The method augments standard LASSO selection by conducting variable selection for both the outcome and the regressor of interest, followed by unpenalized estimation on the union of selected controls. This double selection, paired with Neyman-orthogonal moments, yields √n-consistent, asymptotically normal inference and mitigates omitted variable bias in settings where the number of covariates may exceed the number of observations (Parmeter et al., 20 May 2025, Hué et al., 26 Nov 2025, Chiang et al., 2019).
1. Model Framework and Identifying Conditions
PDS-LASSO is posed within a high-dimensional linear regression or partially linear model where inference is required for a low-dimensional structural parameter (e.g., a treatment effect or a coefficient on a key input) in the presence of high-dimensional controls:
- Standard partially linear model:

$$y_i = \alpha d_i + x_i'\beta + \varepsilon_i, \qquad i = 1, \dots, n,$$

where $y_i$ is the scalar outcome, $d_i$ is the regressor of interest, $x_i \in \mathbb{R}^p$ is the high-dimensional control vector, and $\alpha$ is the low-dimensional parameter of interest (Hué et al., 26 Nov 2025). In stochastic-frontier settings, the model generalizes to

$$y_i = w_i'\delta + x_i'\beta + v_i - u_i,$$

where $w_i$ are primary inputs and $x_i$ are high-dimensional environmental (inefficiency-shifting) covariates (Parmeter et al., 20 May 2025). A simulated instance of the partially linear design is sketched after the assumptions below.
- Assumptions:
- Approximate sparsity: Only a small subset of regressors is relevant: $\|\beta\|_0 \le s \ll n$, or more generally, $x_i'\beta$ is well approximated by an $s$-sparse combination of the $p$ variables ($s \ll n$).
- Restricted Eigenvalue (RE)/Compatibility: The Gram matrix of controls must satisfy compatibility to ensure identifiability and control the LASSO error (Hué et al., 26 Nov 2025).
- Exogeneity: Errors have mean zero conditional on controls, $E[\varepsilon_i \mid d_i, x_i] = 0$; for stochastic-frontier models, $E[v_i \mid w_i, x_i] = 0$ and the one-sided inefficiency term satisfies $u_i \ge 0$ with $E[u_i \mid w_i, x_i] = E[u_i]$ (Parmeter et al., 20 May 2025).
- Moments and Tails: Variables and errors are sub-Gaussian/bounded moments of sufficiently high order.
- Orthogonality/Neyman-Orthogonality: The score function used in final estimation must be first-order insensitive to small estimation error in nuisance parameters (Parmeter et al., 20 May 2025).
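As a concrete illustration, the following minimal sketch simulates the partially linear design above in Python; all constants ($n$, $p$, $s$, coefficient values) are illustrative choices, not values taken from the cited papers.

```python
import numpy as np

# Simulate y_i = alpha * d_i + x_i' beta + eps_i with approximately sparse beta
rng = np.random.default_rng(0)
n, p, s = 200, 400, 5          # p > n: high-dimensional regime
alpha = 1.0                    # low-dimensional parameter of interest

X = rng.standard_normal((n, p))            # high-dimensional controls x_i
beta = np.zeros(p)
beta[:s] = 1.0 / np.arange(1, s + 1)       # sparse, decaying outcome coefficients
gamma = np.zeros(p)
gamma[:s] = 0.5                            # sparse first-stage coefficients
d = X @ gamma + rng.standard_normal(n)     # regressor of interest, confounded by x_i
y = alpha * d + X @ beta + rng.standard_normal(n)   # outcome equation
```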
2. The PDS-LASSO Algorithm
PDS-LASSO proceeds as follows:
- First Stage (LASSO of $y$ on $x$): solve

$$\hat\beta = \arg\min_{b} \; \frac{1}{n}\sum_{i=1}^n (y_i - x_i'b)^2 + \frac{\lambda_y}{n}\|b\|_1,$$

and select $\hat S_y = \{j : \hat\beta_j \neq 0\}$ (Hué et al., 26 Nov 2025).
- Second Stage (LASSO of $d$ on $x$): solve the analogous problem

$$\hat\gamma = \arg\min_{g} \; \frac{1}{n}\sum_{i=1}^n (d_i - x_i'g)^2 + \frac{\lambda_d}{n}\|g\|_1,$$

and select $\hat S_d = \{j : \hat\gamma_j \neq 0\}$.
- Union of Selected Controls: $\hat S = \hat S_y \cup \hat S_d$. This ensures that any variable affecting either the regressor of interest or the outcome is included.
- Final Estimation (Post-LASSO OLS/COLS): regress $y_i$ on $d_i$ and $x_{i,\hat S}$ without penalization; the coefficient on $d_i$ is the PDS-LASSO estimate $\hat\alpha$.
For stochastic-frontier models, analogous three-block selection and estimation steps are performed, including cross-fitting or sample splitting to ensure independence of selection and estimation phases (Parmeter et al., 20 May 2025).
A table summarizing the three main stages (a code sketch follows the table):
| Stage | Operation | Selected Set |
|---|---|---|
| 1. LASSO of $y$ on $x$ | Penalized regression of the outcome on all controls | $\hat S_y$ |
| 2. LASSO of $d$ on $x$ | Penalized regression of the regressor of interest on all controls | $\hat S_d$ |
| 3. Final OLS | Unpenalized regression of $y$ on $d$ and $x_{\hat S}$ | $\hat S = \hat S_y \cup \hat S_d$ |
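The following is a minimal sketch of the three stages in Python, reusing `y`, `d`, `X` from the simulation in Section 1; `LassoCV` (cross-validated penalty) and the helper name `pds_lasso` are convenience assumptions here, not the theory-driven penalty rule of the cited papers.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def pds_lasso(y, d, X):
    # Stage 1: LASSO of the outcome y on the controls X -> selected set S_y
    s_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
    # Stage 2: LASSO of the regressor of interest d on X -> selected set S_d
    s_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)
    # Stage 3: unpenalized OLS of y on (1, d, union of selected controls)
    union = np.union1d(s_y, s_d)
    Z = np.column_stack([np.ones(len(y)), d, X[:, union]])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[1], union      # alpha-hat and the union S_y ∪ S_d

alpha_hat, selected = pds_lasso(y, d, X)   # y, d, X from the Section 1 sketch
```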
3. Neyman-Orthogonality and Bias Elimination
The principal advantage of PDS-LASSO lies in the orthogonal construction of the final estimating equation:
- Orthogonal Score: The resulting moment condition is first-order robust to small errors in the estimation of nuisance parameters. For the partially linear model, Neyman-orthogonality (partialling out the controls from both $y$ and $d$) ensures that specification mistakes in variable selection do not introduce leading-order bias in inference about $\alpha$ (Hué et al., 26 Nov 2025, Parmeter et al., 20 May 2025).
- In stochastic-frontier models: Construction of a Neyman-orthogonal moment via partialling out $x$ from $y$ and $w$ precedes estimation of the frontier parameters. This approach restores $\sqrt{n}$-consistency even when $p > n$ (Parmeter et al., 20 May 2025).
- Mathematical statement: Neyman-orthogonality holds if

$$\frac{\partial}{\partial \eta} \, E\big[\psi(W_i;\, \alpha_0,\, \eta)\big]\Big|_{\eta = \eta_0} = 0,$$

i.e., the derivative of the moment function $\psi$ with respect to the nuisance parameter $\eta$ vanishes at the true values. Such moment conditions yield final estimators whose asymptotic law does not depend on first-stage estimation errors of order $o(n^{-1/4})$.
- Implication: Single-selection LASSO (i.e., selection based only on the $y$-equation or only on the $d$-equation) can leave omitted-variable bias in the coefficient of interest that vanishes more slowly than $n^{-1/2}$, whereas PDS-LASSO reduces it to $o_P(n^{-1/2})$ (Parmeter et al., 20 May 2025).
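The partialling-out form of the orthogonal score can be sketched as below, again assuming Python/scikit-learn; `LassoCV` and the helper name `partialling_out` are illustrative, and a production version would add the cross-fitting step discussed in Section 5.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def partialling_out(y, d, X):
    # Residualize the outcome on the controls: y_res = y - x'beta-hat
    y_res = y - LassoCV(cv=5).fit(X, y).predict(X)
    # Residualize the regressor on the controls: d_res = d - x'gamma-hat
    d_res = d - LassoCV(cv=5).fit(X, d).predict(X)
    # OLS of y_res on d_res; the score (y_res - alpha * d_res) * d_res is
    # Neyman-orthogonal, so small errors in either LASSO fit have no
    # first-order effect on alpha-hat
    return (d_res @ y_res) / (d_res @ d_res)
```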
4. Theoretical Guarantees and Inference
Under approximate sparsity ($s^2 \log^2(p \vee n) = o(n)$), Restricted Eigenvalue assumptions, and appropriate regularization ($\lambda \asymp \sqrt{n \log p}$):
- √n-Consistency and Asymptotic Normality:

$$\sqrt{n}\,(\hat\alpha - \alpha_0) \;\xrightarrow{d}\; N(0, \sigma_\alpha^2),$$

with $\sigma_\alpha^2$ consistently estimated via plug-in residuals from the final OLS (Hué et al., 26 Nov 2025, Chiang et al., 2019).
- Oracle Inequalities: If the true support is contained in $\hat S$, the final estimator matches the performance (rate and limiting distribution) of the "oracle" estimator that knows the true support.
- Uniformly Valid Inference: As the final regression is Neyman-orthogonal, post-selection inference is valid without data-snooping corrections beyond robust/clustered standard errors as dictated by the error structure (Hué et al., 26 Nov 2025, Chiang et al., 2019).
- Extensions to Clustering: The method adapts to cluster settings by using cluster-robust standard errors, and by improving variance estimation to account for clustering in the design (Chiang et al., 2019).
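A minimal inference sketch under these guarantees, assuming `statsmodels` for the final OLS with heteroskedasticity-robust (HC1) standard errors; `X_union` stands for the controls indexed by $\hat S$ and, like the function name `pds_inference`, is an assumption carried over from the algorithm sketch.

```python
import numpy as np
import statsmodels.api as sm

def pds_inference(y, d, X_union):
    # Final OLS of y on (1, d, selected controls) with robust standard errors;
    # under clustered sampling, use cov_type="cluster" with groups=... instead
    Z = sm.add_constant(np.column_stack([d, X_union]))
    fit = sm.OLS(y, Z).fit(cov_type="HC1")
    alpha_hat = fit.params[1]              # coefficient on d
    ci_low, ci_high = fit.conf_int()[1]    # 95% confidence interval for alpha
    return alpha_hat, (ci_low, ci_high)
```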
5. Algorithmic Implementation and Practical Considerations
- Penalty selection: The LASSO penalty parameter $\lambda$ plays a critical role. If set too high, important covariates may be omitted (inducing finite-sample bias); if set too low, the estimator becomes high-variance due to over-selection (Hué et al., 26 Nov 2025). A plug-in rule is sketched after this list.
- Sample splitting/cross-fitting: Used to ensure the independence between variable selection and final estimation, improving validity of inference (Parmeter et al., 20 May 2025).
- Software: PDS-LASSO can be implemented via standard LASSO routines, often with post-processing for variable selection and final OLS.
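One theory-driven alternative to cross-validation is the plug-in penalty common in the PDS literature, $\lambda = 2c\,\hat\sigma\,\sqrt{n}\,\Phi^{-1}(1 - \gamma/(2p))$; the sketch below assumes the conventional constants $c = 1.1$ and $\gamma = 0.1/\log n$, and in practice $\hat\sigma$ is updated iteratively from preliminary residuals (scaling conventions also differ across implementations).

```python
import numpy as np
from scipy.stats import norm

def plugin_lambda(n, p, sigma_hat, c=1.1, gamma=None):
    # Plug-in LASSO penalty chosen to dominate the noise uniformly
    # over all p candidate coefficients
    if gamma is None:
        gamma = 0.1 / np.log(n)
    return 2 * c * sigma_hat * np.sqrt(n) * norm.ppf(1 - gamma / (2 * p))
```

Cross-fitting can be layered on top with, e.g., `sklearn.model_selection.KFold`, fitting the LASSO stages on one fold and running the final OLS on the held-out fold.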
6. Applications, Empirical Performance, and Comparisons
- Stochastic-Frontier (Efficiency) Analysis: For estimating efficiency frontiers in settings with "big (wide) data," PDS-LASSO enables reliable measurement of inefficiency parameters by robustly selecting environmental covariates and avoiding spurious "no-inefficiency" artifacts seen with naive or single-selection approaches (Parmeter et al., 20 May 2025).
- Empirical Evidence: Monte Carlo simulations and empirical studies highlight PDS-LASSO's small bias, accurate coverage, and robustness to high-dimensional control vectors, in contrast to single-LASSO and naive methods (Parmeter et al., 20 May 2025, Chiang et al., 2019).
- Limitations: In finite samples, particularly with strongly negatively correlated controls or weak signal variables just below the LASSO threshold, the method may suffer omitted variable bias if relevant covariates are excluded in both selection steps (Hué et al., 26 Nov 2025). Other selection methods (e.g., Post-Double-Autometrics) have been proposed to address these circumstances.
- Comparison to 2SLS: In many-instrument settings, PDS-LASSO avoids the weak-instrument and overidentification biases typical of traditional two-stage least squares, while maintaining inferential validity under high dimensionality (Parmeter et al., 20 May 2025).
7. Methodological Developments and Alternatives
- Post-Double-Autometrics: As an inference-based alternative, Post-Double-Autometrics utilizes classical t-tests in general-to-specific algorithms to ensure higher "potency" (fewer missed confounders), with superior finite-sample performance and smaller RMSE demonstrated in empirical and Monte Carlo comparisons (Hué et al., 26 Nov 2025).
- Post-Double Selection under Multi-way Clustering: The framework generalizes to arbitrary multi-way clustered sampling, provided cluster-robust variance estimation is employed (Chiang et al., 2019).
PDS-LASSO constitutes a core methodology for valid inference in high-dimensional linear and efficiency models under approximate sparsity, with theoretical properties and practical performance established across diverse econometric and statistical settings (Parmeter et al., 20 May 2025, Hué et al., 26 Nov 2025, Chiang et al., 2019).