
Instrumental Variable Regression: STIV Insights

Updated 8 October 2025
  • Instrumental variable regression is a method that uses external instruments to address endogeneity in causal inference.
  • The STIV estimator employs convex optimization with l1 penalties to enforce sparsity and manage weak instruments in high-dimensional settings.
  • Robust confidence sets and variable selection techniques ensure reliable inference even in complex models, as demonstrated in applications like the EASI demand system.

Instrumental variable regression is a central econometric and statistical method for inference on causal effects when explanatory variables are endogenous—i.e., correlated with the error term due to omitted variables, simultaneity, or measurement error. Modern variants address high-dimensional designs and weak identification by combining convex optimization, adaptive regularization, and identification-robust inference.

1. High-Dimensional Linear IV Modeling and Endogeneity

In high-dimensional IV regression, the number of regressors $d_x$ (potentially endogenous) and instruments $d_z$ can be comparable to or exceed the sample size $n$. The model is

$Y = X^{\top}\beta + U(\beta),$

with the key IV moment restriction

$\mathbb{E}[Z\,U(\beta)] = 0,$

with $Z$ the vector of instruments (possibly partially endogenous). The challenge is that only a sparse (or approximately sparse) subset of the high-dimensional parameter vector $\beta$ is nonzero, and proper identification must be maintained under many weak instruments and endogeneity.
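To make the setup concrete, the sketch below generates synthetic data satisfying the model and the moment restriction; the dimensions, coefficients, and variable names are illustrative choices, not part of the framework.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_x, d_z = 200, 50, 60                       # dimensions comparable to n

Z = rng.standard_normal((n, d_z))               # exogenous instruments
Pi = rng.standard_normal((d_z, d_x)) / np.sqrt(d_z)
V = rng.standard_normal((n, d_x))               # first-stage noise
X = Z @ Pi + V                                  # regressors driven by Z

beta0 = np.zeros(d_x)
beta0[:3] = [1.0, -0.5, 0.25]                   # sparse true coefficients
U = 0.5 * V[:, 0] + rng.standard_normal(n)      # error correlated with X[:, 0]
Y = X @ beta0 + U

# E[Z U] = 0 holds by construction, while E[X_1 U] != 0 (endogeneity):
print(np.abs(Z.T @ U / n).max())                # near zero
print(X[:, 0] @ U / n)                          # bounded away from zero
```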

2. The Self-Tuning Instrumental Variables (STIV) Estimator

To address high-dimensionality and endogeneity, the STIV estimator solves the convex program

$\min_{\beta,\;\sigma\ge 0}\,\Big\{\,\big\lVert D_X^{-1}\beta_{S_Q}\big\rVert_1 + c\,\sigma \,:\, \big\lVert D_Z\,\mathbb{E}_n\big[Z\,(Y-X^{\top}\beta)\big]\big\rVert_\infty \le \widehat{r}\,\sigma,\ \ \widehat{\sigma}(\beta) \le \sigma\,\Big\}$

where:

  • $D_X$ and $D_Z$ are diagonal scaling matrices (inverse sample standard deviations),
  • $S_Q$ is the set of regressors to be penalized for sparsity,
  • $c$ is a penalization constant,
  • $\widehat{r}$ is a data-driven tuning parameter for the moment violation,
  • $\widehat{\sigma}(\beta)$ is an estimate of the error standard deviation.

A key feature is the $\ell^1$ penalty on a subset of $\beta$, which enforces sparsity, combined with a constraint on the maximal empirical moment violation, rescaled by the estimated error standard deviation and the tuning parameter. The entire program is a linear program (or, with an alternative penalty, a conic program), solvable in polynomial time even with thousands of variables.
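A minimal sketch of this program in Python with cvxpy is given below, assuming $\widehat{r}$ is supplied by the self-tuning rule and penalizing all coefficients (i.e., $S_Q$ is the full set of regressors); with the residual-norm form of $\widehat{\sigma}(\beta)$ used here, the program is solved as a conic (second-order cone) problem.

```python
import numpy as np
import cvxpy as cp

def stiv(Y, X, Z, r_hat, c=1.0):
    """Sketch of the STIV program; r_hat and c are assumed given."""
    n, d_x = X.shape
    # D_X, D_Z hold inverse sample second-moment square roots, so the
    # penalty weights below are the diagonal entries of D_X^{-1}.
    DX_inv = np.sqrt((X ** 2).mean(axis=0))
    DZ = 1.0 / np.sqrt((Z ** 2).mean(axis=0))

    beta = cp.Variable(d_x)
    sigma = cp.Variable(nonneg=True)
    resid = Y - X @ beta

    constraints = [
        # ||D_Z E_n[Z (Y - X'beta)]||_inf <= r_hat * sigma
        cp.norm(cp.multiply(DZ, Z.T @ resid / n), "inf") <= r_hat * sigma,
        # sigma_hat(beta) <= sigma with sigma_hat(beta)^2 = E_n[(Y - X'beta)^2]
        cp.norm(resid, 2) / np.sqrt(n) <= sigma,
    ]
    objective = cp.Minimize(cp.norm1(cp.multiply(DX_inv, beta)) + c * sigma)
    cp.Problem(objective, constraints).solve()
    return beta.value, sigma.value
```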

3. Identification-Robust Inference and Sensitivity Analysis via Linear Programming

Classical confidence intervals for IV estimates suffer in high dimensions and under weak instruments. STIV constructs identification-robust confidence sets by test inversion, relying on the pivotal statistic

$\widehat{t}(b) = \frac{\big\lVert D_Z\,\mathbb{E}_n[Z\,U(b)]\big\rVert_\infty}{\widehat{\sigma}(b)},$

with the robust confidence set given by the sublevel set $\widehat{t}(b) \leq \widehat{r}$.
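In code the statistic is a few lines; a sketch (function name hypothetical):

```python
import numpy as np

def t_hat(b, Y, X, Z):
    """Pivotal statistic ||D_Z E_n[Z U(b)]||_inf / sigma_hat(b)."""
    n = len(Y)
    U = Y - X @ b                                # residuals U(b)
    DZ = 1.0 / np.sqrt((Z ** 2).mean(axis=0))    # instrument scalings
    return np.abs(DZ * (Z.T @ U / n)).max() / np.sqrt((U ** 2).mean())

# Test inversion keeps b in the robust confidence set iff t_hat(b) <= r_hat.
```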

Exact computation is infeasible in high dimension, so the approach convexifies the problem: it defines sensitivity constants

$\widehat{\kappa}_{\ell, S} = \min_{\Delta \in \widehat{K}_S,\;\ell(\Delta) = 1} \big\lVert\widehat{\Psi}\,\Delta\big\rVert_\infty, \qquad \widehat{\Psi} = D_Z\,\big[\mathbb{E}_n(Z X^{\top})\big]\,D_X,$

where $\widehat{K}_S$ is a cone encoding approximate sparsity. For any loss $\ell$, the estimation error is bounded as

$\ell\big(D_X^{-1}(\widehat{\beta}-\beta)\big) \leq \frac{2\,\widehat{r}\,\overline{\sigma}\,\gamma(\widehat{r}/\widehat{\kappa}_{g,S})}{\widehat{\kappa}_{\ell,S}},$

with the relevant constants and “inflation factor” $\gamma(\cdot)$. The sensitivity parameters $\widehat{\kappa}$ are themselves computed via linear programs, making the robust confidence sets computable in polynomial time, irrespective of $d_x$ or $d_z$.
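To illustrate the LP mechanics, the sketch below computes a sensitivity constant for the coordinate loss $\ell(\Delta)=\Delta_k$ under a simplified convex surrogate $\{\Delta_k = 1,\ \lvert\Delta\rvert_1 \le \rho\}$ in place of the cone $\widehat{K}_S$ (the paper's cone leads to a family of LPs; the surrogate here only demonstrates the reformulation). Splitting $\Delta = u - v$ with $u, v \ge 0$ turns the sup-norm objective into a linear program.

```python
import numpy as np
from scipy.optimize import linprog

def kappa_surrogate(Psi, k, rho):
    """LP for min ||Psi Delta||_inf s.t. Delta_k = 1, |Delta|_1 <= rho.

    rho >= 1 is required for feasibility.  Variables are stacked as
    x = [u, v, t] with Delta = u - v and t bounding the sup-norm.
    """
    d_z, d_x = Psi.shape
    nvar = 2 * d_x + 1
    cost = np.zeros(nvar)
    cost[-1] = 1.0                                   # minimize t

    A_ub = np.vstack([
        np.hstack([Psi, -Psi, -np.ones((d_z, 1))]),  #  (Psi Delta)_j <= t
        np.hstack([-Psi, Psi, -np.ones((d_z, 1))]),  # -(Psi Delta)_j <= t
        np.hstack([np.ones(d_x), np.ones(d_x), [0.0]]).reshape(1, -1),  # l1 budget
    ])
    b_ub = np.concatenate([np.zeros(2 * d_z), [rho]])

    A_eq = np.zeros((1, nvar))
    A_eq[0, k], A_eq[0, d_x + k] = 1.0, -1.0         # Delta_k = u_k - v_k = 1
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * nvar, method="highs")
    return res.fun
```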

The overall robust confidence set is

$\mathcal{C}(s) = \Big\{\,b\in \mathbb{B}: \forall \ell \in \mathcal{L},\;\ell\big(D_X^{-1}(\widehat{\beta}-b)\big) \leq 2\,\widehat{r}\,\overline{\sigma}\,\gamma\big(\widehat{r}/\widehat{\kappa}_g(s)\big)\big/\widehat{\kappa}_\ell(s)\,\Big\},$

ensuring validity even when instruments are many and weak.
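Once the sensitivity constants and the inflation factor are in hand, the coordinate-wise bands implied by $\mathcal{C}(s)$ are explicit. A sketch, with all constants assumed precomputed from the LPs above (the $\sqrt{\mathbb{E}_n[X_k^2]}$ factor undoes the $D_X$ scaling):

```python
import numpy as np

def robust_band_k(beta_hat, X, k, r_hat, sigma_bar, kappa_g, kappa_lk, gamma):
    """Interval for b_k from the bound on l_k(D_X^{-1}(beta_hat - b))."""
    radius = 2 * r_hat * sigma_bar * gamma(r_hat / kappa_g) / kappa_lk
    half = radius / np.sqrt((X[:, k] ** 2).mean())   # undo the scaling of b_k
    return beta_hat[k] - half, beta_hat[k] + half
```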

4. Convergence Rates, Variable Selection, and Adaptation to Sparsity

The estimator’s finite-sample error is nonasymptotically controlled:

$\big\lvert D_X^{-1}(\widehat{\beta} - \beta)\big\rvert_q \lesssim \frac{2\,\widehat{r}\,\overline{\sigma}\,\gamma\big(\widehat{r}/\widehat{\kappa}_{g,S(\beta)}\big)}{\widehat{\kappa}_{\ell, S(\beta)}},$

so the error is determined by the chosen slack $\widehat{r}$, the noise level, the model sparsity $s=|S(\beta)|$, and instrument/design complexity via $\widehat{\kappa}$.

Support recovery (“exact variable selection”) is obtained by thresholding: for each component,

$\widehat{\beta}_k^{(\widehat{\omega})} = \widehat{\beta}_k \cdot \mathbb{1}\Big\{\sqrt{\mathbb{E}_n[X_k^2]}\,\big\lvert\widehat{\beta}_k\big\rvert > \widehat{\omega}_k(s)\Big\}, \qquad \widehat{\omega}_k(s) = \frac{2\,\widehat{r}\,\overline{\sigma}\,\gamma\big(\widehat{r}/\widehat{\kappa}_g(s)\big)}{\widehat{\kappa}_{\ell_k}(s)}.$

If nonzero coefficients exceed these data-driven thresholds (a “beta-min” condition), their support is consistently identified.
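A direct implementation of the thresholding rule, assuming the thresholds $\widehat{\omega}_k(s)$ have already been computed from the sensitivity constants as in the display above:

```python
import numpy as np

def threshold_select(beta_hat, X, omega):
    """Zero out beta_hat_k unless the scaled coefficient clears omega_k(s)."""
    scale = np.sqrt((X ** 2).mean(axis=0))        # sqrt(E_n[X_k^2])
    keep = scale * np.abs(beta_hat) > omega
    return np.where(keep, beta_hat, 0.0), np.flatnonzero(keep)
```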

5. Empirical Application: The EASI Demand System

The EASI (Exact Affine Stone Index) demand system is a flexible, high-dimensional model for household expenditure shares using series expansions or polynomials (often with thousands of regressors). Challenges include endogeneity of high-order terms, economic restrictions (homogeneity, symmetry), and approximate sparsity.

STIV is particularly suited because:

  • The LP (or conic) structure allows direct solution with thousands of variables.
  • Tuning (the choice of $\widehat{r}$ and penalties) is self-calibrating, depending only on the data.
  • Confidence sets remain accurate under weak identification.
  • Variable selection procedures efficiently identify relevant basis components.
  • In application, second-order (quadratic) EASI approximation, estimated with STIV, significantly reduces demand estimation error compared to first-order or conventional two-stage methods. The resulting confidence bands for Engel curves are robust and informative, indicating, for instance, valid identification of peer effects and price elasticities.

6. Implementation and Computational Aspects

STIV’s convex formulation allows direct, efficient computation via off-the-shelf LP or conic programming solvers, even in very high-dimensional problems. Sensitivity analysis and construction of robust confidence sets reduce to a small collection of LPs per confidence region or variable selection problem.

Typical workflow:

  1. Standardize $X$ and $Z$ via $D_X$, $D_Z$.
  2. Formulate and solve the STIV program for $\widehat{\beta}$ and $\widehat{\sigma}$.
  3. Solve linear programs for sensitivity constants as needed for confidence sets or error bounds.
  4. Apply thresholding rules for variable selection and support recovery.
  5. Construct robust, identification-valid confidence bands via the convexified region.

The approach is self-tuning: critical parameters like $\widehat{r}$ are set using moderate deviation bounds, making the procedure largely automatic.
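Putting the workflow together on the synthetic data from Section 1 (the constant 1.1 and the moderate-deviation form of $\widehat{r}$ below are illustrative stand-ins for the paper's exact rule):

```python
# Continues the synthetic-data example; stiv() and t_hat() are defined above.
r_hat = 1.1 * np.sqrt(2.0 * np.log(d_z) / n)   # moderate-deviation style choice
beta_hat, sigma_hat = stiv(Y, X, Z, r_hat)

print("selected support:", np.flatnonzero(np.abs(beta_hat) > 1e-6))
print("pivotal statistic at beta_hat:", t_hat(beta_hat, Y, X, Z))
```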

7. Summary of Theoretical and Practical Contributions

The STIV framework enables inference in linear IV models with many endogenous regressors, controlling for weak identification, high-dimensionality, and approximate model sparsity. Key advances are:

  • Convex (LP/conic) penalized estimation with self-normalizing moment constraints.
  • Robust confidence sets via LP-based computation, valid uniformly over sizable identification regions.
  • Nonasymptotic convergence rates scaling with sparsity and instrument strength.
  • Data-adaptive variable selection with explicit beta-min thresholds.
  • Empirical validation on complex structural systems where both model approximation and endogeneity are severe.

This methodology offers a statistically principled and computationally tractable approach for modern econometric inference in settings with large-scale, endogenous, and potentially weakly-identified systems (Gautier et al., 2011).

References

  1. Gautier, E., & Tsybakov, A. B. (2011). High-dimensional instrumental variables regression and confidence sets. arXiv:1105.2454.
