Neyman-Orthogonal Score in Semiparametrics

Updated 29 May 2026

A Neyman-orthogonal score is an estimating equation for a target parameter that remains first-order insensitive to errors in high-dimensional nuisance estimates.
It underpins debiased and double machine learning methods, ensuring asymptotic normality when nuisance estimators converge faster than $n^{-1/4}$.
The approach aligns with semiparametric efficiency via pathwise differentiability and guides robust regressor balancing in complex models.

A Neyman-orthogonal score is an estimating equation for a low-dimensional target parameter that is constructed to be locally (specifically, first-order) insensitive to errors in the estimation of high- or infinite-dimensional nuisance parameters. Neyman orthogonality underpins debiased/double machine learning (DML), provides a geometric link to pathwise differentiability and semiparametric efficiency, and enables efficient estimation and valid inference when nuisance functions are estimated via flexible, possibly slow-converging, machine learning methods. The key property is that plugging in slightly misspecified nuisance estimates has only a second-order effect on the bias of the target parameter estimator, so $n^{-1/2}$ rates and asymptotic normality are achievable as long as the nuisance estimators converge faster than $n^{-1/4}$. Modern methods for causal inference, high-dimensional inference, and robust statistical learning exploit Neyman-orthogonal scores to construct estimators and test statistics with strong robustness to nuisance misspecification (Kato, 7 May 2026, Chen et al., 16 Mar 2026, Chernozhukov et al., 2017, Foster et al., 2019).

1. Rigorous Definition and Structural Form

The Neyman-orthogonal (efficient) score $\psi(W;\eta_0,\theta_0)$ is constructed for a model with observed data $W=(X,Y)$ (generally $X = (D,Z)$ in causal inference) and low-dimensional target $\theta_0$. The parameter of interest is typically defined as $\theta_0 = \mathbb{E}[m(W;\gamma_0)]$, with $m(W;\gamma)$ depending linearly (and mean-square continuously) on a nuisance regression $\gamma_0(x) = \mathbb{E}[Y|X=x]$. By the Riesz representation theorem, there exists a unique representer $\alpha_0(X)$ such that:

$$\mathbb{E}[m(W;\gamma)] = \mathbb{E}[\alpha_0(X)\,\gamma(X)]$$

for all $\gamma$ with $\mathbb{E}[\gamma(X)^2] < \infty$. The score is:

$$\psi(W;\eta,\theta) = m(W;\gamma) + \alpha(X)\,\big(Y - \gamma(X)\big) - \theta$$

with $\eta = (\gamma,\alpha)$.

The orthogonality property is formalized as: $\partial_\eta \mathbb{E}[\psi(W;\eta,\theta_0)]\big|_{\eta=\eta_0} = 0$ or, for any direction $h$,

$$\frac{d}{dt}\,\mathbb{E}\big[\psi(W;\eta_0 + t h,\theta_0)\big]\Big|_{t=0} = 0.$$

This ensures first-order insensitivity: changes in the nuisance parameters around the truth do not alter the expectation of the score at leading order (Kato, 7 May 2026, Chen et al., 16 Mar 2026, Chernozhukov et al., 2017).
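As an illustration, the following minimal sketch checks first-order insensitivity numerically for the average treatment effect, taking the score to be $m(W;\gamma) + \alpha(X)(Y-\gamma(X)) - \theta$ with $m(W;\gamma) = \gamma(1,Z) - \gamma(0,Z)$ and $\alpha_0(D,Z) = D/e(Z) - (1-D)/(1-e(Z))$. The discrete model, the probabilities, and the perturbation directions `H_GAMMA` and `H_ALPHA` are all hypothetical values chosen for the example; expectations are computed exactly by summing over cells, so no simulation noise is involved.

```python
from itertools import product

# Hypothetical discrete model: binary covariate Z and binary treatment D.
P_Z = {0: 0.4, 1: 0.6}                     # P(Z = z)
PROPENSITY = {0: 0.3, 1: 0.7}              # e(z) = P(D = 1 | Z = z)
GAMMA0 = {(0, 0): 1.0, (1, 0): 2.0,        # gamma_0(d, z) = E[Y | D = d, Z = z]
          (0, 1): 0.5, (1, 1): 3.0}
# Arbitrary perturbation directions for the two nuisances (assumptions).
H_GAMMA = {(0, 0): 0.5, (1, 0): -1.0, (0, 1): 0.8, (1, 1): 0.3}
H_ALPHA = {(0, 0): 1.0, (1, 0): 0.4, (0, 1): -0.6, (1, 1): 0.9}

# True ATE: average over Z of gamma_0(1, z) - gamma_0(0, z).
THETA0 = sum(P_Z[z] * (GAMMA0[(1, z)] - GAMMA0[(0, z)]) for z in P_Z)


def cell_prob(d, z):
    """P(D = d, Z = z)."""
    return P_Z[z] * (PROPENSITY[z] if d == 1 else 1.0 - PROPENSITY[z])


def alpha0(d, z):
    """Riesz representer of the ATE functional."""
    return d / PROPENSITY[z] - (1 - d) / (1.0 - PROPENSITY[z])


def gamma_t(d, z, t):
    """Regression nuisance perturbed by t in direction H_GAMMA."""
    return GAMMA0[(d, z)] + t * H_GAMMA[(d, z)]


def mean_orthogonal_score(t):
    """Exact E[psi(W; eta_0 + t*h, theta_0)], using E[Y | D, Z] = gamma_0."""
    total = 0.0
    for d, z in product((0, 1), (0, 1)):
        m = gamma_t(1, z, t) - gamma_t(0, z, t)
        alpha = alpha0(d, z) + t * H_ALPHA[(d, z)]
        mean_residual = GAMMA0[(d, z)] - gamma_t(d, z, t)
        total += cell_prob(d, z) * (m + alpha * mean_residual - THETA0)
    return total


def mean_plugin_moment(t):
    """Exact E[m(W; gamma_0 + t*h)] - theta_0, the non-orthogonal plug-in moment."""
    return sum(P_Z[z] * (gamma_t(1, z, t) - gamma_t(0, z, t)) for z in P_Z) - THETA0


for t in (0.2, 0.1, 0.05):
    print(t, mean_plugin_moment(t), mean_orthogonal_score(t))
```

In this setup, halving `t` halves the plug-in moment's bias but quarters the orthogonal score's bias: the first-order term cancels and only the product of the two nuisance perturbations remains.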

2. Connection to Semiparametric Efficiency and Pathwise Differentiability

Neyman-orthogonality is formally equivalent, under local product structure and smoothness, to pathwise differentiability in the sense of modern semiparametric theory. Pathwise differentiability asserts the existence of an influence function $\phi(W)$ such that, for every smooth submodel $\{P_t\}$ through $P_0$, the directional derivative of the parameter functional is $\frac{d}{dt}\theta(P_t)\big|_{t=0} = \mathbb{E}[\phi(W)\,s(W)]$, where $s(W)$ is the score of the submodel (Chen et al., 16 Mar 2026).

Equivalence is formalized via:

If $\psi$ is Neyman-orthogonal ( $\partial_\eta \mathbb{E}[\psi(W;\eta,\theta_0)]\big|_{\eta=\eta_0} = 0$ ) and $J_0$ is nonsingular, then the parameter is pathwise differentiable with influence function $\phi(W) = -J_0^{-1}\psi(W;\eta_0,\theta_0)$, where $J_0 = \partial_\theta \mathbb{E}[\psi(W;\eta_0,\theta)]\big|_{\theta=\theta_0}$ (Chen et al., 16 Mar 2026).
Conversely, the existence of an efficient influence function $\phi^*(W)$ implies that, under local product structure, there exists a score $\psi$ such that $\psi(W;\eta_0,\theta_0) = \phi^*(W)$ and $\psi$ is Neyman-orthogonal.

Semiparametric efficiency theory implies that an estimator constructed by solving $\frac{1}{n}\sum_{i=1}^n \psi(W_i;\hat\eta,\theta) = 0$ is asymptotically efficient if the score is the Neyman-orthogonal (efficient) score and the nuisance is estimated at an adequate rate (Chen et al., 16 Mar 2026, Chernozhukov et al., 2017).

3. Error Structure and the Role of Regression Balancing

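When plugging in nuisance estimators $\hat\eta = (\hat\gamma,\hat\alpha)$, the empirical score error, or "Neyman error", can be decomposed as:

$$\frac{1}{n}\sum_{i=1}^n \big[\psi(W_i;\hat\eta,\theta_0) - \psi(W_i;\eta_0,\theta_0)\big] = \frac{1}{n}\sum_{i=1}^n \big(\hat\alpha(X_i) - \alpha_0(X_i)\big)\,\varepsilon_i + \frac{1}{n}\sum_{i=1}^n \Big[m(W_i;\hat\gamma - \gamma_0) - \hat\alpha(X_i)\,\big(\hat\gamma(X_i) - \gamma_0(X_i)\big)\Big],$$

where $\varepsilon_i = Y_i - \gamma_0(X_i)$ is the regression noise.

This separates into a mean-zero noise component (the first sum) and a "balancing gap" drift term that is deterministic given the regressors and the nuisance estimates: $\frac{1}{n}\sum_{i=1}^n \big[m(W_i;\hat\gamma - \gamma_0) - \hat\alpha(X_i)\,(\hat\gamma(X_i) - \gamma_0(X_i))\big]$.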

Exact balancing (eliminating the deterministic component) requires that the empirical means of certain functions of the regressors, specified by the Riesz representer and the error structure, are matched. This motivates "regressor balancing" via Riesz regression, which directly targets the functions relevant for the error in the Neyman-orthogonal score. In contrast, "covariate balancing" using functions of $Z$ alone suffices only when the score error depends on covariates and not on their interactions with treatment. Failure to address the interaction leads to residual bias, e.g., in the presence of treatment effect heterogeneity (Kato, 7 May 2026).
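The split of the Neyman error into a noise part and a balancing gap can be verified numerically. The sketch below assumes the ATE score $m(W;\gamma) + \alpha(X)(Y - \gamma(X)) - \theta$ with $m(W;\gamma) = \gamma(1,Z) - \gamma(0,Z)$; the data-generating process and the deliberately imperfect `gamma_hat` and `alpha_hat` are hypothetical stand-ins for fitted nuisances. It checks that the score difference equals the sum of $(\hat\alpha - \alpha_0)\,\varepsilon$ and $m(W;\hat\gamma-\gamma_0) - \hat\alpha\,(\hat\gamma - \gamma_0)$, averaged over the sample.

```python
import random
from statistics import mean

random.seed(0)


def propensity(z):
    """True e(z) = P(D = 1 | Z = z) for z in [0, 1]."""
    return 0.3 + 0.4 * z


def gamma0(d, z):
    """True regression E[Y | D = d, Z = z], with a treatment-covariate interaction."""
    return 1.0 + z + d * (2.0 + 1.5 * z)


def alpha0(d, z):
    """True Riesz representer for the ATE."""
    return d / propensity(z) - (1 - d) / (1 - propensity(z))


# Deliberately imperfect stand-ins for fitted nuisances (hypothetical).
def gamma_hat(d, z):
    return 1.1 + 0.9 * z + 2.6 * d          # ignores the D*Z interaction


def alpha_hat(d, z):
    e_hat = 0.35 + 0.3 * z
    return d / e_hat - (1 - d) / (1 - e_hat)


def m(gamma, z):
    """ATE functional: m(W; gamma) = gamma(1, Z) - gamma(0, Z)."""
    return gamma(1, z) - gamma(0, z)


def score_plus_theta(gamma, alpha, d, z, y):
    """psi + theta; theta cancels when two scores are differenced."""
    return m(gamma, z) + alpha(d, z) * (y - gamma(d, z))


data = []
for _ in range(2000):
    z = random.random()
    d = 1 if random.random() < propensity(z) else 0
    y = gamma0(d, z) + random.gauss(0.0, 1.0)
    data.append((d, z, y))

neyman_error = mean(
    score_plus_theta(gamma_hat, alpha_hat, d, z, y)
    - score_plus_theta(gamma0, alpha0, d, z, y)
    for d, z, y in data
)
noise = mean((alpha_hat(d, z) - alpha0(d, z)) * (y - gamma0(d, z)) for d, z, y in data)
balancing_gap = mean(
    (m(gamma_hat, z) - m(gamma0, z))
    - alpha_hat(d, z) * (gamma_hat(d, z) - gamma0(d, z))
    for d, z, y in data
)
print(neyman_error, noise + balancing_gap)
```

The two printed numbers agree up to floating-point error because the decomposition is an algebraic identity for scores of this form. The balancing gap vanishes only if `alpha_hat` reproduces the sample mean of `m` for the regression error, which here depends on both `d` and `z`.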

4. Construction and Use in Machine Learning-Based Estimation

In double/debiased machine learning, estimation proceeds as follows:

Estimate nuisance components (e.g., regression function $\gamma_0$, propensity score, Riesz representer $\alpha_0$) using arbitrary ML or regularized methods, possibly with cross-fitting to avoid overfitting bias (Chernozhukov et al., 2017, Nekipelov et al., 2018).
Construct the Neyman-orthogonal score $\psi(W;\hat\eta,\theta)$.
Solve the empirical moment equation in $\theta$ using the score.

The essential property is that any small bias in the nuisance estimation, by orthogonality, affects $\hat\theta$ only to second order; if the nuisance estimators converge faster than $n^{-1/4}$, $\sqrt{n}$-consistent and asymptotically normal inference for the target is possible. Explicit forms for the score and cross-fitting algorithms are central to this approach (Chernozhukov et al., 2017, Nekipelov et al., 2018, Foster et al., 2019). Riesz regression and higher-order orthogonalization generalize this construction to complex function classes and models with slow-converging nuisances (Kato, 7 May 2026, Bonhomme et al., 11 May 2026).
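The three steps can be sketched end to end for the average treatment effect. This is a minimal illustration under stated assumptions, not a reference implementation: the simulated data, the discrete covariate, and the cell-mean nuisance fits in `fit_nuisances` are hypothetical stand-ins for whatever ML learner would be used in practice, and the helper names are introduced here. Because the score $m(W;\gamma) + \alpha(X)(Y-\gamma(X)) - \theta$ is linear in $\theta$, solving the empirical moment equation reduces to averaging the held-out score terms.

```python
import random
from statistics import mean, stdev


def simulate(n, seed=0):
    """Hypothetical DGP with heterogeneous effect 1 + 0.5*Z; true ATE is 1.5."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.choice((0, 1, 2))
        d = 1 if rng.random() < 0.3 + 0.2 * z else 0      # e(z) in {0.3, 0.5, 0.7}
        y = 1.0 + z + d * (1.0 + 0.5 * z) + rng.gauss(0.0, 1.0)
        rows.append((d, z, y))
    return rows


def fit_nuisances(train):
    """Step 1: cell-mean estimates of gamma(d, z) and the propensity e(z)."""
    gamma, prop = {}, {}
    for z in (0, 1, 2):
        in_z = [r for r in train if r[1] == z]
        prop[z] = mean(r[0] for r in in_z)
        for d in (0, 1):
            gamma[(d, z)] = mean(r[2] for r in in_z if r[0] == d)
    return gamma, prop


def score_terms(rows, gamma, prop):
    """Step 2: psi + theta at each held-out observation."""
    out = []
    for d, z, y in rows:
        alpha = d / prop[z] - (1 - d) / (1 - prop[z])     # plug-in Riesz representer
        out.append(gamma[(1, z)] - gamma[(0, z)] + alpha * (y - gamma[(d, z)]))
    return out


def dml_ate(rows, n_folds=2):
    """Cross-fitting: nuisances are fit on the other folds, scores use the held-out fold."""
    folds = [rows[k::n_folds] for k in range(n_folds)]
    terms = []
    for k, held_out in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != k for r in fold]
        gamma, prop = fit_nuisances(train)
        terms.extend(score_terms(held_out, gamma, prop))
    theta_hat = mean(terms)                      # Step 3: root of the empirical moment
    std_err = stdev(terms) / len(terms) ** 0.5   # plug-in standard error
    return theta_hat, std_err


theta_hat, std_err = dml_ate(simulate(4000))
print(f"ATE estimate {theta_hat:.3f} (s.e. {std_err:.3f})")
```

Interleaved fold assignment (`rows[k::n_folds]`) is adequate here only because the simulated rows are i.i.d.; with real data a random split would be the safer choice.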

5. Generalizations: Higher-Order Orthogonality and Bayesian Settings

Higher-order Neyman orthogonality demands that all mixed partial derivatives up to order $k$ of the population moment vanish at the truth. This further reduces sensitivity to nuisance error: the estimator's bias is $O(\|\hat\eta - \eta_0\|^{k+1})$, enabling root-$n$-consistency for nuisance rates as slow as $n^{-1/(2k+2)}$ (Mackey et al., 2017, Bonhomme et al., 11 May 2026, Bonhomme et al., 2024). Closed-form constructions for higher-order-orthogonal scores and their use in bias-corrected GMM and panel models have been developed (Bonhomme et al., 11 May 2026, Bonhomme et al., 2024).
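The rate arithmetic behind this relaxation can be spelled out in a short sketch. It assumes (the symbols $r$ and $k$ are introduced here for illustration) that the nuisance error satisfies $\|\hat\eta - \eta_0\| = O_p(n^{-r})$ and that orthogonality of order $k$ leaves a bias of order $\|\hat\eta - \eta_0\|^{k+1}$. The bias is negligible relative to the $n^{-1/2}$ sampling error when

$$\sqrt{n}\cdot O_p\big(n^{-r(k+1)}\big) = O_p\big(n^{1/2 - r(k+1)}\big) \to 0 \quad\Longleftrightarrow\quad r > \frac{1}{2(k+1)}.$$

For $k=1$ this recovers the familiar requirement $r > 1/4$; for $k=2$ it relaxes to $r > 1/6$, and for $k=3$ to $r > 1/8$.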

In Bayesian inference, Neyman orthogonality justifies two-step plug-in procedures for the posterior of the target parameter: the plug-in bias is negligible if a Neyman orthogonal score is used, and credible intervals based on the Bayesian bootstrap achieve asymptotically correct frequentist coverage (Sabbagh et al., 23 Feb 2026).

6. Practical Implications and Guidelines

Practical deployment demands that:

The orthogonal score is constructed to match the structure of the regression error; regressor balancing is necessary in the presence of interactions between treatment and covariates or general heterogeneity (Kato, 7 May 2026).
For ATT estimation, error is purely a function of $Z$, so covariate balancing suffices; for ATE under heterogeneity, balancing of treatment-covariate interactions is essential (Kato, 7 May 2026).
Plug-in estimators using orthogonal scores admit cross-fitting and ML-based nuisance estimation, yielding valid inference as long as the nuisance estimators converge faster than $n^{-1/4}$, without requiring parametric $n^{-1/2}$ rates (Chernozhukov et al., 2017, Nekipelov et al., 2018, Foster et al., 2019).

The construction of Neyman-orthogonal scores is foundational to the validity and performance of modern high-dimensional inference, causal effect estimation, policy evaluation, and semiparametric learning frameworks.

Key references: (Kato, 7 May 2026, Chen et al., 16 Mar 2026, Chernozhukov et al., 2017, Bonhomme et al., 11 May 2026, Mackey et al., 2017, Nekipelov et al., 2018, Sabbagh et al., 23 Feb 2026, Ren et al., 29 Apr 2026, Foster et al., 2019, Bonhomme et al., 2024).