Weighted Average Derivative Effects

Updated 9 December 2025

Weighted Average Derivative Effects (WADEs) are defined by integrating the local derivative of a conditional mean function with an arbitrary weighting measure, generalizing ADEs.
They facilitate efficient semiparametric estimation and robust causal inference across frameworks like quantile IV and kernel regression.
Optimal weighting, Riesz representer techniques, and kernel-based methods enhance WADEs’ practical implementation and policy interpretation.

Weighted average derivative effects (WADEs) generalize the concept of average derivative effects (ADEs) by integrating the local derivative of a structural or conditional mean function against an arbitrary weighting measure, producing a scalar estimand of central interest in nonparametric analysis, semiparametric efficiency, and causal inference. WADEs subsume the ADE as a special case, admit rich causal and policy interpretations, and serve as canonical functionals in quantile IV, kernel regression, and doubly robust estimation frameworks.

1. Formal Definition and General Properties

Let $(Y, A, Z)$ denote observed outcome, continuous exposure or treatment, and covariates, respectively. The conditional mean outcome function is $\mu(a, z) = E[Y \mid A = a, Z = z]$ , with $a \mapsto \mu(a, z)$ differentiable. For a weight function $w(a, z)$ satisfying $E\{w(A, Z)\} \neq 0$ , the WADE is

$\theta_w = E[w(A, Z) \, \partial_a \mu(A, Z)].$

When $w$ is nonnegative and normalized such that $E\{w(A, Z)\}=1$ , $\theta_w$ is a weighted average of the local slopes $\partial_a \mu(A, Z)$ . The special case $w(a, z) = 1$ yields the unweighted ADE, $\theta = E[\partial_a \mu(A, Z)]$ . In conditional mean regression $g(x) = E[Y \mid X=x]$ with $X$ having density $f$ , the WADE associated with $w(x)$ is $\theta_w = E[w(X) \, g'(X)] = \int w(x) \, g'(x) \, f(x) \, dx$ (Hines et al., 2023, Cattaneo et al., 2022).
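As a rough illustration of the definition, the sketch below approximates $\theta_w$ by Monte Carlo in a toy model where the slope $\partial_a \mu$ is known in closed form. The data-generating process, the function names `dmu_da` and `wade`, and the Gaussian-shaped weight are all assumptions made for this example; none of them come from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
Z = rng.normal(size=n)                       # covariate
A = 1.0 + 0.5 * Z + rng.normal(size=n)       # continuous exposure (toy design)

def dmu_da(a, z):
    """Slope in a of the toy conditional mean mu(a, z) = a**2 + a*z."""
    return 2.0 * a + z

def wade(weight, a, z):
    """Monte Carlo approximation of theta_w = E[w(A, Z) * d/da mu(A, Z)]."""
    return np.mean(weight(a, z) * dmu_da(a, z))

# w = 1 recovers the unweighted ADE; here E[2A + Z] = 2 by construction.
ade = wade(lambda a, z: np.ones_like(a), A, Z)

# A hypothetical weight that down-weights extreme exposures, rescaled so
# that its sample mean is one (the normalization E{w(A, Z)} = 1).
raw = np.exp(-0.5 * (A - 1.0) ** 2)
theta_w = wade(lambda a, z: raw / raw.mean(), A, Z)
```

In this design the unweighted ADE targets 2, while the normalized weight gives a different average because it concentrates on exposures near 1.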

2. Riesz Representer Characterization and Influence Functions

WADEs are bounded linear functionals on the Hilbert space $\mathcal{H}$ of square-integrable functions of $(A, Z)$ , yielding a representation via the Riesz representer $\alpha_w$ :

$\theta_w = \langle \alpha_w, \mu \rangle = E[\alpha_w(A, Z) \, Y].$

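By integration by parts in $a$ (assuming $w(a, z) f(a \mid z)$ vanishes at the boundary of the support of $A$ given $Z=z$ ),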

$\alpha_w(a, z) = -\frac{\partial}{\partial a} w(a, z) - w(a, z) \frac{\partial_a f(a \mid z)}{f(a \mid z)},$

where $f(a \mid z)$ denotes the conditional density of $A$ given $Z=z$ . With estimates $\hat\mu$ and $\hat\alpha_w$ of the two nuisance functions, the doubly robust one-step estimator is then

$\hat\theta_w = n^{-1} \sum_{i=1}^n [\hat\alpha_w(A_i, Z_i) \{ Y_i - \hat\mu(A_i, Z_i) \} + w(A_i, Z_i) \partial_a \hat\mu(A_i, Z_i)].$

This approach enables valid inference using cross-fitted, machine-learned nuisance estimators, provided they are mean-square consistent at sufficiently fast rates (Hines et al., 2023, Hines et al., 2021).
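The one-step form can be sketched numerically as follows. This is a minimal illustration under strong assumptions: a toy Gaussian design in which the Riesz representer for $w = 1$ is known exactly ( $\alpha_w(a, z) = a - E[A \mid Z=z]$ when $A \mid Z$ has unit conditional variance), paired with a deliberately misspecified outcome model to illustrate the double robustness. The function `one_step_wade` and all variable names are hypothetical.

```python
import numpy as np

def one_step_wade(Y, A, Z, w, alpha, mu_hat, dmu_hat):
    """One-step estimate of theta_w and a plug-in standard error.

    alpha, mu_hat, dmu_hat are callables (a, z) -> array supplied by the user.
    """
    phi = alpha(A, Z) * (Y - mu_hat(A, Z)) + w(A, Z) * dmu_hat(A, Z)
    return phi.mean(), phi.std(ddof=1) / np.sqrt(len(Y))

rng = np.random.default_rng(0)
n = 200_000
Z = rng.normal(size=n)
A = 1.0 + 0.5 * Z + rng.normal(size=n)       # A | Z ~ N(1 + 0.5 Z, 1)
Y = A**2 + A * Z + rng.normal(size=n)        # true mu(a, z) = a**2 + a*z

w = lambda a, z: np.ones_like(a)             # unweighted ADE
# For w = 1 and a Gaussian conditional density with unit variance,
# -d/da log f(a | z) = a - (1 + 0.5 z). Treated as known here (an assumption).
alpha = lambda a, z: a - (1.0 + 0.5 * z)

# Deliberately wrong outcome model mu_hat(a, z) = a, with slope 1.
mu_bad = lambda a, z: a
dmu_bad = lambda a, z: np.ones_like(a)

theta_hat, se_hat = one_step_wade(Y, A, Z, w, alpha, mu_bad, dmu_bad)
```

The target here is $E[2A + Z] = 2$ . Because the representer is correct, the residual term compensates for the misspecified outcome model; in practice both nuisances would be estimated, typically with cross-fitting.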

3. Semiparametric Efficiency, Optimal Weights, and Variance Bounds

The efficiency bound for WADEs is determined by the variance of the influence function. Denote the conditional outcome variance, which may be heteroscedastic, by $\sigma^2(a, z) = \mathrm{Var}(Y \mid A=a, Z=z)$ . Among all admissible weights, the Riesz representer $\alpha^*$ of the optimal weight, which minimizes the nonparametric efficiency bound, satisfies

$\alpha^*(a, z) = \frac{\{a - \tilde\pi(z)\}/\sigma^2(a, z)}{E[\{A-\tilde\pi(Z)\} A / \sigma^2(A, Z)]},$

with $\tilde\pi(z)$ defined as

$\tilde\pi(z) = \frac{E[A/\sigma^2(A, Z) \mid Z=z]}{E[1/\sigma^2(A, Z) \mid Z=z]}.$

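Under homoscedasticity, with $\sigma^2(a, z)$ constant, the optimal WADE simplifies to

$\Psi = \frac{E\{\mathrm{Cov}(A, Y\mid Z)\}}{E\{\mathrm{Var}(A\mid Z)\}},$

which can be estimated by sample-split cross-fitting, where $\hat\pi(z)$ and $\hat\mu(z)$ denote out-of-fold estimates of $E[A \mid Z=z]$ and $E[Y \mid Z=z]$ (so $\hat\mu$ here is a function of $z$ alone):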

$\hat\Psi = \frac{\sum_{i=1}^n (A_i - \hat\pi(Z_i))(Y_i - \hat\mu(Z_i))}{\sum_{i=1}^n (A_i - \hat\pi(Z_i))^2}.$
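The homoscedastic simplification can be verified from the definitions of $\alpha^*$ and $\tilde\pi$ in a few lines. The steps below are a sketch, writing $\pi(z) = E[A \mid Z=z]$ (a symbol introduced here for convenience) and assuming $\sigma^2(a, z) \equiv \sigma^2$ . First, the constant variance cancels in $\tilde\pi$ :

$\tilde\pi(z) = \frac{E[A \mid Z=z]/\sigma^2}{1/\sigma^2} = \pi(z).$

Second, since $E[\{A - \pi(Z)\} \pi(Z)] = 0$ , the normalizing constant is

$E[\{A - \pi(Z)\} A]/\sigma^2 = E[\{A - \pi(Z)\}^2]/\sigma^2 = E\{\mathrm{Var}(A \mid Z)\}/\sigma^2,$

so that $\sigma^2$ cancels again and

$\alpha^*(a, z) = \frac{a - \pi(z)}{E\{\mathrm{Var}(A \mid Z)\}}.$

Finally, using $\theta = E[\alpha(A, Z) \, Y]$ and $E[\{A - \pi(Z)\} Y] = E\{\mathrm{Cov}(A, Y \mid Z)\}$ ,

$\Psi = E[\alpha^*(A, Z) \, Y] = \frac{E\{\mathrm{Cov}(A, Y \mid Z)\}}{E\{\mathrm{Var}(A \mid Z)\}}.$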

The nonparametric efficiency bound for the optimally weighted estimand $\theta^* = E[\alpha^*(A, Z) \, Y]$ is

$V^* = E[\alpha^*(A, Z)^2 \sigma^2(A, Z)],$

with consistent variance estimation via plug-in influence-function sample averages (Hines et al., 2023, Hines et al., 2021).
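A cross-fitted version of $\hat\Psi$ with a plug-in standard error might look like the sketch below. Cubic polynomial least squares stands in for whatever learner would be used for $E[A \mid Z]$ and $E[Y \mid Z]$ ; that choice, the simulated partially linear design, and the names `poly_fit_predict` and `crossfit_psi` are assumptions of this example. The standard error uses the usual ratio-type influence function $(A - \pi)\{(Y - \mu) - \Psi (A - \pi)\} / E\{\mathrm{Var}(A \mid Z)\}$ .

```python
import numpy as np

def poly_fit_predict(z_train, t_train, z_test, degree=3):
    """Polynomial least-squares regression of t on z (placeholder learner)."""
    coef = np.polyfit(z_train, t_train, degree)
    return np.polyval(coef, z_test)

def crossfit_psi(Y, A, Z, n_folds=5, seed=0):
    """Cross-fitted estimate of E{Cov(A, Y | Z)} / E{Var(A | Z)} with its SE."""
    n = len(Y)
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % n_folds         # balanced random fold labels
    res_a = np.empty(n)
    res_y = np.empty(n)
    for k in range(n_folds):
        test = folds == k
        train = ~test
        # Nuisances are fit on the other folds and evaluated on the held-out fold.
        res_a[test] = A[test] - poly_fit_predict(Z[train], A[train], Z[test])
        res_y[test] = Y[test] - poly_fit_predict(Z[train], Y[train], Z[test])
    denom = np.mean(res_a**2)
    psi = np.mean(res_a * res_y) / denom
    infl = res_a * (res_y - psi * res_a) / denom  # estimated influence function
    return psi, infl.std(ddof=1) / np.sqrt(n)

# Toy partially linear design: Y = 1.5 * A + Z**2 + noise, so Psi = 1.5.
rng = np.random.default_rng(1)
n = 5_000
Z = rng.uniform(-2.0, 2.0, size=n)
A = np.sin(Z) + rng.normal(size=n)
Y = 1.5 * A + Z**2 + rng.normal(size=n)

psi_hat, se_hat = crossfit_psi(Y, A, Z)
```

In a partially linear model such as this one, $\Psi$ coincides with the coefficient on $A$ , which gives a simple sanity check for the sketch.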

4. WADEs in Nonparametric Quantile IV and Ill-posed Inverse Problems

WADEs characterize key functionals in nonparametric quantile IV regression (NPQIV), a non-separable, nonlinear, and possibly ill-posed inverse problem. For outcome $Y$ , endogenous regressor $W$ , instrument $X$ , and quantile level $\tau \in (0, 1)$ , the structural function $h_0$ satisfies

$E[1\{ Y \leq h_0(W) \} - \tau \mid X] = 0 \quad \text{a.s.},$

and the WADE of interest is

$\theta_0 = E[\mu(W) h_0'(W)],$

with $\mu$ here a user-supplied weight function (not the conditional mean of Section 1) and $h_0'$ the weak derivative of $h_0$ . The semiparametric efficiency bound for an NPQIV WADE depends on an unknown conditional derivative operator and hence on an unknown degree of ill-posedness, so the information bound is singular or non-singular according to whether $\ell \in \mathrm{Range}(\mathbf{T}^*)$ . Penalized sieve generalized empirical likelihood (GEL) estimators, built from the unconditional WADE moment restriction together with a growing number of unconditional moments implied by the conditional restriction, achieve consistency and, under regularity conditions, asymptotic normality (Chen et al., 2019).
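To make the step from the conditional restriction to unconditional moments concrete, the sketch below evaluates sample analogues of $E[(1\{Y \leq h(W)\} - \tau) \, q_k(X)]$ for a small polynomial instrument basis $q_k$ . It illustrates the moment construction only and is not the penalized sieve GEL estimator. The function name `npqiv_moments`, the power basis, and the simulated design are assumptions of this example.

```python
import numpy as np

def npqiv_moments(Y, W, X, h, tau, n_instr=4):
    """Sample unconditional moments implied by E[1{Y <= h(W)} - tau | X] = 0.

    Uses the instrument basis 1, X, ..., X**(n_instr - 1).
    """
    resid = (Y <= h(W)).astype(float) - tau
    basis = np.vander(X, N=n_instr, increasing=True)   # shape (n, n_instr)
    return basis.T @ resid / len(Y)

# Toy design with an endogenous regressor: U is correlated with W through V,
# but independent of the instrument X, and has median zero.
rng = np.random.default_rng(0)
n = 20_000
X = rng.uniform(-1.0, 1.0, size=n)
V = rng.normal(size=n)
W = X + 0.5 * V
U = 0.8 * V + 0.6 * rng.normal(size=n)
Y = 1.0 + 2.0 * W + U

tau = 0.5
h_true = lambda w: 1.0 + 2.0 * w           # median structural function
h_wrong = lambda w: 2.0 + 2.0 * w          # shifted candidate

m_true = npqiv_moments(Y, W, X, h_true, tau)
m_wrong = npqiv_moments(Y, W, X, h_wrong, tau)
```

At the true structural function the moments are close to zero up to sampling noise, while the shifted candidate violates the first moment clearly. An estimator would search over a sieve for $h$ that sets such moments near zero, with the number of basis terms growing with $n$ .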

5. Kernel-based Density-Weighted ADEs, Bandwidth, and Robust Inference

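Density-weighted average derivative (DWAD) estimators operationalize WADEs in kernel regression contexts by taking the weight to be the density of $X$ , $w(x) = f(x)$ :

$\theta = E[f(X) \, g'(X)] = \int f(x)^2 g'(x) \, dx = -2 E[Y \dot f(X)],$

where $\dot f$ is the gradient of $f$ and the last equality follows by integration by parts. The estimand is estimable using a symmetric kernel $K$ with derivative $\dot K$ , a bandwidth $h$ , and $d = \dim(X)$ :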

$\hat\theta = -2 \frac{1}{n(n-1)} \sum_{i\neq j} Y_i \frac{1}{h^{d+1}} \dot K\left(\frac{X_i - X_j}{h}\right).$

Classical asymptotically linear (AL) theory requires restrictive smoothness and bandwidth conditions ( $n h^{2P} \to 0$ , $n h^{d+2}\to\infty$ , with $P$ the order of the kernel); the resulting first-order distributional approximation does not depend on $h$ . The small-bandwidth (SB) framework relaxes the lower bound on $h$ , allowing it to shrink faster. It accounts for both the linear and quadratic terms of the estimator in the Gaussian limit, improves finite-sample performance, and uses a debiased variance estimator:

$\hat V_{SB} = n^{-1} \hat\Sigma - \binom{n}{2}^{-1} h^{-d-2} \hat\Delta.$

SB inference achieves lower coverage error and improved robustness to $h$ relative to AL (Cattaneo et al., 2022). Edgeworth expansions explicitly characterize bias, skewness, and coverage accuracy for both AL and SB regimes.
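The DWAD point estimator above can be written in a few lines of NumPy. The sketch assumes a Gaussian product kernel, for which $\dot K(u) = -u K(u)$ ; the kernel choice, the function name `dwad`, the bandwidth, and the simulated linear design are illustrative assumptions. It forms all pairwise differences, so memory grows as $n^2 d$ .

```python
import numpy as np

def dwad(Y, X, h):
    """Density-weighted average derivative estimate with a Gaussian kernel.

    Returns an array of length d (one entry per coordinate of X).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, d = X.shape
    U = (X[:, None, :] - X[None, :, :]) / h               # (n, n, d) scaled differences
    K = np.exp(-0.5 * np.sum(U**2, axis=2)) / (2.0 * np.pi) ** (d / 2)
    np.fill_diagonal(K, 0.0)                               # exclude i == j explicitly
    Kdot = -U * K[:, :, None]                              # gradient of the Gaussian kernel
    inner = Kdot.sum(axis=1) / h ** (d + 1)                # sum over j, shape (n, d)
    return -2.0 * (Y[:, None] * inner).sum(axis=0) / (n * (n - 1))

# Toy check: X ~ N(0, 1) and g(x) = 2x, so theta = 2 * E[f(X)] = 1 / sqrt(pi).
rng = np.random.default_rng(0)
n = 1_000
X = rng.normal(size=n)
Y = 2.0 * X + rng.normal(size=n)

theta_hat = dwad(Y, X, h=0.2)
theta_true = 1.0 / np.sqrt(np.pi)
```

With a fixed bandwidth the estimate carries some smoothing bias, so it should be close to, but not exactly centered on, `theta_true`. Rerunning over a grid of `h` values is a simple way to see the bandwidth sensitivity that the AL and SB approximations treat differently.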

6. Causal and Policy Interpretations

WADEs admit causal interpretations as weighted averages of local incremental or “infinitesimal shift” effects, quantifying the average outcome response to marginal changes in exposure under the observed exposure distribution or a stochastic intervention law $Q$ . For the ADE ( $w=1$ , $Q=f$ ), $\theta = E[ \partial_a \mu(A, Z) ]$ is the average effect of an infinitesimal shift in exposure. Under general $Q$ and $w$ , and writing $e$ for the exposure, $x$ for the covariates, and $m(e, x)$ for the conditional mean (the counterparts of $a$ , $z$ , and $\mu(a, z)$ above), the estimand

$\theta(Q) = \int_x \int_e w(e, x) \frac{\partial m(e, x)}{\partial e} dQ(e \mid x) dP_X(x)$

measures the weighted average slope of $m(e, x)$ under intervention law $Q$ . The optimal $Q^*$ , identified via calculus of variations, minimizes the semiparametric variance bound among all such interventions, generalizing the overlap-weighting principle of Crump et al. for the ATE to continuous exposures. WADEs thus serve as key estimands for incremental dose-response effects, overlap weighting, and robust causal analyses (Hines et al., 2023, Hines et al., 2021).
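The infinitesimal-shift reading of the ADE can be checked numerically: shifting every exposure by a small $\delta$ and dividing the change in mean outcome by $\delta$ should approach $E[\partial_a \mu(A, Z)]$ . The sketch below does this in a toy model with a known conditional mean; the model and the names `mu`, `dmu_da`, and `shift_effect` are assumptions made for illustration, and the causal reading additionally relies on the usual identification conditions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
Z = rng.normal(size=n)
A = Z + rng.normal(size=n)

def mu(a, z):
    """Toy conditional mean E[Y | A = a, Z = z]."""
    return np.sin(a) + a * z

def dmu_da(a, z):
    """Its partial derivative in a."""
    return np.cos(a) + z

def shift_effect(delta):
    """Per-unit change in mean outcome when every exposure is shifted by delta."""
    return np.mean(mu(A + delta, Z) - mu(A, Z)) / delta

ade = np.mean(dmu_da(A, Z))
small = shift_effect(1e-3)     # approximates the ADE
large = shift_effect(1.0)      # a finite shift is generally a different quantity
```

The small-shift contrast agrees with the ADE up to a term of order $\delta$ , whereas the unit shift averages the curvature of $\mu$ over a wider range and need not match.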

7. Practical Implementation and Guidelines

Basis Selection: Use linear sieves (e.g., B-splines, wavelets, polynomials) to approximate the unknown function; the number of basis functions for instruments and exposures grows with the sample size (Chen et al., 2019).
Penalty Tuning: Choose the penalty by cross-validation, information criteria, or rule-of-thumb sequences (e.g., $\gamma_K \propto K^{-2m}$ ), guided by prior smoothness assumptions (Chen et al., 2019).
Bandwidth in DWAD: Prefer a small bandwidth to reduce smoothing bias in density-weighted estimation, while ensuring $n^2 h^d \to \infty$ (Cattaneo et al., 2022).
Variance Estimation: Use empirical influence-function plug-in estimators for WADE standard errors; in kernel settings, apply SB variance estimators for robust coverage (Cattaneo et al., 2022).
Cross-fitting: Adopt $K$ -fold sample splitting for nuisance function estimation in debiased machine-learning WADE estimators, which supports root- $n$ consistency and valid inference under weak regularity conditions (Hines et al., 2023, Hines et al., 2021).

WADEs unify average derivative methodologies in economics, causal inference, and semiparametric theory, facilitating efficient estimation, robust inference, and interpretable “local” effects in models with continuous exposures, arbitrary weighting, and ill-posed inverse structures.


References (4)

Hines et al. (2023). Optimally weighted average derivative effects.

Cattaneo et al. (2022). Higher-order Refinements of Small Bandwidth Asymptotics for Density-Weighted Average Derivative Estimators.

Hines et al. (2021). Parameterising the effect of a continuous exposure using average derivative effects.

Chen et al. (2019). Penalized Sieve GEL for Weighted Average Derivatives of Nonparametric Quantile IV Regressions.
