Semiparametric Efficiency Theory
- Semiparametric efficiency theory is a framework that characterizes the minimal asymptotic variance achievable by regular estimators in models with both finite-dimensional and infinite-dimensional components.
- It leverages the geometry of Hilbert spaces, using tangent-space projections to construct efficient scores and influence functions and to establish information bounds.
- Applications in partially linear additive models demonstrate practical efficiency gains through smooth backfitting and one-step correction, enhancing estimator performance under regularity conditions.
Semiparametric efficiency theory provides a rigorous framework for characterizing the minimal asymptotic variance achievable by any regular estimator in models with both finite-dimensional (parametric) and infinite-dimensional (nonparametric) components. Central to this theory is the geometric structure of the underlying Hilbert space of score functions, the construction and projection of tangent spaces, and the explicit derivation of efficient scores, influence functions, and information bounds. The analysis in the context of partially linear additive models, where additive structure is imposed on the nonparametric component, reveals both conceptual and practical efficiency gains from exploiting structural information in the nuisance part. This paradigm fundamentally shapes modern approaches to semiparametric inference and estimator construction.
1. Model Structure and Regularity
The canonical partially linear additive model for semiparametric efficiency theory is
$$Y = \theta_0^\top X + \sum_{j=1}^{d} f_{0j}\big(Z^{(j)}\big) + \varepsilon,$$
with $X \in \mathbb{R}^p$, $Z = (Z^{(1)}, \dots, Z^{(d)})$, $\theta_0 \in \mathbb{R}^p$ the parametric target, and $\varepsilon$ independent of $(X, Z)$. Regularity and identifiability are enforced by
- Centering: $E\big[f_{0j}(Z^{(j)})\big] = 0$ for all $j$,
- Smoothness: each $f_{0j}$ is twice continuously differentiable,
- Joint density of $Z$ bounded and bounded away from zero on its compact support,
- $E\|X\|^{2+\delta} < \infty$ for some $\delta > 0$,
- $\varepsilon$ absolutely continuous, symmetric about zero, with density $p_\varepsilon$ satisfying $E[\varepsilon^2] < \infty$ and $\int \big(p_\varepsilon'(e)/p_\varepsilon(e)\big)^2 p_\varepsilon(e)\,de < \infty$.
The Hilbert space of additive functions
$$\mathcal{H}_{\mathrm{add}} = \Big\{\, h(z) = \sum_{j=1}^{d} h_j\big(z^{(j)}\big) \;:\; E\big[h_j(Z^{(j)})\big] = 0,\; E\big[h_j(Z^{(j)})^2\big] < \infty \,\Big\}$$
(zero-centered, square integrable under the distribution of $Z$) formalizes the geometric tangent space for the nonparametric component.
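To make the setup concrete, the following sketch simulates one instance of the model; the particular component functions, dimensions, and error law are illustrative choices, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, d = 2000, 2, 3                  # sample size, dim(X), additive components
theta0 = np.array([1.0, -0.5])        # parametric target

# Z uniform on [0,1]^d; X depends on Z so that E[X|Z] is nontrivial.
Z = rng.uniform(size=(n, d))
X = np.column_stack([
    np.sin(2 * np.pi * Z[:, 0]) + 0.5 * rng.standard_normal(n),
    Z[:, 1] * Z[:, 2] + 0.5 * rng.standard_normal(n),
])

# Centered additive components: each f_j has mean zero under U[0,1].
f1 = lambda z: np.sin(2 * np.pi * z)
f2 = lambda z: z ** 2 - 1.0 / 3.0
f3 = lambda z: z - 0.5

# Symmetric, non-Gaussian error, independent of (X, Z).
eps = rng.laplace(scale=1.0, size=n)

Y = X @ theta0 + f1(Z[:, 0]) + f2(Z[:, 1]) + f3(Z[:, 2]) + eps
```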
2. Tangent Spaces and Efficient Score Construction
Write $\varepsilon = Y - \theta^\top X - f(Z)$ and $\ell_\varepsilon = -p_\varepsilon'/p_\varepsilon$. Let $\dot\ell(a, h)$ denote the score for the full model under differentiable submodels $(\theta + ta,\, f + th)$ in both the parametric and nonparametric directions,
$$\dot\ell(a, h) = \ell_\varepsilon(\varepsilon)\,\big(a^\top X + h(Z)\big),$$
where $h$ is a tangent direction in $\mathcal{H}_{\mathrm{add}}$. The nuisance tangent space $\mathcal{T}_{\mathrm{nuis}}$ consists of all elements of the form $\ell_\varepsilon(\varepsilon)\,h(Z)$ with $h \in \mathcal{H}_{\mathrm{add}}$; by the symmetry of $p_\varepsilon$ and the independence of $\varepsilon$ from $(X, Z)$, scores generated by perturbing the error density are orthogonal to the directions constructed below.
Efficient score construction then proceeds by projecting the $\theta$-score $\ell_\varepsilon(\varepsilon)\,X$ orthogonally onto the complement of the nuisance tangent space with respect to the $L_2(P)$ inner product. The least-favorable direction is obtained by solving
$$h^* = \operatorname*{arg\,min}_{h \in \mathcal{H}_{\mathrm{add}}} E\big\| E[X \mid Z] - h(Z) \big\|^2,$$
the componentwise projection of $E[X \mid Z]$ onto $\mathcal{H}_{\mathrm{add}}$. Let $\tilde X = X - h^*(Z)$ and $I_\varepsilon = E\big[\ell_\varepsilon(\varepsilon)^2\big]$. The resulting efficient score is
$$\ell^*(Y, X, Z) = \ell_\varepsilon(\varepsilon)\,\tilde X,$$
and, at the truth $(\theta_0, f_0)$,
$$E[\ell^*] = 0, \qquad E\big[\ell^* \ell^{*\top}\big] = I_\varepsilon\, E\big[\tilde X \tilde X^\top\big].$$
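In practice $h^*$ can be approximated by additive regression of each coordinate of $X$ on $Z$. Below is a minimal sketch using ordinary Nadaraya–Watson backfitting (a simplification of the smooth backfitting algorithm used later in the text); the bandwidth, kernel, and toy data are illustrative assumptions.

```python
import numpy as np

def nw_smooth(z, t, h):
    """Nadaraya-Watson regression of t on the scalar covariate z with a
    Gaussian kernel, evaluated at the sample points themselves."""
    w = np.exp(-0.5 * ((z[:, None] - z[None, :]) / h) ** 2)
    return (w @ t) / w.sum(axis=1)

def backfit_additive(Z, t, h=0.1, n_iter=25):
    """Approximate the centered additive L2-projection of t onto functions
    sum_j h_j(Z_j) by cyclic backfitting; returns fitted values."""
    n, d = Z.shape
    comps = np.zeros((n, d))
    t_c = t - t.mean()                      # work with a centered target
    for _ in range(n_iter):
        for j in range(d):
            partial = t_c - comps.sum(axis=1) + comps[:, j]
            fit = nw_smooth(Z[:, j], partial, h)
            comps[:, j] = fit - fit.mean()  # re-center each component
    return comps.sum(axis=1)

# Toy usage: additive projection h* of each coordinate of X on Z.
rng = np.random.default_rng(1)
Z = rng.uniform(size=(500, 3))
X = np.column_stack([np.sin(2 * np.pi * Z[:, 0]),
                     Z[:, 1] * Z[:, 2]]) + 0.3 * rng.standard_normal((500, 2))
h_star_hat = np.column_stack([backfit_additive(Z, X[:, k]) for k in range(2)])
X_tilde = X - h_star_hat   # enters the efficient score as l_eps(eps) * X_tilde
```

Regressing the noisy $X$ itself on $Z$ suffices here because the additive $L_2$-projection of $X$ coincides with the additive projection of $E[X \mid Z]$.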
3. Semiparametric Fisher Information Bound and Influence Function
The semiparametric Fisher information matrix is given by
$$I^* = I_\varepsilon\, E\big[\tilde X \tilde X^\top\big],$$
where $I_\varepsilon = E\big[\ell_\varepsilon(\varepsilon)^2\big]$ is the Fisher information for location of $p_\varepsilon$. The semiparametric Cramér–Rao lower bound asserts that, for any regular estimator $\hat\theta$ with $\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{d} N(0, \Sigma)$, one has $\Sigma \geq (I^*)^{-1}$.
The efficient influence function attaining this bound is
$$\psi(Y, X, Z) = (I^*)^{-1}\, \ell_\varepsilon(\varepsilon)\, \tilde X,$$
with $\varepsilon = Y - \theta_0^\top X - f_0(Z)$ and $E\big[\psi \psi^\top\big] = (I^*)^{-1}$.
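As a concrete check (a standard calculation, not specific to the source), specialize to Gaussian errors $\varepsilon \sim N(0, \sigma^2)$: then $\ell_\varepsilon(e) = e/\sigma^2$ and
$$I_\varepsilon = \frac{E[\varepsilon^2]}{\sigma^4} = \frac{1}{\sigma^2}, \qquad (I^*)^{-1} = \sigma^2\, \Big(E\big[\tilde X \tilde X^\top\big]\Big)^{-1},$$
which is exactly the asymptotic variance attained by the Gaussian-profile estimator of Section 4; this is why that estimator is efficient precisely in the Gaussian case.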
4. Construction of Semiparametrically Efficient Estimators
Efficient estimation is achieved via a two-step ("profile plus one-step") procedure; a code sketch follows the steps below:
- Step A: Construct the Gaussian-profile estimator:
  - For candidate $\theta$, regress $Y - \theta^\top X$ on $Z$ additively using smooth backfitting, obtaining $\hat f_\theta$; form the profiled criterion $\hat S(\theta) = \sum_i \big(Y_i - \theta^\top X_i - \hat f_\theta(Z_i)\big)^2$.
  - Fit $\hat g$ and $\hat h$ by backfitting $Y$ and all components of $X$ on $Z$.
  - Define $\tilde Y_i = Y_i - \hat g(Z_i)$ and $\tilde X_i = X_i - \hat h(Z_i)$.
  - Obtain the estimator: by linearity of the backfitting operator, minimizing $\hat S$ yields the closed form
$$\hat\theta_{\mathrm{SAM}} = \Big(\sum_i \tilde X_i \tilde X_i^\top\Big)^{-1} \sum_i \tilde X_i \tilde Y_i.$$
Under standard conditions, $\sqrt{n}\,(\hat\theta_{\mathrm{SAM}} - \theta_0) \xrightarrow{d} N\big(0,\, \sigma^2 \Sigma^{-1}\big)$ with $\sigma^2 = \operatorname{Var}(\varepsilon)$ and $\Sigma = E\big[\tilde X \tilde X^\top\big]$, which coincides with the bound $(I^*)^{-1}$ only if $\varepsilon$ is Gaussian; hence this estimator is semiparametrically efficient only in the Gaussian case.
- Step B: Apply a one-step adaptation to correct for non-Gaussian $\varepsilon$:
  - Compute residuals $\hat\varepsilon_i = Y_i - \hat\theta_{\mathrm{SAM}}^\top X_i - \hat f_{\hat\theta_{\mathrm{SAM}}}(Z_i)$.
  - Estimate $\hat\ell_\varepsilon = -\hat p_\varepsilon'/\hat p_\varepsilon$ via kernel-density estimation on the symmetrized residuals $\pm\hat\varepsilon_i$ (exploiting symmetry).
  - Update, forming the estimated information:
$$\hat I^* = \hat I_\varepsilon \cdot \frac{1}{n}\sum_i \tilde X_i \tilde X_i^\top, \qquad \hat I_\varepsilon = \frac{1}{n}\sum_i \hat\ell_\varepsilon(\hat\varepsilon_i)^2.$$
  - Define the one-step estimator:
$$\hat\theta_{\mathrm{ASAM}} = \hat\theta_{\mathrm{SAM}} + (\hat I^*)^{-1}\, \frac{1}{n}\sum_i \hat\ell_\varepsilon(\hat\varepsilon_i)\, \tilde X_i.$$
Under standard regularity and estimation conditions, one attains
$$\sqrt{n}\,(\hat\theta_{\mathrm{ASAM}} - \theta_0) \xrightarrow{d} N\big(0,\, (I^*)^{-1}\big),$$
ensuring semiparametric efficiency.
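A compact end-to-end sketch of both steps follows, reusing the backfit_additive helper from the earlier sketch; the grid-based score estimate and all tuning constants are illustrative simplifications of the kernel construction described above, not the source's exact algorithm.

```python
import numpy as np
from scipy.stats import gaussian_kde

def sam_asam(Y, X, Z, backfit):
    """Step A (SAM): partial out additive fits of Y and X on Z, then OLS.
    Step B (ASAM): one-step update using a kernel estimate of -p'/p."""
    n = len(Y)
    # --- Step A: Gaussian-profile (SAM) estimator --------------------------
    Y_tilde = Y - backfit(Z, Y)
    X_tilde = X - np.column_stack([backfit(Z, X[:, k]) for k in range(X.shape[1])])
    theta_sam = np.linalg.solve(X_tilde.T @ X_tilde, X_tilde.T @ Y_tilde)

    # --- Step B: adaptive one-step (ASAM) correction -----------------------
    resid = Y_tilde - X_tilde @ theta_sam        # proxies for eps_i
    sym = np.concatenate([resid, -resid])        # impose symmetry of p_eps
    kde = gaussian_kde(sym)
    grid = np.linspace(sym.min(), sym.max(), 512)
    logp = np.log(np.maximum(kde(grid), 1e-12))
    score_grid = -np.gradient(logp, grid)        # l_eps = -p'/p on the grid
    l_eps = np.interp(resid, grid, score_grid)   # evaluated at the residuals

    I_eps = np.mean(l_eps ** 2)                  # estimated Fisher information
    I_hat = I_eps * (X_tilde.T @ X_tilde) / n    # estimated I*
    theta_asam = theta_sam + np.linalg.solve(I_hat, X_tilde.T @ l_eps / n)
    return theta_sam, theta_asam
```

Splitting the sample, estimating $\hat\ell_\varepsilon$ on one half and applying the update on the other (as noted in Section 7), would bring this sketch closer to the conditions under which the one-step theory is proved.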
5. The Impact of Nonparametric Structure and Efficiency Gains
The essential insight is that modeling the nonparametric component with additive structure (as opposed to a completely nonparametric $f(Z)$) yields a strictly smaller nuisance tangent space $\mathcal{T}_{\mathrm{nuis}}$, so the projection defining the efficient score removes less of the $\theta$-score. The conditional mean $E[X \mid Z]$ is replaced by its additive $L_2$-projection $h^*$, making estimation of the parametric component more efficient. Consequently, the information matrix $I^*$ is strictly larger (i.e., the variance bound is lower) than in the unrestricted partially linear model whenever $E[X \mid Z]$ is non-additive, as the additive assumption removes nuisance directions that would otherwise have to be projected out.
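A quick Monte Carlo (an illustrative construction, not from the source) exhibits the gap. With independent mean-zero coordinates $Z^{(1)}, Z^{(2)}$ and $E[X \mid Z] = Z^{(1)}Z^{(2)}$, the additive projection $h^*$ is identically zero, so the additive-model bound involves $\operatorname{Var}(X)$ while the unrestricted bound involves only the strictly smaller $\operatorname{Var}\big(X - Z^{(1)}Z^{(2)}\big)$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
Z = rng.uniform(-1, 1, size=(n, 2))     # independent, mean-zero coordinates
m = Z[:, 0] * Z[:, 1]                   # E[X|Z]: purely non-additive
X = m + rng.standard_normal(n)          # scalar X for simplicity

# The additive projection of Z1*Z2 is 0 here (both conditional means vanish),
# so h* = 0 under the additive model, while the unrestricted projection is
# E[X|Z] = Z1*Z2 itself.
E_add = np.var(X)                       # ~ Var(Z1*Z2) + 1  (additive model)
E_unr = np.var(X - m)                   # ~ 1               (unrestricted)
print(E_add, E_unr)                     # larger E[X_tilde^2] => larger I*,
                                        # hence a lower variance bound
```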
Simulation results demonstrate that the smooth-backfitted Gaussian-profile estimator ("SAM") outperforms classical profile-kernel estimators for the partially linear model, often by large mean-squared-error factors when the additive components are complex. The adaptive one-step ("ASAM") estimator achieves additional efficiency gains when the error distribution is non-Gaussian, empirically reducing MSE while respecting the theoretical lower bound.
6. Practical Implementation, Regularity, and Limitations
Implementation requires smooth additive regression (e.g., via smooth backfitting), kernel density-derivative estimation for $\hat\ell_\varepsilon = -\hat p_\varepsilon'/\hat p_\varepsilon$, and precise centering of the fitted components $\hat f_j$. All algorithms admit computationally tractable forms for moderate dimensions (alleviating the curse of dimensionality). Key regularity assumptions include:
- Twice differentiability of the components $f_{0j}$,
- Boundedness and positivity of the joint density of $Z$,
- Symmetry and sufficient smoothness of the error density $p_\varepsilon$,
- Independence of $\varepsilon$ from $(X, Z)$.
Performance may degrade for large $d$ due to the quality of additive approximations and the kernel estimation step. However, the overall framework is robust and generalizes to partially linear models with further structured nonparametric components.
7. Broader Context and Applications
This framework generalizes the Bickel–Klaassen–Ritov–Wellner approach for semiparametric models, emphasizing the construction of the tangent space for the precise nonparametric structure imposed. Efficient influence functions and estimation procedures, including smooth backfitting and sample splitting for the score estimate $\hat\ell_\varepsilon$, are central regardless of the statistical model, and have informed much subsequent work on double machine learning and structured semiparametric regression. When applied to real data (e.g., Boston housing), the proposed method not only fits well but also correctly flags cases where non-Gaussian residual structure is present, thereby providing more reliable inference on covariate effects.