Semiparametric Fisher Information in Models Parametrized by a Normed Space

Published 1 Apr 2026 in math.ST | (2604.00655v1)

Abstract: This paper studies semiparametric Fisher information in models parametrized by general normed spaces. The main contribution is to establish that positive semiparametric Fisher information is equivalent to the gradient of the parameter of interest lying in the range of the adjoint score operator. This result generalizes a key theorem of van der Vaart (1991) and provides a unified framework linking differentiability and information beyond Hilbert spaces. The paper develops normed-space mean-square-differentiable models for two canonical problems: estimation of the average of a known transformation and estimation of a density at a point. In these applications, it shows that positive information holds if and only if the transformation has finite variance and if and only if the density has positive mass at the evaluation point, respectively. These findings offer a novel information-theoretic perspective on known minimax results and clarify the conditions under which root-n estimation is possible.

Authors (1)

Telmo Pérez-Izquierdo

Summary

The paper proves that positive semiparametric Fisher information is equivalent to the gradient being in the adjoint score operator's range.
It employs a quotient space construction and functional analysis to address challenges beyond standard Hilbert space geometry.
Applications include heavy-tailed mean estimation and pointwise density estimation, clarifying conditions for root-n efficiency.

Introduction and Motivation

This paper addresses a fundamental question in semiparametric inference: when is semiparametric Fisher information positive, and how does this relate to the differentiability of functionals of interest, especially in models parametrized by general normed spaces instead of Hilbert spaces? The work generalizes the canonical result of van der Vaart (1991) to normed models, formalizing the precise condition for positive Fisher information—namely, that the gradient of the parameter lies in the range of the adjoint score operator. This equivalence yields an abstract yet powerful analytic tool for problems where standard Hilbert-space geometry fails, such as estimation under heavy tails or pointwise density estimation.

Setup and Problem Formulation

Consider models parametrized by elements $\lambda$ of a normed space $V$, leading to a dominated family $\mathcal{P} = \{P_\lambda: \lambda \in \Lambda\}$ with associated tangent space $\mathcal{T} \subset V$. Statistical functionals $\psi: \Lambda \to \mathbb{R}$ are assumed pathwise differentiable with gradient $\dot\psi \in \mathcal{T}^*$. Mean-square differentiability of the model is captured by a continuous linear score operator $A: \mathcal{T} \to L_2(P_0)$. The estimation target is $\beta_0 = \psi(\lambda_0)$. The central object is the semiparametric Fisher information:

$\mathcal{I} \equiv \inf_{\alpha \in \mathcal{T}} \frac{ \|A\alpha\|_2^2}{|\dot{\psi}\alpha|^2 }.$

The pivotal condition, generalizing Hilbert-space intuition, is that positive semiparametric Fisher information ($\mathcal{I} > 0$) holds if and only if $\dot\psi \in \mathcal{R}(A^*)$, where $A^*$ is the adjoint score operator and $\mathcal{R}(A^*)$ denotes its range.
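
In finite dimensions the definition and the range condition can be checked numerically. The following is a minimal sketch, assuming $\mathcal{T} = \mathbb{R}^k$, a score operator represented by a matrix `A`, a gradient represented by a vector `psi_dot`, and the Euclidean norm standing in for the $L_2(P_0)$ norm. The function name `fisher_information` is hypothetical and not taken from the paper.

```python
import numpy as np


def fisher_information(A, psi_dot, tol=1e-10):
    """Finite-dimensional analogue of inf_a ||A a||^2 / (psi_dot . a)^2.

    Assumption: Euclidean norms replace the L2(P0) norm of the paper.
    """
    A = np.atleast_2d(np.asarray(A, dtype=float))
    c = np.asarray(psi_dot, dtype=float)

    # A zero gradient makes the ratio undefined for every direction;
    # by convention the infimum over the empty set is +infinity.
    if not np.any(c):
        return np.inf

    # pinv(A) @ A is the orthogonal projector onto range(A^T),
    # the finite-dimensional range of the adjoint.
    row_space_projector = np.linalg.pinv(A) @ A
    residual = c - row_space_projector @ c

    # Gradient outside range(A^T): some direction in ker(A) moves the
    # functional at zero score cost, so the information is zero.
    if np.linalg.norm(residual) > tol * max(1.0, np.linalg.norm(c)):
        return 0.0

    # Gradient inside range(A^T): the infimum equals 1 / (c' G^+ c),
    # where G = A'A is the Gram (information) matrix.
    gram_pinv = np.linalg.pinv(A.T @ A)
    return 1.0 / float(c @ gram_pinv @ c)
```

For instance, `fisher_information([[1, 0], [0, 0]], [0, 1])` falls in the zero-information branch because the gradient has a component along a direction that the score operator annihilates.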

Main Results

The paper’s main theorem formalizes the above equivalence: positive semiparametric Fisher information is necessary and sufficient for the gradient of the target functional to be in the range of the adjoint of the score operator, irrespective of whether $V$ is a Hilbert or general normed space. This unifies the analytic structure of semiparametric efficiency bounds:

Main Theorem: $\mathcal{I} > 0$ if and only if $\dot\psi \in \mathcal{R}(A^*)$.

Unlike the Hilbert case, the proof does not rely on orthogonal projection or Parseval’s identity. It instead uses a quotient of the tangent space together with functional-analytic arguments about dual spaces and adjoint operators. Positive information is characterized through the range and kernel of the score operator and their counterparts in the dual spaces.
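
One direction of the equivalence can be checked directly, which may help fix ideas. As a sketch, suppose the gradient satisfies $\dot\psi = A^* g$ for some nonzero $g \in L_2(P_0)$, where $A^*$ denotes the adjoint of the score operator (the symbol $g$ is introduced here for illustration and is not notation from the paper). By the definition of the adjoint and the Cauchy-Schwarz inequality, for every $\alpha \in \mathcal{T}$,

$|\dot\psi\alpha| = |\langle g, A\alpha \rangle_{L_2(P_0)}| \le \|g\|_2 \, \|A\alpha\|_2,$

so that

$\mathcal{I} = \inf_{\alpha \in \mathcal{T}} \frac{\|A\alpha\|_2^2}{|\dot\psi\alpha|^2} \ge \frac{1}{\|g\|_2^2} > 0.$

The converse, that positive information forces the gradient into the range of the adjoint, is the part that requires the functional-analytic argument described above.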

Local identifiability, i.e., $\ker A \subseteq \ker \dot\psi$, is shown to be necessary for positive information and, when the range of the score operator is closed, also sufficient. The argument rests on duality relations between the operators and on the quotient space construction.
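
In finite dimensions every range is closed, so local identifiability alone decides whether information is positive. The sketch below reads local identifiability as the requirement that every direction with zero score leaves the functional unchanged to first order, and checks it with a singular value decomposition. It assumes a matrix `A` for the score operator and a vector `psi_dot` for the gradient; the function name `is_locally_identified` is hypothetical.

```python
import numpy as np


def is_locally_identified(A, psi_dot, tol=1e-10):
    """Check whether the gradient vanishes on the kernel of A.

    Assumption: A is a nonempty matrix standing in for the score
    operator, and psi_dot is a vector standing in for the gradient.
    """
    A = np.atleast_2d(np.asarray(A, dtype=float))
    c = np.asarray(psi_dot, dtype=float)

    # Full SVD: rows of Vt beyond the numerical rank span ker(A).
    _, singular_values, Vt = np.linalg.svd(A)
    rank = int(np.sum(singular_values > tol * max(1.0, singular_values[0])))
    null_basis = Vt[rank:]

    # The gradient must annihilate every basis vector of ker(A).
    # An empty kernel basis makes the condition hold trivially.
    threshold = tol * max(1.0, np.linalg.norm(c))
    return bool(np.all(np.abs(null_basis @ c) <= threshold))
```

When the check returns `False`, there is a direction with zero score along which the functional still moves, which is the finite-dimensional picture of zero information.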

Applications to Canonical Statistical Problems

Two distinct problems exemplify the analytic reach of the general theorem:

Estimating the Mean of a Known Transformation

Consider estimating the mean of a known transformation of the data. The model is parametrized by deviations from the reference distribution $P_0$ that lie in a normed space arising as a dual space, and the score operator $A$ is the inclusion into $L_2(P_0)$. The gradient $\dot\psi$ is the linear functional induced by the transformation.

Criterion: Information is positive if and only if the transformation has finite variance under $P_0$, that is, it belongs to $L_2(P_0)$.
When the transformation falls outside $L_2(P_0)$ (the heavy-tailed regime), $\mathcal{I} = 0$ and root-$n$ estimation is not possible.
This corresponds to minimax rates slower than $n^{-1/2}$, so the information criterion agrees with minimax lower bounds for heavy-tailed mean estimation.
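
The heavy-tailed regime can be illustrated numerically. The sketch below relies on the standard fact that, in a fully nonparametric model, the information for the mean of a transformation with finite variance is the reciprocal of that variance. It assumes a hypothetical power-law distribution on the integers $1, \dots, N$ with the identity as the transformation, and tracks the information as the truncation point $N$ grows. The function name `truncated_information` and the specific distribution are illustrative choices, not taken from the paper.

```python
import numpy as np


def truncated_information(tail_index, n_support):
    """Reciprocal variance of X under a truncated power law.

    Assumption: P(X = k) is proportional to k^-(tail_index + 1) on
    k = 1, ..., n_support. Without truncation the variance is finite
    only when tail_index > 2.
    """
    k = np.arange(1, n_support + 1, dtype=float)
    pmf = k ** (-(tail_index + 1.0))
    pmf /= pmf.sum()

    mean = np.sum(pmf * k)
    variance = np.sum(pmf * (k - mean) ** 2)

    # Nonparametric information for the mean: 1 / Var(X).
    return 1.0 / variance


# Tail index 1.5: finite mean, infinite variance in the limit.
heavy = [truncated_information(1.5, 10 ** j) for j in range(2, 7)]
# Tail index 3.0: finite variance, so the information stabilizes.
light = [truncated_information(3.0, 10 ** j) for j in range(2, 7)]
```

Under these assumptions the values in `heavy` shrink toward zero as the truncation is relaxed, while those in `light` settle at a positive limit, mirroring the finite-variance criterion.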

Estimating Density at a Point

For $V$ the space of continuous functions, suppose the parameter of interest is the value of the density at a fixed evaluation point. The local parametrization uses deviations supported locally around that point. The gradient is the point-evaluation functional, with the adjoint score operator mapping to regular Borel measures.

Criterion: Information is positive if and only if the underlying reference measure has positive mass at the evaluation point (as in the discrete case).
When the reference measure is the Lebesgue measure, information vanishes, reflecting the impossibility of root-$n$ estimation of a density at a fixed point under standard smoothness.
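
A small numerical experiment shows why information vanishes in the Lebesgue case. The sketch assumes a uniform reference distribution on $[0, 1]$, a score operator that acts as the identity on mean-zero directions, and a gradient equal to point evaluation at a hypothetical point `x0`. These are simplifying assumptions rather than the paper's exact parametrization. It evaluates the information ratio along triangular bumps that concentrate at `x0`.

```python
import numpy as np


def information_ratio(bandwidth, x0=0.5, grid_size=200_000):
    """Ratio ||alpha||^2 / alpha(x0)^2 for a centered triangular bump.

    Assumption: P0 is uniform on [0, 1], so integrals are approximated
    by averages over a midpoint grid.
    """
    x = (np.arange(grid_size) + 0.5) / grid_size

    # Triangular bump of height 1 at x0 with half-width `bandwidth`.
    bump = np.clip(1.0 - np.abs(x - x0) / bandwidth, 0.0, None)

    # Center the direction so that it integrates to zero under P0.
    bump_mean = bump.mean()
    centered = bump - bump_mean

    score_norm_sq = np.mean(centered ** 2)   # squared L2(P0) norm
    gradient_value = 1.0 - bump_mean         # centered bump at x0
    return score_norm_sq / gradient_value ** 2


ratios = [information_ratio(h) for h in (0.2, 0.1, 0.05, 0.01)]
```

The ratios decrease with the bandwidth: the squared norm of the direction scales with the width of the bump, while its value at `x0` stays close to one, so the infimum over directions is zero.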

Both results reframe classical minimax theory in terms of the geometry of score and adjoint operators, providing not only necessary but also sufficient analytic conditions for regular estimation.

Theoretical Implications

This work unifies differentiability, the analytic geometry of adjoint operators, and lower bounds for estimation in broad normed settings. It establishes the general functional-analytic condition under which root-$n$ estimation is possible in semiparametric or nonparametric models, subsuming and clarifying the scope of prior results for Hilbert space models. The duality between tangent space geometry and the nature of the statistical functional connects efficiency theory, minimax theory, and pathwise differentiability in a single analytic framework.

Moreover, the quotient space construction and the conditions involving closedness of the range of the score operator provide a template for handling cases where direct Hilbert methods break down (non-square-integrable settings, point evaluation functionals, etc.).

Practical Consequences and Future Directions

The criterion has direct implications for the design of estimators and the choice of modeling restrictions in high-dimensional, heavy-tailed, and nonparametric problems. In complex models, it indicates:

When information is strictly positive, semiparametric efficiency bounds and root-$n$ regular estimators exist.
When the condition fails, minimax rates must necessarily be slower, and alternative strategies (e.g., trimming, aggregation, robustification) may be required.

In practice, the criterion can be used diagnostically: functional gradients, spaces, and score operators can be directly checked for the range condition, clarifying in advance when efficiency theory applies or standard estimators will be rate-optimal.

A natural avenue of future exploration is to relax the domination, mean-square differentiability, and global pathwise differentiability assumptions, potentially treating infinite-dimensional or more intricate statistical manifolds.

Conclusion

By generalizing the core equivalence between differentiability and positive information to general normed space models, this paper establishes a precise algebraic-analytic condition for semiparametric efficiency. The structural characterization of positive Fisher information through the range of the adjoint score operator clarifies the limitations and possibilities for regular inference beyond classical Hilbert-space settings, operationalizing efficiency theory for a broad class of statistical models.
