
Quadratic Mean Differentiability & S-DQM

Updated 18 March 2026
  • Quadratic Mean Differentiability (QMD) is a smoothness condition that guarantees a first-order $L^2$ expansion of root densities, underpinning local asymptotic normality and the asymptotic behavior of the MLE.
  • The S-DQM extension relaxes classical regularity by summing quadratic errors over adaptive designs, making it applicable even when designs are non-i.i.d.
  • Both QMD and S-DQM provide a basis for rigorous asymptotic theory, facilitating accurate inference and confidence interval construction in complex experimental settings.

Quadratic mean differentiability (QMD), also known as differentiability in quadratic mean (DQM), is a central regularity condition in parametric statistical theory, underpinning the local asymptotic normality (LAN) of statistical experiments and the asymptotic normality of maximum likelihood estimators (MLE). The condition formalizes the notion that a family of probability densities depends sufficiently smoothly on a parameter, not only pointwise but in an $L^2$ sense. While classically developed for the independent and identically distributed (i.i.d.) setting, QMD is crucial in extending likelihood-based inference to more general scenarios, including adaptive and sequential experimental designs where classical regularity conditions typically fail.

1. Definition and Core Principle

Let $\{p_\theta(x): \theta \in \Theta\}$ be a parametric family of densities on a measurable space $(\mathcal X, \mu)$, and let $\theta_0$ be an interior point of $\Theta \subset \mathbb{R}^p$. The log-likelihood and its score function are given by

$$\ell(x; \theta) = \log p_\theta(x),\quad \dot{\ell}(x; \theta) = \frac{\partial}{\partial \theta} \log p_\theta(x).$$

The family is said to be differentiable in quadratic mean (DQM) at $\theta_0$ if a measurable score function $\dot{\ell}(x; \theta_0)$ exists such that, as $h \to 0$,

$$\int_\mathcal{X} \left\{ \sqrt{p_{\theta_0+h}(x)} - \sqrt{p_{\theta_0}(x)} - \frac{1}{2} h^{\top} \dot{\ell}(x; \theta_0)\sqrt{p_{\theta_0}(x)} \right\}^2 d\mu(x) = o(\|h\|^2).$$

This characterizes the local $L^2$ expansion of the model, with implications for the LAN property and asymptotic normality of estimators.
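As a numerical illustration of the definition, the following sketch evaluates the squared remainder for the Gaussian location family $N(\theta, 1)$ at $\theta_0 = 0$ and prints its ratio to $h^2$ for shrinking $h$. The choice of family, the integration grid, and the function name `qmd_remainder` are assumptions made for this example only; they are not part of the source.

```python
import numpy as np

def qmd_remainder(h, theta0=0.0, half_width=12.0, num_points=200_001):
    """Numerically integrate the squared QMD remainder for the N(theta, 1) family.

    Assumption: a plain Riemann sum on a wide, fine grid is accurate enough,
    since the integrand is smooth with Gaussian tails.
    """
    x = np.linspace(theta0 - half_width, theta0 + half_width, num_points)
    dx = x[1] - x[0]

    def sqrt_density(theta):
        # Square root of the N(theta, 1) density.
        return (2.0 * np.pi) ** -0.25 * np.exp(-((x - theta) ** 2) / 4.0)

    score = x - theta0  # d/dtheta log p_theta(x), evaluated at theta0
    remainder = (
        sqrt_density(theta0 + h)
        - sqrt_density(theta0)
        - 0.5 * h * score * sqrt_density(theta0)
    )
    return float(np.sum(remainder ** 2) * dx)

# QMD requires remainder / h^2 -> 0 as h -> 0.
for h in (0.4, 0.2, 0.1, 0.05):
    print(f"h={h:<5} remainder/h^2 = {qmd_remainder(h) / h**2:.3e}")
```

For this particular family the integral can also be computed in closed form, and its leading term is $3h^4/64$, so the printed ratio should shrink roughly like $h^2$.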

2. QMD in Classical and Adaptive Designs

In the i.i.d. setting, QMD alone suffices for the log-likelihood ratio to admit a local quadratic approximation (the LAN property); combined with standard additional conditions, such as consistency of the estimator, it yields asymptotic normality of the MLE with variance given by the inverse Fisher information. Classical results (e.g., van der Vaart 1998, Le Cam 1986) rely on QMD for sharp local approximations to the likelihood.

However, in adaptive experiments where covariates $X_i$ are selected sequentially based on past data $(X_j, Y_j)_{j<i}$,

$$X_{i} = h((X_j, Y_j)_{j < i}, U_{i-1}),$$

the pairs $(X_i, Y_i)$ are no longer i.i.d. (here $h$ denotes the design rule, not the local perturbation used in the DQM definition). The conditional law of $X_i$ changes with each observation, destroying the uniformity needed for standard DQM arguments. Existing literature has addressed this by imposing strong regularity: uniform existence and boundedness of second or third derivatives, and domination conditions allowing differentiation under the integral sign for all $(x, y)$. These are often too strong, especially for unbounded or evolving covariate spaces, or when only minimal smoothness in $\theta$ holds.

3. Summable Differentiability in Quadratic Mean (S-DQM)

To extend QMD methods to adaptive settings, Christensen, Stoltenberg, and Hjort introduce the summable differentiability in quadratic mean (S-DQM) condition (Christensen et al., 2023). For a regression sequence $(X_i, Y_i)$ in which $Y_i \mid X_i = x$ has density $f_\theta(y \mid x)$, with score $u_\theta(y \mid x) = \partial_\theta \log f_\theta(y \mid x)$, define

$$D_{\theta, h}(x) = \int \left\{ \sqrt{f_{\theta + h}(y \mid x)} - \sqrt{f_\theta(y \mid x)} - \frac{1}{2} h^{\top} u_\theta(y \mid x) \sqrt{f_\theta(y \mid x)} \right\}^2 d\mu(y).$$

The family is S-DQM at $\theta_0$ if and only if, for every fixed $h$,

$$\sum_{i=1}^n E_{\theta_0} \left[ D_{\theta_0, h / \sqrt{n}} (X_i) \right] = o(1), \quad n \to \infty.$$

This condition requires only that the sum of quadratic-mean errors across the possibly dependent design sequence is negligible, rather than strong uniform control on second or third derivatives. In the i.i.d. case, S-DQM reduces to the classical QMD condition.
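The i.i.d. direction of this reduction can be sketched in one line. The sketch assumes (beyond what is stated above) that the covariates $X_i$ are i.i.d. with a law that does not depend on $\theta$, and it writes $R(k) = E_{\theta_0}[D_{\theta_0, k}(X_1)]$, a symbol introduced here for illustration. Under that assumption $R(k)$ is exactly the classical QMD remainder of the joint density of $(X, Y)$, so classical QMD says $R(k) = o(\|k\|^2)$ as $k \to 0$.

```latex
\sum_{i=1}^n E_{\theta_0}\!\left[ D_{\theta_0,\, h/\sqrt{n}}(X_i) \right]
  = n \, R\!\left( h/\sqrt{n} \right)
  = n \cdot o\!\left( \frac{\|h\|^2}{n} \right)
  = o(1), \qquad n \to \infty .
```

The first equality uses only that the $X_i$ share a common law; in an adaptive design the summands differ, which is why the condition is stated for the sum rather than for each term.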

4. Asymptotic Theory under S-DQM: Local Expansion and MLE Behavior

Under S-DQM, the log-likelihood admits a quadratic local expansion in $h$:

  • Define the normalized score martingale $U_{n, j} = n^{-1/2}\sum_{i=1}^j u_{\theta_0}(Y_i \mid X_i)$ and its predictable quadratic variation $\langle U_{n, \cdot}, U_{n, \cdot}\rangle_j$.
  • Under S-DQM, a bounded fourth moment of $u_{\theta_0}$, and tightness of the quadratic variation,

$$A_n(h) = \ell_n(\theta_0 + h/\sqrt{n}) - \ell_n(\theta_0) = h^{\top}U_{n,n} - \frac{1}{2} h^{\top} \langle U_{n, \cdot}, U_{n, \cdot}\rangle_n h + o_p(1),$$

where $\ell_n$ denotes the log-likelihood of the first $n$ observations.

If $A_n(h)$ is almost surely concave in $h$ and the quadratic variation $\langle U_{n, \cdot}, U_{n, \cdot}\rangle_n$ converges in probability to a nonrandom invertible matrix $J$, then

$$\sqrt{n}(\widehat{\theta}_n - \theta_0) \overset{d}{\longrightarrow} \mathcal{N}(0, J^{-1}),$$

where convergence is in distribution (Christensen et al., 2023).
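The local expansion can be checked numerically on a toy example. The sketch below assumes a logistic regression model with parameter $\theta = (\alpha, \beta)$ and a simple up-and-down design rule; both are illustrative choices, and the names `expansion_check`, `log_likelihood`, and the step size `d` are hypothetical. It compares the exact log-likelihood difference $A_n(h)$ with the quadratic approximation built from the normalized score and its predictable quadratic variation, including the factor $1/2$ on the quadratic term.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

def log_likelihood(theta, x, y):
    # Logistic model: log f = y * eta - log(1 + exp(eta)), eta = alpha + beta * x.
    eta = theta[0] + theta[1] * x
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def expansion_check(theta0, h, n, d=0.5, seed=0):
    """Return (exact A_n(h), quadratic approximation) on one simulated adaptive sample."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    y = np.empty(n)
    x_cur = 0.0
    for i in range(n):
        x[i] = x_cur
        y[i] = float(rng.random() < expit(theta0[0] + theta0[1] * x_cur))
        # Assumed adaptive rule: step down after a response, up otherwise.
        x_cur = x_cur - d if y[i] == 1.0 else x_cur + d

    z = np.column_stack([np.ones(n), x])        # gradient of eta w.r.t. (alpha, beta)
    p = expit(z @ theta0)
    score = z.T @ (y - p) / np.sqrt(n)          # U_{n,n}
    quad_var = (z * (p * (1 - p))[:, None]).T @ z / n  # predictable quadratic variation

    exact = log_likelihood(theta0 + h / np.sqrt(n), x, y) - log_likelihood(theta0, x, y)
    approx = float(h @ score - 0.5 * h @ quad_var @ h)
    return exact, approx

theta0 = np.array([0.0, 1.0])
h = np.array([1.0, -0.5])
exact, approx = expansion_check(theta0, h, n=5000)
print(f"exact A_n(h) = {exact:.4f}, quadratic approximation = {approx:.4f}")
```

In the logistic model the log-likelihood is concave in $\theta$, so the concavity requirement on $A_n(h)$ holds automatically in this example.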

5. S-DQM in Canonical Adaptive Designs

Bruceton ("Up-and-Down") Design

With $Y_i \in \{0,1\}$ and $P(Y_i = 1 \mid X_i = x) = H(\alpha + \beta x)$ for logistic or probit $H$, the design sets

$$X_1 = x_1, \quad X_i = \begin{cases} X_{i-1} - d, & Y_{i-1}=1, \\ X_{i-1} + d, & Y_{i-1}=0. \end{cases}$$

The sequence $(X_i)$ forms a Markov chain. The S-DQM condition is satisfied via exponential decay of the information $J_\theta(x)$ and differentiability properties, yielding MLE asymptotic normality with covariance $J^{-1}$, where $J$ is the stationary average information.
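A minimal simulation sketch of the up-and-down rule, assuming a logistic $H$ and illustrative parameter values ($\alpha = 0$, $\beta = 1$, $d = 0.5$, $x_1 = 0$) that do not come from the source. The function names `simulate_bruceton` and `average_information` are hypothetical. The second function averages the per-observation logistic information along the simulated path, as an empirical stand-in for the stationary average information $J$.

```python
import numpy as np

def simulate_bruceton(alpha, beta, x1, d, n, seed=0):
    """Simulate n steps of the up-and-down design with a logistic response curve."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    y = np.empty(n, dtype=int)
    x[0] = x1
    for i in range(n):
        p = 1.0 / (1.0 + np.exp(-(alpha + beta * x[i])))
        y[i] = int(rng.random() < p)
        if i + 1 < n:
            # Step down after a response (Y = 1), step up after a non-response (Y = 0).
            x[i + 1] = x[i] - d if y[i] == 1 else x[i] + d
    return x, y

def average_information(alpha, beta, x):
    """Average of the per-observation logistic information H(1 - H) z z^T along the path."""
    p = 1.0 / (1.0 + np.exp(-(alpha + beta * x)))
    z = np.column_stack([np.ones_like(x), x])
    return (z * (p * (1 - p))[:, None]).T @ z / len(x)

x, y = simulate_bruceton(alpha=0.0, beta=1.0, x1=0.0, d=0.5, n=5000)
J_hat = average_information(0.0, 1.0, x)
print("mean design level:", x.mean())
print("approximate covariance J^{-1}:\n", np.linalg.inv(J_hat))
```

With these values the design hovers around the median level $-\alpha/\beta = 0$, which is the behavior the up-and-down rule is meant to produce.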

Robbins–Monro Design

Assuming $Y_i = M(X_i) + \epsilon_i$ with $E(\epsilon_i \mid X_i) = 0$, the adaptive sequence $X_{i} = X_{i-1} - a_{i-1}Y_{i-1}$ (with $\sum a_i^2<\infty$ and $\sum a_i = \infty$) converges almost surely to a point $\gamma$. S-DQM holds when $\sqrt{f_\theta(y \mid x)}$ is twice continuously differentiable around the accumulation point $\gamma$, so the MLE is again asymptotically normal.
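The recursion can be sketched as follows, assuming step sizes $a_i = c/i$ (which satisfy both summability conditions), Gaussian noise, and a linear regression function $M(x) = x - \gamma$ with $\gamma = 2$. These choices and the name `robbins_monro` are assumptions made for illustration only.

```python
import numpy as np

def robbins_monro(M, x1, n, c=1.0, noise_sd=1.0, seed=0):
    """Run X_i = X_{i-1} - a_{i-1} * Y_{i-1} with assumed step sizes a_i = c / i."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = x1
    for i in range(1, n):
        # Noisy observation of M at the previous design point.
        y_prev = M(x[i - 1]) + noise_sd * rng.standard_normal()
        a_prev = c / i  # sum of a_i diverges, sum of a_i^2 converges
        x[i] = x[i - 1] - a_prev * y_prev
    return x

gamma = 2.0  # hypothetical root of M
path = robbins_monro(lambda x: x - gamma, x1=0.0, n=20_000)
print("final design point:", path[-1])
```

In this linear case with $c = 1$, the deviation of the final design point from $\gamma$ equals minus the average of the noise terms, so the path settles near $\gamma$ at rate $n^{-1/2}$.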

Markovian Langlie Design

The design and models are specified for $Y_i \in \{0,1\}$, covariate state space $[0,1]$, and Markov updates with i.i.d. uniform randomness. S-DQM can be established due to compactness and smoothness, ensuring asymptotic normality of MLE via the same mechanism.

6. S-DQM versus Classical QMD: Technical Consequences

In the i.i.d. case, classical QMD enables LAN and asymptotic normality with minimal assumptions. In adaptive regression, conditions on derivatives that hold uniformly in $x$ are typically needed to control the error per iteration, which is not tenable for evolving or unbounded design sequences. S-DQM requires only quadratic-mean control over the entire design sequence, weakening these requirements to

$$\sum_{i=1}^n E_{\theta_0}\left[ D_{\theta_0, h/\sqrt{n}}(X_i) \right] = o(1),$$

and thus offers a clean pathway to

$$\sqrt{n}(\widehat{\theta}_n-\theta_0)\;\overset{d}{\to}\;\mathcal{N}(0, J^{-1}),$$

by relying on martingale central limit theory and concavity of $A_n(h)$ (Christensen et al., 2023).

7. Scope of Applications and Extensions

The S-DQM framework strictly weakens classical regularity requirements, as it does not demand the existence or boundedness of second or higher derivatives of $\log f_\theta(y \mid x)$. Only first-order differentiability in a Hellinger-distance sense is required. Coverage includes:

  • Adaptive designs with either discrete or continuous state spaces,
  • Both convergent and non-convergent design sequences,
  • Models admitting nonparametric or semiparametric extension,
  • Adaptive designs with multi-dimensional or hierarchical dependence,
  • Models with heavy-tailed or mixture responses admitting DQM only after appropriate weighting,
  • Sequential and bandit-type problems with dependencies arising from exploration–exploitation trade-offs.

A plausible implication is the extension of likelihood-based confidence interval theory (such as Fieller-type intervals) to broader classes of adaptive experiments, without recourse to truncations or specialized analytic bounds. This suggests potential for further development in nonparametric, semiparametric, and complex dependence frameworks driven by adaptive and sequential statistical methodology (Christensen et al., 2023).

References (1)
