
Epps–Pulley Statistic in Normality Testing

Updated 12 November 2025
  • The Epps–Pulley statistic is an L²-type goodness-of-fit test based on the empirical characteristic function, offering affine invariance and explicit asymptotic theory for normality assessment.
  • The test uses a tuning parameter β to balance sensitivity between heavy-tailed and smooth departures from normality, with optimal power at intermediate β values.
  • Its extensions to multivariate and functional data, along with favorable Bahadur efficiency, position it as a strong benchmark compared to traditional EDF-based tests.

The Epps–Pulley statistic, based on the empirical characteristic function (ECF), is a weighted $L^2$-type goodness-of-fit test for normality (and other parametric families) that enjoys affine invariance and explicit asymptotic theory. Originally conceived for univariate data, it has been extended to multivariate and even functional data settings, and its theoretical properties—most notably, Bahadur efficiency—have been analyzed in detail for a range of alternatives. This class of tests is widely regarded as a benchmark for goodness-of-fit to the normal distribution, especially in moderate to high dimensions, due to its amenability to extension and its favorable power properties under central and mixture-type alternatives.

1. Definition and Formulae

Let $X_1, \dots, X_n$ be i.i.d. observations in $\mathbb{R}^d$, and denote by

$$Y_{n,j} = \Sigma_n^{-1/2}(X_j - \bar X_n), \quad j = 1, \dots, n,$$

the affine-invariant standardized residuals, with sample mean $\bar X_n$ and sample covariance $\Sigma_n$. The empirical characteristic function is

$$\varphi_n(t) = \frac{1}{n} \sum_{j=1}^n e^{i t^\top Y_{n,j}}, \quad t \in \mathbb{R}^d.$$

Under normality, the theoretical characteristic function is $\varphi_0(t) = \exp(-\|t\|^2/2)$. The Epps–Pulley statistic with tuning parameter $\beta > 0$ (sometimes denoted $\gamma$ in the univariate case) is

$$T_{n,\beta} = n \int_{\mathbb{R}^d} \big| \varphi_n(t) - \varphi_0(t) \big|^2\, w_\beta(t)\, dt,$$

with the Gaussian weight

$$w_\beta(t) = \frac{1}{(2\pi)^{d/2} \det(\beta^2 I)^{1/2}} \exp\!\left(-\frac{1}{2}\, t^\top (\beta^2 I)^{-1} t\right),$$

i.e., the $\mathcal{N}(0, \beta^2 I_d)$ density, the scaling consistent with the closed form below.

In $d = 1$, a closed form is available:

$$T_{n,\beta} = \frac{1}{n} \sum_{j,k=1}^n e^{-\frac{\beta^2}{2} (Y_{n,j} - Y_{n,k})^2} - \frac{2}{\sqrt{1+\beta^2}} \sum_{j=1}^n e^{-\frac{\beta^2}{2(1+\beta^2)} Y_{n,j}^2} + \frac{n}{\sqrt{1+2\beta^2}}.$$

This statistic, and its multivariate extension—sometimes termed the BHEP test—is affine invariant and universally consistent for the normal family in any dimension.
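
As a concrete illustration, here is a minimal NumPy sketch of the univariate closed form above. The function name `epps_pulley_1d` and the choice of the maximum-likelihood ($1/n$) variance estimate for standardization are assumptions of this sketch, not prescriptions from the cited papers.

```python
import numpy as np

def epps_pulley_1d(x, beta=1.0):
    """Univariate Epps-Pulley (BHEP) statistic via the closed form of Section 1.

    Standardization uses the maximum-likelihood (1/n) variance estimate;
    this convention is an assumption of the sketch.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    y = (x - x.mean()) / x.std()               # standardized residuals Y_{n,j}
    diff = y[:, None] - y[None, :]             # pairwise differences Y_j - Y_k
    b2 = beta ** 2
    term1 = np.exp(-0.5 * b2 * diff ** 2).sum() / n
    term2 = 2.0 / np.sqrt(1.0 + b2) * np.exp(-0.5 * b2 * y ** 2 / (1.0 + b2)).sum()
    term3 = n / np.sqrt(1.0 + 2.0 * b2)
    return term1 - term2 + term3

rng = np.random.default_rng(0)
print(epps_pulley_1d(rng.normal(size=200), beta=1.0))       # typically small under normality
print(epps_pulley_1d(rng.exponential(size=200), beta=1.0))  # typically larger for a skewed alternative
```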

2. Tuning Parameter and Sensitivity

The tuning parameter $\beta$ in $w_\beta(t)$ regulates the test's sensitivity to various departures from normality:

  • Small $\beta$: Places weight on large $\|t\|$, raising sensitivity to heavy tails and fine, high-frequency deviations (e.g., Cauchy-like alternatives).
  • Large $\beta$: Concentrates on small $\|t\|$, enhancing power against broad, smooth deviations (e.g., small skewness/kurtosis).

Empirical and Bahadur-efficiency analyses consistently recommend intermediate values ($\beta \approx 0.5$–$1$) for robust general-purpose power (Ebner et al., 2021, Meintanis et al., 2022). For heavy-tailed alternatives, a smaller $\beta$ ($\approx 0.25$) is suitable; for mild deviations, moderate to large $\beta$ ($\approx 2$–$3$) may be justified.
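
To see the role of $\beta$ in practice, the short sketch below (reusing the `epps_pulley_1d` function from Section 1) evaluates the statistic on a normal and on a heavy-tailed sample across several $\beta$ values. Note that raw values for different $\beta$ are not directly comparable; each must be referred to its own null critical value (Section 3).

```python
import numpy as np
# Reuses epps_pulley_1d from the sketch in Section 1.

rng = np.random.default_rng(1)
x_norm = rng.normal(size=300)              # data from the null model
x_heavy = rng.standard_t(df=2, size=300)   # heavy-tailed alternative

for beta in (0.25, 0.5, 1.0, 2.0):
    print(f"beta={beta:4.2f}  "
          f"normal sample: {epps_pulley_1d(x_norm, beta):7.3f}  "
          f"t(2) sample: {epps_pulley_1d(x_heavy, beta):7.3f}")
```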

3. Asymptotic Null Distribution and Eigenvalue Problem

Under the null hypothesis, $T_{n,\beta}$ converges in distribution to a weighted sum of independent $\chi^2_1$ variables:

$$T_{n,\beta} \stackrel{d}{\longrightarrow} T_\beta = \sum_{j=1}^\infty \lambda_j(\beta) N_j^2,$$

where the $\lambda_j(\beta)$ are eigenvalues of the integral operator on $L^2(\mathbb{R}^d, w_\beta)$ with covariance kernel $K(s, t)$ determined by the Gaussian null and the parameter estimation regime (Ebner et al., 2021). In $d = 1$, explicit computation of the eigenvalues has been achieved (Ebner et al., 2021), and truncation at $M = 20$ is sufficient for practical approximation of quantiles:

$$T_\beta \approx \sum_{j=1}^M \lambda_j(\beta) N_j^2, \qquad N_j \sim \mathcal{N}(0,1).$$

For practical use, one can estimate critical values either via this truncated series or by Monte Carlo resampling under the normal model.
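
Both calibration routes mentioned above can be sketched as follows. The eigenvalue-based route takes the $\lambda_j(\beta)$ as an input vector (the actual values must be computed as in Ebner et al., 2021, and are not reproduced here); the simulation route only needs the statistic itself, here the `epps_pulley_1d` sketch from Section 1.

```python
import numpy as np

def quantile_from_eigenvalues(lambdas, alpha=0.05, n_sim=200_000, seed=0):
    """Approximate the (1 - alpha) quantile of the limit T_beta by simulating
    the truncated series sum_j lambda_j * N_j^2; `lambdas` must hold the
    leading eigenvalues lambda_j(beta), e.g. the first M = 20."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_sim, len(lambdas)))
    draws = z ** 2 @ np.asarray(lambdas, dtype=float)
    return np.quantile(draws, 1.0 - alpha)

def quantile_by_null_simulation(n, beta=1.0, alpha=0.05, n_rep=10_000, seed=0):
    """Approximate the finite-sample (1 - alpha) critical value of T_{n,beta}
    by repeatedly drawing normal samples of size n and recomputing the
    statistic (the Monte Carlo route mentioned in the text); affine
    invariance makes standard normal draws sufficient."""
    rng = np.random.default_rng(seed)
    stats = [epps_pulley_1d(rng.standard_normal(n), beta) for _ in range(n_rep)]
    return np.quantile(stats, 1.0 - alpha)
```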

4. Bahadur Slope and Local Efficiencies

For alternatives $G_\theta$ close to normality, large deviation principles and Bahadur theory provide the asymptotic (local) efficiency of the test, relative to the optimal (likelihood ratio) test. The Bahadur slope is

$$c_T(\theta) = \frac{b_T''(0)}{2 \lambda_1} \theta^2 + o(\theta^2),$$

where $\lambda_1$ is the largest eigenvalue of the integral operator and $b_T(\theta)$ is the test's limiting expectation under the alternative (Meintanis et al., 2022). The corresponding local approximate Bahadur efficiency (LABE) is

$$\mathrm{eff}(T_n) = \frac{b_T''(0)}{4 \lambda_1 K''(0)},$$

with $K(\theta)$ the local Kullback–Leibler information.

Table: LABE of the BHEP test (univariate, selected $\gamma$)

| Alternative | $\gamma = 0.5$ | $\gamma = 1$ | $\gamma = 2$ |
|---|---|---|---|
| Lehmann | 0.75 | 0.86 | 0.94 |
| 1st Ley–Paindaveine | 0.95 | 0.99 | 0.99 |
| Contamination | 0.50 | 0.60 | 0.68 |
| Energy | — | 0.57 | — |

A summary for six classical alternatives and a broad range of $\beta$ values, as in (Ebner et al., 2021), demonstrates that for $\beta \in [0.5, 1]$ the Epps–Pulley test either matches or outperforms EDF-based tests (Kolmogorov–Smirnov, Cramér–von Mises, Anderson–Darling) over the full alternative battery.

5. Multivariate and Functional Extensions

  • Multivariate BHEP: For $X_j \in \mathbb{R}^p$ and Gaussian weight $w(t) = \exp(-\|t\|^2/(2\beta))$, the multivariate BHEP statistic remains

$$T_n = n \int_{\mathbb{R}^p} \big| \varphi_n(t) - e^{-\|t\|^2/2} \big|^2\, w(t)\, dt,$$

with similar affine invariance and limiting null law as above (Meintanis et al., 2022); a closed-form evaluation analogous to the univariate case is sketched after this list.

  • Functional data: The approach generalizes in (Henze et al., 2019) to Hilbert-space (functional) data, replacing the characteristic function with the empirical characteristic functional and introducing a suitable probability measure $Q$ as kernel/weight. The resulting test remains consistent, with null distribution approximated via a parametric bootstrap.
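
The multivariate statistic admits a closed-form evaluation analogous to the univariate one. The $d$-dimensional expression used in the sketch below is the standard BHEP form for a Gaussian-density weight with the same $\beta$ scaling as in Section 1; it is assumed here rather than taken from the integral representation above, as is the use of the maximum-likelihood ($1/n$) covariance estimate.

```python
import numpy as np

def bhep_statistic(X, beta=1.0):
    """Multivariate BHEP statistic for an (n, d) data matrix X, using the
    d-dimensional analogue of the univariate closed form (an assumption of
    this sketch; the text above only gives the integral representation)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                           # ML covariance estimate (assumption)
    evals, evecs = np.linalg.eigh(S)            # S^{-1/2} via spectral decomposition
    S_inv_half = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Y = Xc @ S_inv_half                         # standardized residuals Y_{n,j}
    b2 = beta ** 2
    sq_dist = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)  # ||Y_j - Y_k||^2
    term1 = np.exp(-0.5 * b2 * sq_dist).sum() / n
    term2 = 2.0 * (1.0 + b2) ** (-d / 2) * np.exp(
        -0.5 * b2 * (Y ** 2).sum(axis=1) / (1.0 + b2)).sum()
    term3 = n * (1.0 + 2.0 * b2) ** (-d / 2)
    return term1 - term2 + term3

rng = np.random.default_rng(2)
print(bhep_statistic(rng.normal(size=(200, 3)), beta=1.0))
```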

6. Practical Implementation and Recommendations

  • Computational Complexity: The univariate closed-form version (pairwise sum) is $O(n^2)$. Sub-sampling or fast Gauss transform acceleration can be employed in large-$n$ regimes (Ebner et al., 2021).
  • Critical Value Determination: Either simulate the limiting Gaussian quadratic form, given the leading eigenvalues, or use a parametric bootstrap under the normal fit (Meintanis et al., 2022, Henze et al., 2019).
  • Parameter Tuning: $\beta \in [0.5, 1]$ is widely recommended for robust performance; practitioners facing heavy-tailed alternatives may lower $\beta$ to $0.25$ (Ebner et al., 2021, Meintanis et al., 2022).
  • Comparison with Other Tests: For $\beta = 0.5$ the Epps–Pulley test outperforms Kolmogorov–Smirnov on all close alternatives, and for $\beta = 1, 2, 3$ it is superior to the Cramér–von Mises, Watson, and Watson–Darling tests over the six-alternative set (Ebner et al., 2021, Meintanis et al., 2022).

7. Impact and Extensions

The Epps–Pulley/BHEP framework provides a principled, explicit, and flexible family of goodness-of-fit tests for normality and exponentiality, with generalization to high dimensions and to infinite-dimensional contexts. Its analytical tractability permits precise asymptotic power calculations, supporting informed choice of tuning parameters. Recent work emphasizes its optimality within the class of affine-invariant L2L^2-type tests and its consistently superior (or competitive) Bahadur efficiency over EDF-based competitors for a broad range of alternatives (Meintanis et al., 2022, Ebner et al., 2021).

A plausible implication is that, for moderate to large sample sizes and particularly for high-dimensional settings, the Epps–Pulley/BHEP test with judicious tuning provides a default methodology for normality assessment, combining interpretability, consistency, and power. Further research continues to explore its refinements, computational acceleration, and generalizations to other distributional families.
