Unbiased Estimation of Centered Moments
- Unbiased estimation of centered moments is a nonparametric method that uses U‑statistics to ensure the estimator’s expectation equals the true population moment.
- The methodology employs symmetric kernels that achieve minimum variance among unbiased estimators and behave predictably under location and scale transformations (location invariance, scale homogeneity).
- Applications in robust statistical inference and image analysis leverage these unbiased estimators to enhance outlier detection and shape analysis.
Unbiased estimation of centered moments is the task of formulating estimators whose expectation exactly matches the true (population) central moments. In nonparametric settings these estimators are typically derived using U‑statistics—symmetric functions of sample observations that achieve minimum variance among all unbiased estimators—and are widely useful in applications ranging from robust statistical inference to image analysis. The approach commonly involves the construction of symmetric kernels for the kᵗʰ central moment and the careful study of the corresponding kernel distributions, which exhibit useful properties such as location invariance and (near‑)unimodality.
1. Background and Definitions
Unbiased estimation of population moments has a long history dating back to Fisher’s work on k‑statistics. For a random variable X with distribution F and true mean μ, the rᵗʰ central moment is defined as μᵣ = E[(X − μ)ʳ]. An estimator Tₙ based on a sample X₁, …, Xₙ is unbiased if E[Tₙ] = μᵣ. In many modern applications the use of U‑statistics allows one to construct estimators for symmetric functionals (such as central moments) that achieve minimum variance among unbiased estimators. Here a symmetric kernel ψₖ, defined on k‑tuples from the sample, is used to form the U‑statistic
$$U_{\psi_k,n} = \binom{n}{k}^{-1} \sum_{1 \le i_1 < \cdots < i_k \le n} \psi_k(X_{i_1}, \ldots, X_{i_k}).$$
This framework is particularly advantageous in robust statistics and nonparametric inference where unbiasedness and efficiency are paramount.
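To make the construction concrete, here is a minimal Python sketch (an illustration, not code from the underlying research; the helper name `u_statistic` and the brute‑force enumeration are assumptions) that averages a symmetric kernel over all size‑k subsets and recovers the unbiased sample variance from the kernel ½(x₁ − x₂)²:

```python
import numpy as np
from itertools import combinations

def u_statistic(sample, kernel, k):
    """Average a symmetric kernel over all size-k subsets of the sample.
    Brute-force enumeration costs C(n, k) kernel evaluations, so this
    form is only practical for small n and k."""
    sample = np.asarray(sample, dtype=float)
    vals = [kernel(sample[list(idx)])
            for idx in combinations(range(len(sample)), k)]
    return float(np.mean(vals))

# With the kernel psi_2(x1, x2) = (x1 - x2)^2 / 2, the U-statistic
# reproduces the unbiased sample variance (ddof=1) exactly.
rng = np.random.default_rng(0)
x = rng.normal(size=30)
var_u = u_statistic(x, lambda t: 0.5 * (t[0] - t[1]) ** 2, k=2)
assert np.isclose(var_u, np.var(x, ddof=1))
```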
2. U‑Statistics Approach to Centered Moments
The general construction of an unbiased estimator for the kᵗʰ central moment follows from Heffernan’s approach. For example, the kernel for the kᵗʰ moment is defined as
$$\psi_k(x_1, \ldots, x_k) \;=\; \sum_{j=0}^{k-2} \frac{(-1)^j}{k-j} \sum x_{i_1}^{\,k-j}\, x_{i_2} \cdots x_{i_{j+1}} \;+\; (-1)^{k-1}(k-1)\, x_1 \cdots x_k,$$
where the inner sum runs over all ordered (j + 1)‑tuples of distinct indices from {1, …, k}. In the simplest case, k = 2, the kernel simplifies to
$$\psi_2(x_1, x_2) = \tfrac{1}{2}(x_1 - x_2)^2.$$
The resulting U‑statistic yields the unbiased sample variance; for k > 2, analogous symmetric kernels deliver unbiased estimators of the higher central moments. Because the kernels are symmetric functions of their arguments, the resulting estimators inherit desirable properties such as consistency and minimum variance among unbiased estimators, in the sense of U‑statistic theory.
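As a sanity check on the combinatorial formula, the sketch below (hypothetical helpers `psi_k` and `u_stat`, assuming NumPy) implements ψₖ by enumerating ordered tuples of distinct indices, confirms that it reduces to ½(x₁ − x₂)² for k = 2, and verifies that the k = 3 U‑statistic coincides with the classical unbiased estimator n Σ(Xᵢ − X̄)³/((n − 1)(n − 2)) of the third central moment:

```python
import numpy as np
from itertools import combinations, permutations

def psi_k(x):
    """Kernel from the formula above: for j = 0, ..., k-2, sum
    x_{i1}^(k-j) * x_{i2} * ... * x_{i_{j+1}} over ordered tuples of
    distinct indices, weight by (-1)^j / (k - j), then add the final
    product term (-1)^(k-1) * (k-1) * x_1 ... x_k."""
    x = np.asarray(x, dtype=float)
    k = len(x)
    total = 0.0
    for j in range(k - 1):
        s = sum(x[idx[0]] ** (k - j) * np.prod(x[list(idx[1:])])
                for idx in permutations(range(k), j + 1))
        total += (-1) ** j * s / (k - j)
    return total + (-1) ** (k - 1) * (k - 1) * np.prod(x)

def u_stat(sample, k):
    """U-statistic: mean of psi_k over all size-k subsets."""
    vals = [psi_k(sample[list(idx)])
            for idx in combinations(range(len(sample)), k)]
    return float(np.mean(vals))

rng = np.random.default_rng(1)
x = rng.standard_t(df=5, size=25)

# k = 2: the kernel collapses to half the squared difference,
# and the U-statistic equals the unbiased sample variance.
assert np.isclose(psi_k(x[:2]), 0.5 * (x[0] - x[1]) ** 2)
assert np.isclose(u_stat(x, 2), np.var(x, ddof=1))

# k = 3: the U-statistic matches the classical unbiased estimator of
# the third central moment, n * sum((x - mean)^3) / ((n-1)(n-2)).
n = len(x)
mu3_hat = n * np.sum((x - x.mean()) ** 3) / ((n - 1) * (n - 2))
assert np.isclose(u_stat(x, 3), mu3_hat)
```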
3. Structure and Properties of Central Moment Kernels
A key step in understanding these estimators is analyzing the structure of the kernel distributions. In particular, if X is drawn from a unimodal distribution, then the distribution of the pairwise difference X − X′, with X and X′ independent copies, is itself symmetric and unimodal with mode at zero. This guarantees that for k = 2 the kernel distribution of ψ₂ is unimodal. For k > 2 the kernel distributions are “nearly unimodal” in the sense that both the mode and the median of the distribution lie close to zero. Moreover, the kernel functions are invariant under location shifts and homogeneous of degree k under scaling:
$$\psi_k(\lambda x_1 + \mu, \ldots, \lambda x_k + \mu) = \lambda^k\, \psi_k(x_1, \ldots, x_k).$$
This equivariance implies that the U‑statistic estimators are fully nonparametric and, after appropriate standardization, both location‑ and scale‑invariant. In many cases the kᵗʰ central moment kernel can be expressed as a (possibly infinite) mixture of “component quasi‑distributions” parameterized by a spread Δ derived from the quantile function. Such representations clarify the extremal properties of the kernels and relate them to classical Bernoulli (two‑point) distributions.
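A quick numerical check of this property, here for an explicit k = 3 kernel (the helper name `psi3` is hypothetical; the check itself follows directly from the identity above):

```python
import numpy as np

def psi3(xs):
    # Explicit k = 3 instance of the central moment kernel above.
    a = sum(v ** 3 for v in xs) / 3.0
    b = sum(xs[i] ** 2 * xs[j] for i in range(3) for j in range(3) if i != j)
    return a - b / 2.0 + 2.0 * xs[0] * xs[1] * xs[2]

x = np.array([0.3, -1.2, 2.5])
lam, mu = 1.7, -4.0
# Shifting by mu has no effect; scaling by lam multiplies the kernel
# by lam**3, i.e. psi_k(lam*x + mu) == lam**k * psi_k(x).
assert np.isclose(psi3(lam * x + mu), lam ** 3 * psi3(x))
```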
4. Estimator Construction and Practical Implementation
The construction of the unbiased estimators proceeds by averaging the kernel function over all size‑k subsets of the sample. In practice, one computes
$$U_{\psi_k,n} = \binom{n}{k}^{-1} \sum_{1 \le i_1 < \cdots < i_k \le n} \psi_k(X_{i_1}, \ldots, X_{i_k}).$$
In addition, robust versions of these estimators are obtained via modifications such as weighted Hodges–Lehmann (WHL) methods. The WHL central moments are defined as
$$\mathrm{WHLkm}_{k,\varepsilon,\gamma,n} := \mathrm{LU}_{\psi_k,\,k,\,\varepsilon,\,\gamma,\,n},$$
where LU denotes a generalized L/U‑statistic evaluated on the ordered kernel values after applying a trimming level ε and a tuning parameter γ. Such modifications impart robustness against outliers and departures from parametric assumptions. Implementation may require explicit code (available, for example, via GitHub repositories associated with the research) and numerical routines for the combinatorial sums as k increases.
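The exact WHL definition is not reproduced here, but the sketch below conveys the general shape of such an L/U‑statistic under stated assumptions: evaluate the kernel on all k‑subsets, order the values, trim a fraction ε of them (split between the tails by an assumed convention governed by γ), and average the remainder. The function name, parameter roles, and trimming rule are illustrative, not the published construction:

```python
import numpy as np
from itertools import combinations

def trimmed_kernel_mean(sample, kernel, k, eps=0.1, gamma=1.0):
    """Illustrative L/U-style estimator: average the ordered kernel
    values after trimming a fraction eps, with gamma tilting the trim
    between the lower and upper tails (assumed convention)."""
    sample = np.asarray(sample, dtype=float)
    vals = np.sort([kernel(sample[list(idx)])
                    for idx in combinations(range(len(sample)), k)])
    m = len(vals)
    lo = int(eps * m * gamma / (1.0 + gamma))  # lower-tail trim count
    hi = int(eps * m / (1.0 + gamma))          # upper-tail trim count
    kept = vals[lo:m - hi]
    return float(np.mean(kept)) if kept.size else float("nan")

# Example: a robustified variance that discards the outlier-driven
# extreme values of the kernel psi_2 = (x_i - x_j)^2 / 2.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(size=40), [15.0]])  # one gross outlier
robust = trimmed_kernel_mean(x, lambda t: 0.5 * (t[0] - t[1]) ** 2, k=2)
print(robust, np.var(x, ddof=1))  # trimmed value is far less inflated
```

For large n one would subsample the $\binom{n}{k}$ subsets rather than enumerate and sort them all; the full enumeration here is only for clarity.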
A summary of key formulas is as follows:
- Kernel for the kᵗʰ central moment: $\psi_k(x_1, \ldots, x_k) = \sum_{j=0}^{k-2} \frac{(-1)^j}{k-j} \sum x_{i_1}^{\,k-j} x_{i_2} \cdots x_{i_{j+1}} + (-1)^{k-1}(k-1)\, x_1 \cdots x_k$.
- Unbiased estimator: $U_{\psi_k,n} = \binom{n}{k}^{-1} \sum_{1 \le i_1 < \cdots < i_k \le n} \psi_k(X_{i_1}, \ldots, X_{i_k})$.
- Location invariance and scale homogeneity: $\psi_k(\lambda x_1 + \mu, \ldots, \lambda x_k + \mu) = \lambda^k\, \psi_k(x_1, \ldots, x_k)$.
5. Practical Implications and Applications
Because these estimators are unbiased and robust, they underpin many applications in robust statistical inference. Unbiased estimation of variance (the second central moment) is essential in settings where conclusions must not depend on the origin or scale of measurement. Robust counterparts of higher moments such as skewness and kurtosis are particularly useful in applications requiring outlier detection and automated classification, and in imaging problems where accurate characterization of shape is needed. For example, the unbiased ellipticity estimator in image analysis exploits a ratio of linear combinations of image moments derived from jointly normally distributed pixel values. In all these applications the nonparametric, symmetric nature of U‑statistic based estimators leads to improved efficiency over conventional plug‑in methods, particularly when sample sizes are small or the underlying distribution is heavy‑tailed.
The following table summarizes representative estimators:
| Moment | Kernel definition | Unbiased estimator example |
|---|---|---|
| Variance ($\mu_2$) | $\psi_2(x_1, x_2) = \tfrac{1}{2}(x_1 - x_2)^2$ | $\binom{n}{2}^{-1} \sum_{i<j} \tfrac{1}{2}(X_i - X_j)^2$ |
| Third central moment ($\mu_3$) | $\psi_3$ from the combinatorial formula above | $U_{\psi_3,n} = \binom{n}{3}^{-1} \sum_{i<j<l} \psi_3(X_i, X_j, X_l)$; nearly unimodal kernel |
| Higher order ($\mu_k$) | General kernel $\psi_k$ | U‑statistic $U_{\psi_k,n}$ over all $k$‑subsets |
In weighted cases, discrete weights (e.g., based on measurement precision) enter through the weighted moments, and explicit correction factors are given to remove sample‑size bias (analogous to Bessel’s correction). However, simulation studies show that unbiased estimators, while correct on average, may have larger variance than their biased counterparts: a classic bias‑variance tradeoff that calls for an application‑specific compromise.
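A small simulation along these lines (a sketch assuming normally distributed data; not one of the studies referenced above) makes the tradeoff visible for the variance: the plug‑in estimator with divisor n is biased low, yet at n = 8 its mean squared error is smaller than that of the unbiased divisor‑(n − 1) version:

```python
import numpy as np

# Compare the plug-in variance (ddof=0) with the unbiased variance
# (ddof=1) on many small samples: bias versus mean squared error.
rng = np.random.default_rng(42)
true_var = 1.0
n, reps = 8, 200_000
samples = rng.normal(0.0, 1.0, size=(reps, n))
biased = samples.var(axis=1, ddof=0)
unbiased = samples.var(axis=1, ddof=1)
for name, est in [("plug-in (ddof=0)", biased), ("unbiased (ddof=1)", unbiased)]:
    bias = est.mean() - true_var
    mse = np.mean((est - true_var) ** 2)
    print(f"{name}: bias={bias:+.4f}, MSE={mse:.4f}")
```

For normal data the theoretical values are MSE ≈ (2n − 1)/n² ≈ 0.23 for the plug‑in form versus 2/(n − 1) ≈ 0.29 for the unbiased form at n = 8, so the simulation output reflects exactly the compromise described above.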
6. Extensions, Open Problems, and Conclusion
While unbiased estimation of central moments via U‑statistics is well understood for variance, the structure of unbiased estimators for higher-order moments (e.g. skewness, kurtosis) is more intricate. In particular, the construction of unbiased estimators using constant coefficients for products of higher central moments remains an open mathematical problem, though preliminary results indicate that an infinite family of alternative estimators exists (as exemplified by average‑adjusted unbiased variances). Moreover, extensions to semiparametric and multivariate settings—as well as applications to estimation problems in noncommutative C*‑algebras—demonstrate how the general principles of kernel invariance, combinatorial structure, and robust nonparametric estimation unite disparate areas of mathematical statistics.
In summary, the unbiased estimation of centered moments uses the theory of U‑statistics to provide estimators that are entirely nonparametric, invariant under location and scale transformations, and robust to outliers. Although unbiasedness may come at the cost of increased variance, these estimators are theoretically and practically significant, facilitating reliable inference in both classical and modern statistical applications.