
Weighted Target Estimation

Updated 4 October 2025
  • Weighted target estimation is a statistical method that reweights each observation's score based on its fit, ensuring robust and efficient parameter estimation.
  • It modifies the maximum likelihood equation by assigning adaptive weights derived from residuals comparing the empirical and model distributions.
  • Empirical studies demonstrate that this approach maintains full efficiency under ideal models while effectively mitigating the influence of outliers and model violations.

Weighted target estimation refers to a set of statistical methodologies in which individual data contributions are reweighted to produce more robust, efficient, or targeted parameter estimates. These reweighting techniques are often motivated by the need for robustness to outliers, handling contamination, or ensuring that the estimator remains fully efficient under the assumed parametric model. The work by Markatou, Basu, and Lindsay (Majumder et al., 2016) develops a new weighted likelihood approach aimed at achieving robustness without sacrificing efficiency—by constructing carefully designed weights for each observation’s contribution in the likelihood score equation.

1. Fundamentals of the Weighted Likelihood Approach

The central innovation is a modification of the maximum likelihood estimating equation such that each observation’s score contribution is multiplied by an adaptive weight. These weights depend on the compatibility of the observation with the fitted parametric model. If an observation fits well under the model, its score receives full weight (weight ≈ 1). If it appears incompatible (e.g., is an outlier), the corresponding weight is diminished. By attaching the weight to the score function of each observation—not to the likelihood itself—this approach avoids routinely downweighting observations and only penalizes those not aligning with the bulk of the data.

The general weighted score equation for parameter $\theta$ is:

$$\sum_{i=1}^n H(\tau_{n,\theta}(X_i))\, u_\theta(X_i) = 0,$$

where $u_\theta(X_i) = \nabla_\theta \log f_\theta(X_i)$ is the score function, $f_\theta(\cdot)$ is the model density, and $H(\cdot)$ is a weight function of a residual $\tau_{n,\theta}(X_i)$ that measures the discrepancy between the empirical and model distributions at $X_i$.

2. Construction of Residuals and Weight Functions

2.1 Residual Function

For univariate i.i.d. data, the residual $\tau_{n,\theta}(X_i)$ quantifies how much the empirical distribution $F_n$ deviates from the model distribution $F_\theta$ at $X_i$. Specifically:

$$\tau_{n,\theta}(X_i) = \begin{cases} F_n(X_i)/F_\theta(X_i) - 1 & \text{if } 0 < F_\theta(X_i) \leq p, \\ 0 & \text{if } p < F_\theta(X_i) < 1 - p, \\ S_n(X_i)/S_\theta(X_i) - 1 & \text{if } 1 - p \leq F_\theta(X_i) < 1, \end{cases}$$

with $S_n(x) = 1 - F_n(x)$ and $S_\theta(x) = 1 - F_\theta(x)$. The parameter $p \leq 0.5$ controls the fraction of observations, in each tail, that may be downweighted.
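The piecewise residual can be sketched directly in Python. This is an illustrative implementation, not the paper's code: the tail threshold `p = 0.25` and the conventions for evaluating the empirical cdf and survival function at a data point (using `<=` and `>=`, so a point counts toward its own tail) are assumptions.

```python
import numpy as np
from scipy import stats

def tail_residual(x, data, model, p=0.25):
    """Sketch of the tail residual tau_{n,theta}(x): the relative gap
    between the empirical cdf/survival function and the model's,
    set to zero in the central region p < F_theta(x) < 1 - p."""
    data = np.asarray(data)
    Ft, St = model.cdf(x), model.sf(x)
    if 0.0 < Ft <= p:                         # left tail
        return np.mean(data <= x) / Ft - 1.0
    if 0.0 < St <= p:                         # right tail: 1 - p <= F_theta(x)
        return np.mean(data >= x) / St - 1.0
    return 0.0                                # central region: residual is zero

# A point near the center of a well-fitting normal sample gets tau = 0,
# while a gross outlier gets a huge positive residual (hence a tiny weight).
rng = np.random.default_rng(0)
data = np.append(rng.normal(size=99), 8.0)   # 99 clean points, one outlier
tau_center = tail_residual(0.0, data, stats.norm())   # 0.0
tau_outlier = tail_residual(8.0, data, stats.norm())  # very large positive
```

Note that the right-tail branch uses the model survival function `model.sf(x)` directly rather than `1 - model.cdf(x)`, which avoids catastrophic loss of precision far in the tail.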

2.2 Weight Function Specification

The weight function $H(\tau)$ must satisfy:

  • $H(0) = 1$ (full weight for central, well-modeled points),
  • $H$ is at least twice differentiable at $0$,
  • $H'(0) = 0$ and $H''(0) < 0$,
  • $H$ decays to zero as $\tau \to \infty$ and is small at $\tau = -1$.

A general device is to start with a density $g_\gamma(\cdot)$, defined on $[a, \infty)$ with $g_\gamma(a) = 0$ and mode at $a + 1$ (after reparametrization), then set:

$$H_{\gamma_0}(\tau) = \frac{g_{\gamma_0}(\tau + a + 1)}{g_{\gamma_0}(a + 1)}.$$

Several concrete choices of $g_\gamma(\cdot)$ are used:

  • Gamma density $g_{1/(\alpha-1),\,\alpha}(x)$, with scale $\lambda = 1/(\alpha - 1)$ and shape $\alpha > 1$, so that the mode is at $1$.
  • Weibull, generalized extreme value (GEV), and $F$-related densities.

For the gamma weight,

$$H_\alpha(\tau) = \frac{g_{1/(\alpha-1),\,\alpha}(\tau + 1)}{g_{1/(\alpha-1),\,\alpha}(1)}.$$

As $\alpha \searrow 1$, $H(\tau) \to 1$ for all $\tau$, and weighted likelihood estimation reduces to maximum likelihood estimation.
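The gamma weight can be written down directly from the formula above. The sketch below uses scipy's gamma density with shape $\alpha$ and scale $1/(\alpha - 1)$, placing the mode at $1$ so that $H_\alpha(0) = 1$ and $H_\alpha'(0) = 0$; the default $\alpha = 1.5$ is an illustrative choice, not a value prescribed by the paper.

```python
import numpy as np
from scipy import stats

def gamma_weight(tau, alpha=1.5):
    """Gamma weight H_alpha(tau) = g(tau + 1) / g(1), where g is the gamma
    density with shape alpha and scale 1/(alpha - 1); its mode sits at
    (alpha - 1) * scale = 1, giving full weight at tau = 0."""
    g = stats.gamma(a=alpha, scale=1.0 / (alpha - 1.0))
    return g.pdf(np.asarray(tau) + 1.0) / g.pdf(1.0)

w0 = gamma_weight(0.0)                # 1.0: full weight at zero residual
w_big = gamma_weight(50.0)            # ~0: large positive residuals downweighted
w_neg = gamma_weight(-1.0)            # 0.0: weight vanishes at tau = -1
w_mle = gamma_weight(5.0, alpha=1.001)  # close to 1: alpha near 1 recovers the MLE
```

The last line illustrates the limiting behavior noted above: as $\alpha \searrow 1$ the gamma density flattens, the weight approaches $1$ for every residual, and the estimator collapses back to maximum likelihood.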

3. Weighted Score Equation and Estimator

The weighted likelihood estimator (WLE) is defined as the solution to:

$$\sum_{i=1}^n H(\tau_{n,\theta}(X_i))\, u_\theta(X_i) = 0.$$

Each data point thus has its score contribution adaptively weighted by $H(\tau_{n,\theta}(X_i))$, reflecting its degree of fit to the model.

4. Theoretical Properties

4.1 Fisher Consistency

If $F_\theta$ is the true data-generating model, then the estimator is Fisher consistent: $T(F_\theta) = \theta_0$.

4.2 Influence Function

At the true model, the WLE's influence function coincides with the MLE's:

$$\mathrm{IF}(y; T, F_\theta) = I^{-1}(\theta)\, u_\theta(y),$$

where $I(\theta)$ is the Fisher information. Higher-order influence-function analysis is required to reveal the estimator's robustness away from the model.

4.3 Asymptotic Efficiency and Consistency

Under standard regularity conditions, $\hat{\theta}_{\mathrm{WLE}}$ is consistent, asymptotically normal, and efficient at the model:

$$\sqrt{n}\,(\hat{\theta}_{\mathrm{WLE}} - \theta_0) \xrightarrow{d} N\!\left(0,\, I^{-1}(\theta_0)\right).$$

4.4 Equivariance

The estimator is equivariant for location-scale families: if $X^* = a + bX$, then the estimate transforms to $(a + b\hat{\mu},\, b\hat{\sigma})$.

5. Comparison to Prior Weighted Likelihood Methods

The classical weighted likelihood approach of Markatou et al. constructed weights from smoothed residuals based on kernel density estimates, which requires bandwidth selection and is unsuitable for models with bounded support. In contrast, the present method compares empirical and model cdfs, avoiding kernel-estimation subtleties and admitting a broader class of weight functions (gamma, Weibull, GEV, F-based, etc.). Moreover, by ensuring $H(0) = 1$ and $H'(0) = 0$, full asymptotic efficiency is retained.

6. Empirical Studies and Real Data Evidence

Simulation studies with contaminated normal and exponential models demonstrate that, under no contamination, the WLE achieves MSE virtually identical to the MLE. Under moderate contamination (e.g., with additional high-variance or shifted mean components), WLE’s MSE remains stable while the MLE’s MSE increases significantly, showing the robustness of the new approach.

Applications to real data include:

  • Poisson regression on Drosophila data: WLE matches the outlier-deleted MLE.
  • Normal model fit to Newcomb's speed-of-light data: WLE with small $\alpha$ closely tracks the cleaned MLE, downweighting outliers.
  • Exponential fit to Melbourne rainfall data: WLE downweights extreme right-tail observations.
  • Multivariate and regression cases: WLE delivers robust fits (e.g., robust concentration ellipses and regression lines) resistant to influential outliers.

7. Robustness, Efficiency, and Tuning

Comprehensive theoretical analysis confirms Fisher consistency, equivariance, and asymptotic normality. Robustness arises from the second-order influence function, while first-order local efficiency is preserved. Practical performance is controlled via the tuning parameters in the weight function (typical values include $\alpha = 1.01$, $k = 1.01$, $\xi = 10$, etc.), balancing robustness against efficiency.

With $H(0) = 1$ and suitable tuning, the WLE is fully efficient when the model is correct and remains robust under mild to moderate model violations. Compared with kernel-based weighted likelihood approaches, this method is both algorithmically simpler and theoretically more general, applying to a broad array of models, including location-scale, Poisson, exponential, regression, and multivariate settings.


In sum, the weighted target estimation method introduced in this work systematically attaches a data-driven, model-adaptive weight to each observation’s score contribution. This results in estimators with strong robustness properties against outliers and contamination, while retaining the efficiency and invariance expected of the maximum likelihood estimator under correct model specification. The design of the residual-based weight function, the clean theoretical guarantees, and the empirical performance across diverse settings position this approach as a foundational tool for robust inference in statistical modeling.
