
Weighted Moving Average (WMA) Overview

Updated 5 April 2026
  • Weighted Moving Average (WMA) is a linear filter that applies non-uniform, normalized weights across a time window to emphasize recent observations and reduce noise.
  • It encompasses various formulations—including linear, exponential, and adaptive schemes—designed to optimize performance based on data autocorrelation and error minimization.
  • Practical applications include financial forecasting, fault detection in industrial processes, and adaptive online learning, although some variants may incur high computational costs.

A Weighted Moving Average (WMA) is a family of linear filters widely used for smoothing time series, detecting structural changes, signal denoising, and adaptive prediction. WMAs generalize the classical moving average by allowing non-uniform, normalized weights over a window or history, enabling explicit control over the influence of recent versus older observations. The rigorous formulation of WMA encompasses a spectrum of protocols, from fixed linear or geometrically decaying weights to data-adaptive windows determined by optimization criteria such as mean-squared error, detectability under autocorrelation, or stability under nonstationarity.

1. Mathematical Formulation and Basic Variants

The generic WMA of a sequence $\{x_t\}$ at time $t$ is

$$\mathrm{WMA}_t = \sum_{i=1}^{n} w_i\, x_{t-n+i}, \quad \sum_{i=1}^{n} w_i = 1,$$

where $n$ is the window length and $w \in \mathbb{R}^n$ is a nonnegative weight vector.

Common instantiations include:

  • Linear/arithmetic WMA (used in Mov-Avg (Weichbroth et al., 2024)): $w_i = i / [n(n+1)/2]$, so newer observations carry greater weight; specifically,

$$\mathrm{WMA}_t = \frac{1\cdot x_{t-n+1} + \cdots + n\cdot x_t}{n(n+1)/2}$$

  • Exponentially Weighted Moving Average (EWMA): Recursively defined as $y_t = \lambda\,x_t + (1-\lambda)\,y_{t-1}$, yielding a WMA with weights $\lambda(1-\lambda)^{i-1}$ on $x_{t-i+1}$ for $i = 1, \dots, t$, plus a residual weight $(1-\lambda)^t$ on the initial value $y_0$ (Shaira et al., 14 Dec 2025).
  • Generally Weighted Moving Average (GWMA): Arbitrary normalized weights, often parameterized for flexibility:

$$w_i = q^{(i-1)^{\alpha}} - q^{i^{\alpha}}, \quad i = 1, 2, \dots,$$

with $0 \le q < 1$ and $\alpha > 0$ (Knoth et al., 2021).

Custom or adaptive WMA schemes allow for user-specified or data-driven weight vectors as long as normalization is enforced.
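
The generic and linear definitions above can be sketched in a few lines of Python. This is a minimal illustration, not code from any of the cited works; the function names `wma`, `linear_weights`, and `linear_wma_series` are hypothetical, and windows are assumed to be ordered oldest first so that the last element is $x_t$.

```python
from typing import Sequence


def wma(window: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of `window` (oldest first); weights are normalized to sum to 1."""
    if len(window) != len(weights):
        raise ValueError("window and weights must have the same length")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative with a positive sum")
    return sum(w * x for w, x in zip(weights, window)) / total


def linear_weights(n: int) -> list[float]:
    """Weights w_i = i / (n(n+1)/2) for i = 1..n; the newest point gets the largest weight."""
    denom = n * (n + 1) / 2
    return [i / denom for i in range(1, n + 1)]


def linear_wma_series(x: Sequence[float], n: int) -> list[float]:
    """Linear WMA at every time t that has a full window of n observations."""
    w = linear_weights(n)
    return [wma(x[t - n + 1 : t + 1], w) for t in range(n - 1, len(x))]


prices = [10.0, 11.0, 12.0, 13.0]  # made-up data for illustration
print(linear_weights(3))
print(linear_wma_series(prices, 3))
```

Normalizing inside `wma` means any nonnegative, user-specified weight vector can be passed in directly, which matches the requirement that custom schemes only need to enforce normalization.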

2. Optimal Weight Window Design: Quadratic Programming and Convex Geometry

A principled approach to WMA smoothing is to select the weights $w$ that minimize the total squared error between the smoothed series $\hat{x}_t$ and the original series $x_t$:

$$\sum_{t} \left(x_t - \hat{x}_t\right)^2 .$$

This yields a quadratic programming (QP) problem (Gokcesu et al., 2023):

$$\min_{w}\; w^\top A\, w - 2\, b^\top w$$

subject to $w \in \mathcal{C}$, where $\mathcal{C}$ incorporates constraints such as nonnegativity, normalization, symmetry ($w_i = w_{n+1-i}$), and tapering (weights non-increasing away from the window center). The matrix $A$ captures the sample autocorrelation structure; $b$ encodes cross-data correlations.

The central theoretical result is that this optimization is equivalent to a projection of the origin onto a convex polytope defined by the feasible set of admissible weight sequences:

$$\min_{p \in \mathcal{P}} \|p\|_2^2,$$

where $\mathcal{P}$ is the convex hull of the columns of an appropriately constructed matrix $M$ (Gokcesu et al., 2023).

Analytic solutions are available when the unconstrained QP optimum lies within the simplex; otherwise, iterative convex optimization algorithms (e.g., Wolfe’s minimum-norm-point, active-set methods) are required.
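
To make the projection view concrete, the sketch below finds the minimum-norm point in the convex hull of a set of vectors. It uses a plain Frank-Wolfe iteration with exact line search, which is a simpler stand-in for Wolfe's minimum-norm-point algorithm and not the method of Gokcesu et al.; the function name `min_norm_point` and the toy vertex sets are assumptions made for illustration.

```python
from typing import Sequence


def min_norm_point(columns: Sequence[Sequence[float]],
                   iters: int = 1000, tol: float = 1e-12):
    """Frank-Wolfe search for the minimum-norm point in the convex hull of `columns`.

    Returns (w, p): simplex weights w and the point p = sum_j w[j] * columns[j].
    """
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    k = len(columns)
    w = [1.0] + [0.0] * (k - 1)      # start at the first vertex
    p = list(columns[0])
    for _ in range(iters):
        # Vertex with the smallest inner product with p (steepest linearized descent).
        j = min(range(k), key=lambda c: dot(columns[c], p))
        gap = dot(p, p) - dot(columns[j], p)   # Frank-Wolfe duality gap
        if gap <= tol:
            break
        d = [cj - pj for cj, pj in zip(columns[j], p)]
        # Exact line search for ||p + g * d||^2 over g in [0, 1].
        g = min(1.0, max(0.0, -dot(p, d) / dot(d, d)))
        w = [(1.0 - g) * wi for wi in w]
        w[j] += g
        p = [pj + g * dj for pj, dj in zip(p, d)]
    return w, p


w, p = min_norm_point([(2.0, 0.0), (0.0, 1.0)])
print(w, p)
```

The returned `w` always lies on the probability simplex, so nonnegativity and normalization hold by construction. Frank-Wolfe can converge slowly on larger problems, which is one reason active-set methods are usually preferred in practice.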

3. Adaptive and Optimal WMA in Autocorrelated and Multivariate Settings

Classical equal-weight (MA) and exponential-weight (EWMA) smoothers are optimal only under i.i.d. or memoryless conditions. For weakly stationary processes with autocorrelation, optimality requires full exploitation of the covariance (or autocorrelation) structure.

The optimally weighted moving average (OWMA) selects $w^\star = \arg\min_{w} w^\top \Sigma\, w$, subject to $\mathbf{1}^\top w = 1$ for a given Toeplitz covariance matrix $\Sigma$ (Zhao et al., 2020, Zhao et al., 2020). The solution is unique, and, due to the properties of $\Sigma$, the optimal window is symmetric. In the absence of autocorrelation, the solution reduces to the uniform MA window. This adaptivity is critical in applications such as intermittent fault detection in multivariate processes, where Hotelling $T^2$ statistics are paired with OWMA for increased sensitivity and lower false-alarm rates (Zhao et al., 2020).

Key results:

  • Existence and uniqueness of optimal weights follow from strong convexity (positive-definite covariance matrix $\Sigma$) and normalization constraints.
  • The optimal window is symmetric: $w_i = w_{n+1-i}$.
  • Detectability of changepoints or faults depends on the relationship between window size, autocorrelation, and the structure of the fault.
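
A small numerical sketch can illustrate the uniqueness, symmetry, and uniform-limit properties. It assumes the objective is the quadratic form $w^\top \Sigma w$ with only the normalization constraint, for which the minimizer is $\Sigma^{-1}\mathbf{1} / (\mathbf{1}^\top \Sigma^{-1} \mathbf{1})$; the exact criterion in the cited papers may differ, nonnegativity is not enforced here, and the helper names and the AR(1)-style covariance are hypothetical choices.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def owma_weights(sigma):
    """Minimize w' Sigma w subject to sum(w) = 1 (assumed objective, see lead-in)."""
    z = solve(sigma, [1.0] * len(sigma))
    total = sum(z)
    return [zi / total for zi in z]


def ar1_covariance(n, rho):
    """Toeplitz covariance with entries rho^|i-j| (unit variance, AR(1)-style)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


print(owma_weights(ar1_covariance(4, 0.0)))  # no autocorrelation: uniform window
print(owma_weights(ar1_covariance(4, 0.5)))  # autocorrelated: symmetric, non-uniform
```

With the identity covariance the weights come out equal, and with a symmetric Toeplitz covariance the window mirrors about its center, consistent with the key results listed above.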

4. Algorithmic and Computational Considerations

Implementation details differ by WMA variant:

  • Linear WMA (e.g., Mov-Avg): Compute and normalize weights; convolve with the series. Complexity is $O(Nn)$ for a series of length $N$ and window size $n$ (Weichbroth et al., 2024).
  • Exponentially Weighted MA (EWMA): Recursively updatable in $O(1)$ time per step. No growing memory requirement since older data decay exponentially and are not stored (Shaira et al., 14 Dec 2025).
  • GWMA: No recursive update; each new WMA statistic requires recomputation across the entire data history and storage of all previous observations. Computational cost is $O(t)$ for the statistic at time $t$ (Knoth et al., 2021).
  • OWMA in the presence of autocorrelation: Requires pre-estimation of the autocovariance matrix and a convex program solution (closed form in simple univariate cases; nonlinear equations with fixed-point existence for multivariate, autocorrelated data).
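
The contrast between the recursive EWMA update and the full-history GWMA summation can be sketched as follows. The class `Ewma` and function `gwma` are hypothetical names, and the GWMA weights $q^{(i-1)^\alpha} - q^{i^\alpha}$ (with leftover weight $q^{t^\alpha}$ on a starting value `mu0`) are assumed to follow the commonly used two-parameter form; treat this as an illustration rather than the cited authors' implementation.

```python
from typing import Sequence


class Ewma:
    """EWMA y_t = lam * x_t + (1 - lam) * y_{t-1}; stores only the latest value."""

    def __init__(self, lam: float, y0: float = 0.0) -> None:
        if not 0.0 < lam <= 1.0:
            raise ValueError("lam must be in (0, 1]")
        self.lam = lam
        self.y = y0

    def update(self, x: float) -> float:
        # Constant time and constant memory per observation.
        self.y = self.lam * x + (1.0 - self.lam) * self.y
        return self.y


def gwma(history: Sequence[float], q: float, alpha: float, mu0: float = 0.0) -> float:
    """GWMA statistic under the assumed weights q**((i-1)**alpha) - q**(i**alpha).

    `history` is ordered oldest first; the whole history is re-summed on every call.
    """
    t = len(history)
    total = q ** (t ** alpha) * mu0           # leftover weight on the starting value
    for i in range(1, t + 1):
        weight = q ** ((i - 1) ** alpha) - q ** (i ** alpha)
        total += weight * history[t - i]      # i = 1 is the newest observation
    return total


data = [1.0, 2.0, 3.0]  # made-up data for illustration
ewma = Ewma(lam=0.5)
for x in data:
    latest = ewma.update(x)
print(latest, gwma(data, q=0.5, alpha=1.0))
```

With `alpha = 1` the assumed GWMA weights collapse to the EWMA weights with $\lambda = 1 - q$, so the two printed values agree; for other values of `alpha` no such recursion is available and the loop over the full history is needed.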

A summary of algorithmic properties appears below.

| Method | Update | Memory | Window type |
|---|---|---|---|
| Linear WMA | Sliding | $O(n)$ | Hard cutoff, linear |
| EWMA | Recursive | $O(1)$ | Infinite, geometric |
| GWMA | Summation | $O(t)$ | Flexible, non-recursive |
| OWMA | Optimized | Window plus covariance estimate | Data-adaptive |

5. Theoretical and Structural Properties

Several theoretical properties derive from the QP formulation and covariance optimization:

  • Symmetry: Optimal WMA weight windows are always symmetric due to the convexity of the feasible set and autocorrelation operator invariance (Gokcesu et al., 2023).
  • Tapering (Monotonicity): Imposing weight monotonicity ensures that influence decays away from the window center, leading to “well-behaved” kernels (Bartlett, Hanning shapes as special cases).
  • Equally Weighted MA as a Limiting Case: For uncorrelated data, OWMA windows reduce to uniform weight vectors (Zhao et al., 2020, Zhao et al., 2020).
  • Filter Design Implications: Projection-onto-polytope frameworks afford extension to additional constraints such as frequency shaping or sidelobe suppression, crucial in signal processing and control (Gokcesu et al., 2023).
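
As an illustration of the symmetry and tapering properties, the sketch below builds normalized Bartlett (triangular) and Hann windows from their standard textbook formulas and checks both properties. The helper names `is_symmetric` and `is_tapered` are hypothetical, and the windows are assumed to have at least three points.

```python
import math


def _normalize(raw):
    s = sum(raw)
    return [r / s for r in raw]


def bartlett(n: int) -> list[float]:
    """Triangular (Bartlett) weights with zero endpoints, normalized to sum to 1 (n >= 3)."""
    return _normalize([1.0 - abs(2.0 * i / (n - 1) - 1.0) for i in range(n)])


def hann(n: int) -> list[float]:
    """Hann (raised-cosine) weights, normalized to sum to 1 (n >= 3)."""
    return _normalize([0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)])


def is_symmetric(w, tol: float = 1e-12) -> bool:
    """True if the window mirrors about its center."""
    return all(abs(w[i] - w[-1 - i]) <= tol for i in range(len(w) // 2))


def is_tapered(w, tol: float = 1e-12) -> bool:
    """True if weights rise up to the window center and fall after it."""
    n = len(w)
    rising = all(w[i] <= w[i + 1] + tol for i in range(n - 1) if 2 * (i + 1) <= n - 1)
    falling = all(w[i] + tol >= w[i + 1] for i in range(n - 1) if 2 * i >= n - 1)
    return rising and falling


for name, win in (("bartlett", bartlett(7)), ("hann", hann(7))):
    print(name, is_symmetric(win), is_tapered(win))
```

A causal linear WMA window such as [1, 2, 3] / 6 fails the symmetry check, which highlights that these properties describe centered smoothing kernels rather than the trailing windows used for forecasting.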

Empirical studies confirm that data-adaptive or optimally designed WMA windows yield superior detection, isolation, and smoothing performance when underlying signals are temporally correlated, especially in fault detection or real-time diagnosis (Zhao et al., 2020, Zhao et al., 2020).

6. Practical Applications and Limitations

WMAs are employed across disciplines:

  • Time Series Forecasting and Trend Detection: Financial analytics (price crossovers), climate series smoothing, demand forecasting, often with linearly or custom-weighted WMAs (Weichbroth et al., 2024).
  • Industrial Process Control and Fault Detection: Applications include detection of intermittent faults or over-creeps in electric units. Here, OWMA produces more robust detection indices by aligning weights with correlation structure (Zhao et al., 2020).
  • Adaptive Online Learning and Concept Drift: EWMA- and WMA-based schemes serve in adaptive classifiers (see OLC-WA), blending batch and online statistics for real-time response to nonstationarity (Shaira et al., 14 Dec 2025).
  • Limitations: Basic WMA (with fixed or parameterized weights) lacks Markovian structure except in special cases (EWMA); general GWMA protocols have high storage/computation demands and are not amenable to analytic control limit derivation (Knoth et al., 2021). WMAs can overweight recent outliers, and in large windows, older data continue to influence the result.
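
The sensitivity to recent outliers can be seen in a small worked example. The data are made up for illustration: four steady readings followed by one spike, compared under an equal-weight average and a linear WMA over the same five-point window.

```python
def sma(window):
    """Equal-weight moving average of the window."""
    return sum(window) / len(window)


def linear_wma(window):
    """Linear WMA with weights 1..n (oldest to newest), divided by n(n+1)/2."""
    n = len(window)
    return sum(i * x for i, x in enumerate(window, start=1)) / (n * (n + 1) / 2)


window = [10.0, 10.0, 10.0, 10.0, 100.0]  # last point is an outlier
print(sma(window))         # 28.0
print(linear_wma(window))  # 40.0
```

The spike carries weight 5/15 in the linear WMA versus 1/5 in the equal-weight average, so the weighted estimate moves further from the steady level of 10.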

7. Comparative Analysis, Controversies, and Extensions

Research demonstrates that while GWMA offers maximal flexibility, it lacks efficient recursive updates, transparency, and closed-form operational characteristics; EWMA provides near-optimal smoothing with analytic tractability and efficient memory usage in most change-detection scenarios (Knoth et al., 2021). Adaptive covariance-informed WMA (OWMA) protocols outperform both MA and EWMA for autocorrelated and multivariate settings at modest overhead (Zhao et al., 2020).

Current and emergent research directions focus on:

  • Extending WMA frameworks to impose frequency-domain or robustness constraints via polytope geometry (Gokcesu et al., 2023).
  • Exploring non-linear or non-convex extensions (e.g., minimum/maximum operators) for specialized diagnostics (Zhao et al., 2020).
  • Integrating WMA smoothing as an adaptive module within larger machine learning algorithms for concept-drift-aware models in streaming environments (Shaira et al., 14 Dec 2025).

Critically, no single weight design universally minimizes both lag and spurious sensitivity; WMA design remains problem-dependent, and informed constraint selection or data-adaptive methods are recommended for optimal performance across scientific and engineering domains.
