Optimal Variance Filtering (OVF)
- Optimal Variance Filtering (OVF) is a framework that minimizes the conditional mean square error using Bayesian recursions, control-theoretic duality, and variational methods.
- It is applied in stochastic filtering, signal processing, and reinforcement learning to provide robust state estimation with stability guarantees.
- OVF integrates classical solutions like the Kalman filter with modern data-driven techniques, offering clear bias–variance tradeoffs and theoretical optimality guarantees.
Optimal Variance Filtering (OVF) refers to a spectrum of techniques and principles for constructing estimators or filters that optimize (minimize) the variance of their outputs—often in the sense of minimum mean square error (MMSE)—across various domains, including stochastic filtering, adaptive signal processing, control, and machine learning. OVF approaches range from classical Bayesian and control-theoretic formalisms, through optimization and variational methods, to modern applications in large-scale neural generative models and adaptive filtering.
1. Core Concepts and Mathematical Foundation
At its core, OVF applies to the problem of estimating a hidden signal process or state $X_t$ from noisy or incomplete observations $(Y_1, \dots, Y_t)$ subject to stochastic uncertainty. The defining property of OVF is that the estimator minimizes the variance (or mean square error) of the estimate, typically realized as the conditional expectation or its nonlinear extensions (Markovich, 2014).
In the setting of partially observed Markov processes, under only weak assumptions such as Markovianity and a known likelihood family, Dobrovidov's equation provides a universal recursion for the filtering density, with the MMSE estimator given by the conditional mean
$$\hat{X}_t = \mathbb{E}\big[X_t \mid Y_1, \dots, Y_t\big].$$
This Bayesian recursion is distribution-free except for requiring the observation likelihood to admit tractable integrals.
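A minimal sketch of such a Bayesian recursion, assuming a small discrete-state Markov chain observed through Gaussian noise (all values illustrative); the MMSE estimate is the posterior mean:

```python
import numpy as np

# Minimal Bayesian filtering recursion for a discrete-state Markov chain
# observed through Gaussian noise; the MMSE estimate is the posterior mean.
# States, transition matrix, and noise level are illustrative choices.
states = np.array([-1.0, 0.0, 1.0])          # hidden state values
P = np.array([[0.90, 0.05, 0.05],
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])           # Markov transition matrix
sigma = 0.5                                  # observation noise std

def likelihood(y):
    """p(y | x) for each discrete state value."""
    return np.exp(-0.5 * ((y - states) / sigma) ** 2)

def filter_step(prior, y):
    """One Bayes update: predict with P, correct with the likelihood."""
    predicted = prior @ P                    # p(x_t | y_{1:t-1})
    posterior = predicted * likelihood(y)    # unnormalized p(x_t | y_{1:t})
    posterior /= posterior.sum()
    return posterior

rng = np.random.default_rng(0)
belief = np.full(3, 1.0 / 3.0)               # uniform initial belief
for y in rng.normal(1.0, sigma, size=20):    # observations near state +1
    belief = filter_step(belief, y)
mmse_estimate = belief @ states              # conditional mean = MMSE estimate
print(belief, mmse_estimate)
```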
2. Classical Linear-Quadratic-Gaussian and Control Duality
Specializing to linear-Gaussian state-space models, OVF recovers the celebrated Kalman filter as a closed-form solution:
State and observation equations:
$$x_{t+1} = A x_t + B w_t, \qquad y_t = C x_t + D v_t,$$
with $w_t$, $v_t$ i.i.d. zero-mean, unit-variance Gaussians.
Kalman update:
$$\hat{x}_{t\mid t} = \hat{x}_{t\mid t-1} + K_t\big(y_t - C\hat{x}_{t\mid t-1}\big), \qquad K_t = P_{t\mid t-1} C^\top\big(C P_{t\mid t-1} C^\top + D D^\top\big)^{-1},$$
with $\hat{x}_{t+1\mid t} = A\hat{x}_{t\mid t}$ and the error covariance $P$ propagated by the associated Riccati recursion.
This recursion provides the minimum-variance unbiased estimator of $x_t$ given observations $y_1, \dots, y_t$, and is the unique solution to both a Bayesian filtering formulation and an optimal control dual via minimum-variance tracking (Markovich, 2014, Kim, 2022).
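A minimal Kalman filter sketch consistent with the model above; the specific matrices (constant-velocity dynamics, position-only observations) are illustrative assumptions:

```python
import numpy as np

# Minimal Kalman filter sketch for x_{t+1} = A x_t + B w_t, y_t = C x_t + D v_t
# with w_t, v_t ~ N(0, I); the matrices below are illustrative.
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # constant-velocity dynamics
B = np.eye(2) * 0.1                      # process-noise gain
C = np.array([[1.0, 0.0]])               # observe position only
D = np.array([[0.5]])                    # observation-noise gain
Q, R = B @ B.T, D @ D.T                  # effective noise covariances

def kalman_step(x_hat, P, y):
    """One predict/update cycle; returns the new MMSE estimate and covariance."""
    # Predict
    x_pred = A @ x_hat
    P_pred = A @ P @ A.T + Q
    # Update with the minimum-variance (Kalman) gain
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(2) - K @ C) @ P_pred
    return x_new, P_new

x_hat, P = np.zeros(2), np.eye(2)
for y in [np.array([0.9]), np.array([2.1]), np.array([2.8])]:
    x_hat, P = kalman_step(x_hat, P, y)
print(x_hat)
```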
In nonlinear and general HMMs with continuous-time observations, duality expresses filtering as a (backward) stochastic control problem: the estimate is constrained by a backward SDE, and the control is chosen to minimize a variance-type cost. At optimality, this dual control problem recovers the nonlinear (Kushner–Stratonovich) filter (Kim, 2022).
3. Optimization, Learning, and Variational Frameworks
Modern OVF extends to data-driven and variational paradigms. In high-dimensional and nonlinear systems, the optimal filter cannot be computed analytically, necessitating either gradient-based learning or variational inference.
Policy Optimization Formulation:
Given a linear system with unknown noise covariances,
$$x_{t+1} = A x_t + w_t, \qquad y_t = C x_t + v_t, \qquad w_t \sim \mathcal{N}(0, Q), \; v_t \sim \mathcal{N}(0, R),$$
with $Q$, $R$ unknown, define the optimal filter as the gain $L$ minimizing the mean squared prediction error
$$J(L) = \limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}\,\big\|y_t - C\hat{x}_t(L)\big\|^2,$$
where $\hat{x}_{t+1}(L) = A\hat{x}_t(L) + L\big(y_t - C\hat{x}_t(L)\big)$.
Stochastic policy optimization (SGD on the gain $L$) converges locally to the unique optimal gain, with error bias/variance scaling governed by the Polyak–Łojasiewicz property and trajectory length (Talebi et al., 2023).
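A toy sketch of this idea, assuming a scalar system and a noisy two-point gradient estimate with common random numbers in place of the analytic policy gradient used by Talebi et al.; all constants are illustrative:

```python
import numpy as np

# Toy sketch: tune a scalar filter gain L by stochastic gradient descent on the
# empirical one-step prediction error, without knowing the noise covariances.
A, C = 0.9, 1.0
q_std, r_std = 0.3, 0.5              # unknown to the learner; used only to simulate

def rollout_cost(L, seed, T=200):
    """Average squared innovation of the filter x_hat' = A x_hat + L*(y - C x_hat)."""
    rng = np.random.default_rng(seed)
    x, x_hat, cost = 0.0, 0.0, 0.0
    for _ in range(T):
        y = C * x + r_std * rng.normal()
        innov = y - C * x_hat
        cost += innov ** 2
        x_hat = A * x_hat + L * innov       # predictor form with gain L
        x = A * x + q_std * rng.normal()
    return cost / T

L, lr, eps = 0.0, 0.02, 1e-3
for it in range(300):
    # Two-point gradient estimate with common random numbers for variance reduction
    grad = (rollout_cost(L + eps, it) - rollout_cost(L - eps, it)) / (2 * eps)
    L -= lr * grad
print("learned gain:", L)
```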
Variational Inference for Filtering:
Instead of relying on analytic solutions, parameterize the analysis map (e.g., as an affine map, a neural network, or an EnKF variant) and optimize an evidence lower bound (ELBO) on the observation likelihood. Closed-form recovery of the Kalman gain is possible in the linear-Gaussian case; in general, this framework enables data-driven learning of nonlinear, state-dependent, or ensemble-based filters (Bach et al., 26 Jun 2024).
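As a concrete example of one such parameterization, the sketch below implements a stochastic ensemble Kalman (EnKF) analysis step; the observation operator, noise level, and ensemble size are assumptions, and no ELBO optimization is shown:

```python
import numpy as np

# Toy ensemble analysis map (stochastic EnKF update), one of the parameterizations
# mentioned above; the linear observation operator H and noise level are assumptions.
rng = np.random.default_rng(2)
H = np.array([[1.0, 0.0]])                 # observe first coordinate
r_std = 0.4

def enkf_analysis(ensemble, y):
    """Map a forecast ensemble (n_members x dim) to an analysis ensemble."""
    X = ensemble - ensemble.mean(axis=0)                 # state anomalies
    Y = X @ H.T                                          # observed-space anomalies
    n = ensemble.shape[0]
    S = Y.T @ Y / (n - 1) + np.array([[r_std ** 2]])     # innovation covariance
    K = (X.T @ Y / (n - 1)) @ np.linalg.inv(S)           # ensemble Kalman gain
    perturbed = y + r_std * rng.normal(size=(n, 1))      # perturbed observations
    return ensemble + (perturbed - ensemble @ H.T) @ K.T

forecast = rng.normal([1.0, 0.0], 0.5, size=(50, 2))
analysis = enkf_analysis(forecast, y=np.array([1.3]))
print(analysis.mean(axis=0))
```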
4. OVF in Signal and Image Processing: Space-Variant and Segmentation Filters
Space-variant OVF designs address non-stationarity in signals such as images or heteroscedastic time series:
- Space-variant variance equalization: For an image $I$ with local variance $\sigma^2(x,y)$ and target variance $\sigma_0^2$, the variance-reduction ratio $\sigma_0^2/\sigma^2(x,y)$ defines the local filtering strength. OVF recursively applies atomic kernels with analytically controlled variance reduction power, enabling high-accuracy variance normalization or edge-preserving smoothing with minimal kernel size (Zamyatin, 2019).
- Piecewise-constant variance segmentation: In time-series modeling, OVF techniques formulate variance change-point detection as an $\ell_1$ fused-lasso optimization over the per-sample (log-)variance, e.g.
$$\min_{u_{1:n}} \; \tfrac{1}{2}\sum_{t=1}^{n}\big(u_t + y_t^2 e^{-u_t}\big) \;+\; \lambda \sum_{t=1}^{n-1}\big|u_{t+1}-u_t\big|, \qquad \sigma_t^2 = e^{u_t}.$$
Efficient first-order solvers (proximal gradient/FISTA/ADMM) with low per-iteration cost recover jump loci and segment variances exactly under identifiability conditions (Wahlberg et al., 2011).
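A minimal sketch of the segmentation problem, assuming the log-variance reparameterization above and using a generic convex solver (CVXPY) rather than the specialized proximal/ADMM solvers of Wahlberg et al.; the data and penalty weight are synthetic:

```python
import numpy as np
import cvxpy as cp

# Sketch of piecewise-constant variance segmentation as a convex fused-lasso
# problem in the log-variance u_t = log sigma_t^2; data are synthetic with a
# single variance jump halfway through the series.
rng = np.random.default_rng(3)
y = np.concatenate([rng.normal(0, 0.5, 150), rng.normal(0, 2.0, 150)])

u = cp.Variable(len(y))                       # log-variance per sample
nll = 0.5 * cp.sum(u + cp.multiply(y ** 2, cp.exp(-u)))
tv = cp.sum(cp.abs(cp.diff(u)))               # fused-lasso / total-variation penalty
lam = 20.0                                    # tuning parameter (assumed)
cp.Problem(cp.Minimize(nll + lam * tv)).solve()

sigma2 = np.exp(u.value)                      # piecewise-constant variance estimate
print(sigma2[:5], sigma2[-5:])
```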
5. Optimal Variance Filtering in Generative Modeling: Reward Variance Maximization
In the context of large-scale RL-based generative model alignment, OVF is used as a trajectory-selection heuristic to maximize the learning signal:
- OVF Selection Principle: Given $N$ sampled trajectories $\{\tau_i\}_{i=1}^{N}$ with rewards $\{r_i\}$, OVF constructs the fixed-size subset $S^{*} \subseteq \{1,\dots,N\}$, $|S^{*}| = k$, with maximal reward variance, $S^{*} = \arg\max_{|S|=k} \mathrm{Var}\big(\{r_i : i \in S\}\big)$. This combats reward clustering and ensures that retained trajectories drive stronger normalized gradients in algorithms like Group Relative Policy Optimization (GRPO); see the sketch after this list.
- Dynamic Integration (Pro-GRPO): OVF is incorporated as an in-process, multi-stage pruning scheme, selecting high-variance latent trajectories at intermediate synthesis checkpoints based on proxy rewards. This enables expansion/pruning schedules that achieve state-of-the-art RLHF-style alignment with reduced computational costs (Ge et al., 17 Dec 2025).
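A toy sketch of the selection principle, assuming a fixed subset size $k$ and a small group so brute-force enumeration is feasible; the reward values are made up:

```python
import numpy as np
from itertools import combinations

# From N sampled trajectories, keep the size-k subset whose rewards have maximal
# variance (brute force, fine for the small group sizes typical of GRPO training).
rewards = np.array([0.71, 0.70, 0.69, 0.12, 0.95, 0.70, 0.33, 0.68])
k = 4

best_idx, best_var = None, -1.0
for idx in combinations(range(len(rewards)), k):
    v = rewards[list(idx)].var()
    if v > best_var:
        best_idx, best_var = idx, v

print("kept trajectories:", best_idx, "reward variance:", best_var)
# A clustered subset such as (0, 1, 2, 5) would give near-zero variance and hence
# a vanishing normalized advantage; the max-variance subset avoids this.
```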
| Domain | OVF Objective | Key Formulation / Mechanism |
|---|---|---|
| Bayesian filtering | Minimize conditional MSE | Kalman/Dobrovidov's update; optimal control |
| Variational/learning | Optimize prediction/ELBO error | Policy SGD; variational inference, ELBO |
| Image/signal proc. | Normalize/segment local variance | Atomic kernels, fused-lasso segmentation |
| RL/generation | Maximize reward signal variance | High-variance trajectory subset selection |
6. Comparison With Related Filtering Paradigms
OVF encompasses and generalizes several foundational filtering paradigms:
- Kalman filter: MMSE-optimal for linear-Gaussian state-space models; realized as a particular OVF instance (Markovich, 2014, Bach et al., 26 Jun 2024).
- Wiener filter: Frequency-domain minimum-variance estimator for stationary processes; in OVF terms, the Wiener solution is the frequency-domain optimal (minimum-variance) filter (Lenoir, 2013); see the sketch after this list.
- Maximally flat IIR filter design: OVF produces stable IIR filters with minimal white-noise gain, subject to flatness and group-delay constraints, achieving performance comparable to Kalman filters but with explicit phase/bandwidth control, especially advantageous in oversampled, embedded, or uncertain-model regimes (Kennedy, 2021).
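A minimal frequency-domain Wiener filter sketch, assuming the signal spectrum is known and the noise is white; all values are illustrative:

```python
import numpy as np

# Frequency-domain Wiener filter H(f) = S_x(f) / (S_x(f) + S_n(f)) applied to a
# noisy sinusoid; signal and noise spectra are assumed known for the example.
rng = np.random.default_rng(4)
n = 1024
t = np.arange(n)
x = np.sin(2 * np.pi * 0.01 * t)                 # clean signal
y = x + rng.normal(0, 0.8, n)                    # noisy observation

S_x = np.abs(np.fft.rfft(x)) ** 2 / n            # signal power spectrum (known here)
S_n = np.full_like(S_x, 0.8 ** 2)                # white-noise spectral level
H = S_x / (S_x + S_n)                            # minimum-variance frequency response

x_hat = np.fft.irfft(H * np.fft.rfft(y), n)      # filtered estimate
print("MSE before:", np.mean((y - x) ** 2), "after:", np.mean((x_hat - x) ** 2))
```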
7. Theoretical Optimality, Guarantees, and Stability
All OVF frameworks are defined by exact or approximate minimization of some variance criterion, often in MMSE sense. Theoretical guarantees include:
- Optimality: For linear-Gaussian models, OVF achieves the unique MMSE/variance-minimizing estimator; for general HMMs, OVF returns the Bayes-optimal conditional mean (Markovich, 2014).
- Control duality: Minimum-variance filtering is the dual of an optimal stochastic control problem, characterized by a backward SDE. Controllability of the dual yields observability and filter stability (Kim, 2022).
- Stability: Stability of the dual BSDE is necessary and sufficient for filter stability, with rate bounds expressible via conditional Poincaré inequalities (Kim, 2022).
- Bias–variance tradeoff: For learned or stochastic optimization-based OVF, filter error admits explicit bias–variance decompositions, with sample and horizon complexity dictated by system dynamics (Talebi et al., 2023).
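For a gain $\hat{L}_T$ learned from a length-$T$ trajectory, a generic decomposition of this kind (notation assumed, not the exact statement of Talebi et al.) reads
$$\mathbb{E}\big\|\hat{L}_T - L^{*}\big\|^{2} \;=\; \big\|\mathbb{E}\hat{L}_T - L^{*}\big\|^{2} \;+\; \mathbb{E}\big\|\hat{L}_T - \mathbb{E}\hat{L}_T\big\|^{2},$$
where the first (bias) term reflects finite optimization horizon and model mismatch, and the second (variance) term shrinks as more trajectory data are used.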
References
- Bayesian filtering and minimal-variance recursion: (Markovich, 2014)
- Duality and control-theoretic analysis: (Kim, 2022)
- Data-driven and variational learning for optimal filters: (Talebi et al., 2023, Bach et al., 26 Jun 2024)
- Space-variant and segmentation filters for nonstationary noise: (Zamyatin, 2019, Wahlberg et al., 2011)
- Least squares and frequency-domain matched-filtering: (Lenoir, 2013)
- Maximally flat IIR and white-noise gain minimization: (Kennedy, 2021)
- Reward-variance maximizing selection in generative RL: (Ge et al., 17 Dec 2025)