Kalman Filter: Principles & Extensions
- The Kalman filter is a recursive Bayesian estimator for linear dynamical systems under Gaussian noise.
- It operates through a prediction-update cycle that optimally balances process and measurement uncertainties.
- Its robust extensions, such as EKF and UKF, enable effective applications in navigation, finance, and signal processing.
The Kalman filter is a recursive Bayesian estimator for linear dynamical systems with Gaussian noise, yielding the minimum mean-square error estimate of the system state given noisy, partial observations. It is omnipresent in control, navigation, signal processing, time series analysis, and machine learning, with principled extensions to nonlinear, non-Gaussian, high-dimensional, and adversarial regimes.
1. Mathematical Foundations and Standard Algorithm
The core Kalman filter (KF) operates on a discrete-time linear-Gaussian state-space model:
- State equation: $x_k = F_k x_{k-1} + B_k u_k + w_k$, $w_k \sim \mathcal{N}(0, Q_k)$
- Measurement equation: $z_k = H_k x_k + v_k$, $v_k \sim \mathcal{N}(0, R_k)$
Key objects are:
- $x_k$: state vector
- $u_k$: deterministic control
- $z_k$: observed measurement
- $F_k$, $B_k$, $H_k$: system, control, and observation matrices
- $Q_k$, $R_k$: process and measurement noise covariances
At time $k$, denote:
- $\hat{x}_{k|k-1}$, $P_{k|k-1}$: predicted state and covariance, given data up to time $k-1$
- $\hat{x}_{k|k}$, $P_{k|k}$: posterior (“filtered”) estimates after incorporating $z_k$
The recursion consists of:
Prediction:

$$\hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1} + B_k u_k, \qquad P_{k|k-1} = F_k P_{k-1|k-1} F_k^\top + Q_k$$

Update:

$$K_k = P_{k|k-1} H_k^\top \left( H_k P_{k|k-1} H_k^\top + R_k \right)^{-1}$$

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \left( z_k - H_k \hat{x}_{k|k-1} \right), \qquad P_{k|k} = (I - K_k H_k)\, P_{k|k-1}$$
The Kalman gain $K_k$ optimally balances trust between prediction and observation, adapting at each step to the relative uncertainties encoded in $Q_k$ and $R_k$ (Pei et al., 2017).
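For concreteness, here is a minimal NumPy sketch of one predict–update cycle under the model above (function and variable names mirror the notation; the constant-velocity example at the bottom is an illustrative assumption, not from a cited source):

```python
import numpy as np

def kf_predict(x, P, F, Q, B=None, u=None):
    """Time update: propagate state mean and covariance."""
    x_pred = F @ x if B is None else F @ x + B @ u
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H, R):
    """Measurement update: fuse the prediction with observation z."""
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x = x_pred + K @ (z - H @ x_pred)     # correct with the innovation
    P = (np.eye(len(x)) - K @ H) @ P_pred
    return x, P

# Usage: 1D constant-velocity model, position-only measurements.
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q, R = 1e-3 * np.eye(2), np.array([[0.25]])
x, P = np.zeros(2), np.eye(2)
for z in (1.1, 1.9, 3.2, 3.9):
    x, P = kf_predict(x, P, F, Q)
    x, P = kf_update(x, P, np.array([z]), H, R)
```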
2. Theoretical Optimality and Information-Theoretic Perspective
The Kalman filter produces the unique unbiased linear estimator minimizing the posterior expected mean-square error (i.e., it is the best linear unbiased estimator, BLUE, for the linear-Gaussian case). This optimality extends to the information-theoretic domain: the measurement update explicitly maximizes the mutual information between the state and the acquired measurement, equivalently minimizing the posterior entropy (Luo et al., 2021).
In particular, at update step $k$, maximizing the mutual information $I(x_k; z_k \mid z_{1:k-1})$ over the gain $K_k$ yields the canonical Kalman gain, and thus the Kalman update is the unique linear step that achieves maximum information contraction in the posterior covariance.
This extends to Rényi mutual information: for linear-Gaussian systems, maximizing Shannon and Rényi information yields identical gain and update rules.
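This gain-optimality claim is easy to verify numerically. The sketch below (an illustrative check, not code from the cited papers) uses the Joseph form $P(K) = (I - KH)P(I - KH)^\top + KRK^\top$, a valid posterior covariance for any gain $K$, and confirms that random perturbations of the canonical gain never reduce the posterior entropy, i.e. never reduce $\log\det P(K)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)            # prior covariance (SPD)
H = rng.standard_normal((m, n))
B = rng.standard_normal((m, m))
R = B @ B.T + m * np.eye(m)            # measurement noise covariance (SPD)

K_star = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # canonical Kalman gain

def logdet_posterior(K):
    # Joseph form: a valid posterior covariance for *any* gain K
    J = np.eye(n) - K @ H
    return np.linalg.slogdet(J @ P @ J.T + K @ R @ K.T)[1]

base = logdet_posterior(K_star)
perturbed = min(logdet_posterior(K_star + 0.1 * rng.standard_normal((n, m)))
                for _ in range(1000))
assert perturbed >= base - 1e-9   # no perturbed gain beats the Kalman gain
```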
3. Bayesian and Variational Interpretations
The Kalman filter is a recursive Bayesian estimator: each step computes the (Gaussian) posterior of the state given all observations so far. In modern inference, the Kalman update emerges as the steady state of minimizing a quadratic (Gaussian) variational free energy (VFE). The VFE functional,

$$\mathcal{F}(\mu) = \tfrac{1}{2}\,(\mu - \hat{x}_{k|k-1})^\top P_{k|k-1}^{-1} (\mu - \hat{x}_{k|k-1}) + \tfrac{1}{2}\,(z_k - H_k \mu)^\top R_k^{-1} (z_k - H_k \mu) + \text{const.},$$

has its stationary point precisely at the Kalman filter solution $\mu^\ast = \hat{x}_{k|k}$, with the mean update corresponding to the minimum of $\mathcal{F}$ and the posterior precision $P_{k|k}^{-1}$ to its Hessian (Baltieri et al., 2021).
This provides a direct mechanistic link to predictive coding and active inference: in the linear case, performing gradient descent on VFE (or its neural analogues) reproduces precisely the Kalman recursion (Millidge et al., 2021, Baltieri et al., 2021).
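A small numerical illustration of this point (a sketch under the quadratic VFE above, with all problem data invented for the example): plain gradient descent on $\mathcal{F}(\mu)$ converges to the closed-form Kalman posterior mean.

```python
import numpy as np

rng = np.random.default_rng(1)
x_pred = rng.standard_normal(2)                  # prior (predicted) mean
P_pred = np.array([[2.0, 0.3], [0.3, 1.0]])      # prior covariance
H = rng.standard_normal((1, 2))
R = np.array([[0.5]])
z = rng.standard_normal(1)

# Closed-form Kalman update, for reference
K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
x_kf = x_pred + K @ (z - H @ x_pred)

# Gradient descent on the quadratic VFE F(mu)
P_inv, R_inv = np.linalg.inv(P_pred), np.linalg.inv(R)
mu = x_pred.copy()
for _ in range(5000):
    grad = P_inv @ (mu - x_pred) - H.T @ R_inv @ (z - H @ mu)
    mu -= 0.01 * grad
assert np.allclose(mu, x_kf, atol=1e-6)   # gradient flow reaches the KF mean
```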
4. Generalizations, Extensions, and Robust Variants
4.1 Extended and Nonlinear Kalman Filters
When the dynamical or measurement models are nonlinear, $x_k = f(x_{k-1}, u_k) + w_k$ and $z_k = h(x_k) + v_k$, the Extended Kalman Filter (EKF) linearizes about the current estimate via the Jacobians of $f$ and $h$:

$$\hat{x}_{k|k-1} = f(\hat{x}_{k-1|k-1}, u_k), \qquad F_k = \left.\frac{\partial f}{\partial x}\right|_{x = \hat{x}_{k-1|k-1},\, u = u_k}, \qquad H_k = \left.\frac{\partial h}{\partial x}\right|_{x = \hat{x}_{k|k-1}},$$

with the rest of the recursion as in the linear KF, $F_k$ and $H_k$ replaced by these Jacobians. Strong nonlinearity or non-Gaussian noise can yield bias and misestimation, which motivates further generalizations. The Unscented Kalman Filter (UKF) and the Non-Linear Kalman Filter (NLKF) propagate sigma points through the nonlinear mappings to obtain consistent moment estimates without explicit linearization (Ai et al., 2021, Levy et al., 7 Mar 2025).
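A compact EKF sketch follows (the pendulum model, its Jacobians, and all constants are illustrative assumptions):

```python
import numpy as np

def ekf_step(x, P, z, f, F_jac, h, H_jac, Q, R):
    """One EKF predict-update cycle for x_k = f(x) + w, z_k = h(x) + v."""
    # Predict: propagate through nonlinear f, linearize for the covariance
    x_pred = f(x)
    F = F_jac(x)
    P_pred = F @ P @ F.T + Q
    # Update: linearize h about the predicted state
    H = H_jac(x_pred)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Example: pendulum angle/rate state, measuring sin(angle)
dt, g = 0.01, 9.81
f = lambda x: np.array([x[0] + dt * x[1], x[1] - dt * g * np.sin(x[0])])
F_jac = lambda x: np.array([[1.0, dt], [-dt * g * np.cos(x[0]), 1.0]])
h = lambda x: np.array([np.sin(x[0])])
H_jac = lambda x: np.array([[np.cos(x[0]), 0.0]])
x, P = np.array([0.5, 0.0]), 0.1 * np.eye(2)
x, P = ekf_step(x, P, np.array([0.47]), f, F_jac, h, H_jac,
                1e-4 * np.eye(2), np.array([[0.01]]))
```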
4.2 Robust and Outlier-Resistant Filters
When data contain heavy-tailed noise or outliers, the MMSE criterion underlying the standard filter becomes a poor objective. The Maximum Correntropy Kalman Filter (MCKF) replaces the MSE loss with a robust correntropy-based criterion, down-weighting outliers via a Gaussian kernel and requiring a fixed-point iteration at each step (Chen et al., 2015).
The Outlier-Insensitive Kalman Filter (OIKF) models each potential observation outlier as a Gaussian with unknown variance (NUV). Variances are estimated online via EM or alternating maximization, adapting the effective measurement noise to the observed residuals. When no outliers are present, the OIKF reduces exactly to the classical KF (Truzman et al., 2022).
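Both robust variants share one mechanism: where a residual is implausible under the nominal model, the effective measurement noise is inflated so the update trusts the prediction instead. The fixed-point sketch below illustrates this with a correntropy-style Gaussian-kernel reweighting (a simplified, measurement-only version; the full MCKF also reweights the prior term, and the kernel width `sigma` is a free parameter assumed here). It assumes a diagonal $R$; with no outliers the weights approach 1 and the step reduces to the classical KF update.

```python
import numpy as np

def robust_update(x_pred, P_pred, z, H, R, sigma=2.0, iters=10):
    """Correntropy-style reweighted measurement update (sketch, diagonal R).

    Outlying residuals receive small Gaussian-kernel weights, which
    inflates the corresponding effective measurement variances.
    """
    x = x_pred.copy()
    for _ in range(iters):
        e = (z - H @ x) / np.sqrt(np.diag(R))   # normalized residuals
        w = np.exp(-e**2 / (2 * sigma**2))      # kernel weights in (0, 1]
        R_eff = R / np.maximum(w, 1e-8)         # small weight => big variance
        K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R_eff)
        x = x_pred + K @ (z - H @ x_pred)       # re-solve with new weights
    P = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x, P
```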
4.3 Converted Measurement Filters
For problems with a linear state equation and nonlinear measurement mapping with a bijective inverse, the generalized converted-measurement Kalman filter applies unbiased conversion and covariance mapping, followed by Kalman update in the state space. Information zeroing is used when not all measurement coordinates are observed, yielding consistent and robust estimates superior to EKF/UKF in benchmark tracking problems (Bordonaro et al., 12 Feb 2025).
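To make the conversion step concrete for the classic range-bearing case, the sketch below debiases the mean with the standard factor $\lambda = e^{-\sigma_\theta^2/2}$ and, rather than reproducing the closed-form covariance expressions of the cited work, approximates the converted covariance by Monte Carlo (all numbers in the usage line are illustrative):

```python
import numpy as np

def convert_polar(r_meas, th_meas, sig_r, sig_th, n_mc=100_000, seed=0):
    """Debiased polar->Cartesian measurement conversion (sketch)."""
    lam = np.exp(-sig_th**2 / 2)   # E[cos(noise)] under Gaussian bearing noise
    z = np.array([r_meas * np.cos(th_meas),
                  r_meas * np.sin(th_meas)]) / lam   # debiased converted mean
    # Monte Carlo approximation of the converted measurement covariance
    rng = np.random.default_rng(seed)
    r = r_meas + sig_r * rng.standard_normal(n_mc)
    th = th_meas + sig_th * rng.standard_normal(n_mc)
    R_conv = np.cov(np.stack([r * np.cos(th), r * np.sin(th)]))
    return z, R_conv   # then run a *linear* KF update in (x, y) space

z_xy, R_xy = convert_polar(r_meas=100.0, th_meas=0.8, sig_r=1.0, sig_th=0.05)
```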
4.4 Modern Bayesian Consistency and Self-Assessment
Subjective logic extends KF consistency checks (e.g., normalized innovation squared, NIS) by explicitly modeling statistical uncertainty in the consistency itself, fusing evidence counts into Dirichlet opinions and conflict measures, and is particularly valuable in online/finite-sample regimes (Griebel et al., 2020).
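For reference, the classical NIS check that subjective logic builds on is simple to implement (a generic sketch; the Dirichlet-opinion fusion of the cited work is not reproduced here): for a consistent filter the normalized innovation squared $\epsilon_k = \nu_k^\top S_k^{-1} \nu_k$ is $\chi^2$-distributed with $m$ degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2

def nis(innov, S):
    """Normalized innovation squared for one update step."""
    return float(innov @ np.linalg.solve(S, innov))

def nis_consistent(nis_values, m, alpha=0.05):
    """Two-sided chi-square test on accumulated NIS over N steps.

    For a consistent filter, the sum of N NIS values is chi-square
    distributed with N*m degrees of freedom.
    """
    N = len(nis_values)
    lo, hi = chi2.ppf([alpha / 2, 1 - alpha / 2], df=N * m)
    return lo <= sum(nis_values) <= hi
```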
4.5 Real-Time Parameter and Noise Covariance Learning
Online estimation of transition matrices and noise statistics is addressed via batch least-squares in observation space, regularized and updated in real time (LOCK/LLOCK/SLOCK). These methods provide scalable alternatives to expectation-maximization (EM), well-suited for high-dimensional and streaming contexts (Ishizone et al., 2020).
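As a minimal illustration of the idea (a generic innovation-based scheme, not the LOCK/LLOCK/SLOCK algorithms themselves): since $\mathbb{E}[\nu_k \nu_k^\top] = H_k P_{k|k-1} H_k^\top + R_k$, the measurement noise covariance can be tracked online from the empirical innovation covariance with exponential forgetting.

```python
import numpy as np

def adapt_R(R_est, innov, H, P_pred, forget=0.99):
    """One-step online update of the measurement noise covariance R.

    Uses E[vv^T] = H P_pred H^T + R; the instantaneous sample may be
    indefinite, so in practice a projection onto the PSD cone (omitted
    here for brevity) is advisable.
    """
    sample = np.outer(innov, innov) - H @ P_pred @ H.T
    return forget * R_est + (1 - forget) * sample
```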
5. Applications and Practical Implementation
Kalman filter theory is foundational in diverse domains:
- Navigation and SLAM: Inertial navigation systems, GNSS, and visual-inertial odometry employ variants ranging from the standard KF to the error-state KF (ESKF) and iterated error-state KF (IESKF), with error-state formulations chosen for robustness and minimal orientation parameterizations (Im, 2024).
- Financial Engineering: Time-series forecasting, online trend extraction, and trading strategies use the KF for near-zero-lag smoothing and superior trend detection compared to moving averages. Empirical studies show marked improvements in drawdown and Sharpe ratio when state models and covariances are well tuned (Benhamou, 2018).
- Filtering for SDEs: Kalman-type filters, including the EKF and particle-EKF, enable sequential state estimation, parameter fitting, and closed-form marginal likelihoods for continuous-time stochastic differential equations, providing fast Bayesian parameter and boundary-condition estimation in both linear and nonlinear cases (Bao et al., 2024).
- Signal and Image Denoising: High-dimensional and video denoising applications exploit efficient Kalman algorithms, sometimes using median or Laplacian-based state predictors and fast diagonal covariance updates for scalable noise suppression (Khmou et al., 2013).
- Neural and Biophysical Models: Kalman-like gradient flow and predictive coding models are used in neuroscience to explain how local Hebbian learning and error correction can achieve near-Kalman optimality via biologically plausible local computations (Millidge et al., 2021, Baltieri et al., 2021).
6. Historical Development and Tuning Principles
The Kalman filter represents the culmination of centuries of sequential updating, from ancient astronomy through Gauss' least squares and Wiener's filtering to the state-space innovations of Kalman (1960) and Kalman–Bucy (1961) (M et al., 2015). Practical optimality relies not only on correct model structure but also on careful tuning of the noise covariances, initial state and covariance, and model parameters. Iterative data-driven recipes, often leveraging composite cost functions spanning prediction, filtering, and smoothing errors, are essential for achieving statistical equilibrium and Cramér–Rao-bound (CRB) level performance in both simulation and real-data settings.
| Algorithm | Nonlinearity | Robustness to Outliers | Optimality Criterion |
|---|---|---|---|
| KF | Linear | Low | MMSE (Gaussian) |
| EKF | Weak-nonlinear | Low | Locally linearized MMSE |
| UKF/NLKF | Strong-nonlinear | Low/Med | Sigma-point moment matching |
| OIKF | Linear | High | Outlier-insensitive (NUV) |
| MCKF | Linear | High | Maximum correntropy |
| Converted-measurement KF | Linear state, nonlinear meas. | High | Unbiased conversion + MMSE |
7. Contemporary Research and Future Directions
Ongoing research extends the Kalman paradigm in several dimensions:
- Nonlinear/non-Gaussian filters: Feedback particle filters and ensemble Kalman schemes generalize the update structure to infinite-dimensional, nonlinear systems, leveraging measure-coupling and optimal transport (Taghvaei et al., 2017).
- Adaptive neural filters: End-to-end networks estimate time-varying process noise, e.g., via ProcessNet, improving robustness under model mismatch and sensor drift (Levy et al., 7 Mar 2025).
- Streaming, scalable, and real-time inference: New algorithms address online parameter learning and high-dimensional estimation without reliance on batch processing or smoothing (Ishizone et al., 2020).
- Information geometry: The natural gradient and free-energy principle offer coordinate-invariant unifications of filtering, learning, and inference (Ollivier, 2019, Baltieri et al., 2021).
Progress continues apace to handle multimodal uncertainties, hybrid discrete-continuous states, adversarial environments, and complex sensor fusion, ensuring the Kalman filter, in its many variants, remains central to state estimation across science and engineering.