Nonlinear Extended Kalman Filter (EKF)

Updated 17 May 2026

The nonlinear EKF is a recursive Bayesian estimator that linearizes nonlinear state transition and observation models for efficient estimation in systems such as robotics and navigation.
Robust performance relies on careful tuning of process and measurement noise covariances, initialization, and adaptive techniques such as innovation saturation.
Recent extensions include geometric, infinite-dimensional, and derivative-free variations, broadening the EKF's applicability to complex and non-Euclidean systems.

The nonlinear Extended Kalman Filter (EKF) is a recursive Bayesian estimator engineered to accommodate nonlinear process and measurement models in discrete- or continuous-time stochastic systems. Unlike its linear counterpart, the EKF relies on linearization of nonlinear system dynamics and observation functions about current state estimates, enabling efficient state and covariance propagation for a wide array of real-world systems in robotics, navigation, signal processing, and control. Variants and geometrically consistent extensions further expand its reach to infinite-dimensional systems, manifolds, and situations with unknown or time-varying noise profiles.

1. Mathematical Foundations and Standard Discrete EKF

Let $x_k \in \mathbb{R}^n$ denote the latent state at discrete time $k$. The general nonlinear stochastic state-space model is: $\begin{aligned} x_k &= f(x_{k-1}, u_{k-1}) + w_{k-1}, \quad w_{k-1} \sim \mathcal{N}(0, Q_{k-1}) \\ z_k &= h(x_k) + v_k, \qquad \qquad v_k \sim \mathcal{N}(0, R_k) \end{aligned}$ Here $f: \mathbb{R}^n \to \mathbb{R}^n$ (for a given control input $u_{k-1}$) and $h: \mathbb{R}^n \to \mathbb{R}^m$ are possibly nonlinear state transition and measurement maps; $Q_{k-1}$ and $R_k$ are the process and observation noise covariances. The EKF recursively estimates the filtered mean $\hat{x}_{k|k}$ and covariance $P_{k|k}$. The core prediction-update cycle is:

Prediction:

$\begin{aligned} \hat{x}_{k|k-1} &= f(\hat{x}_{k-1|k-1}, u_{k-1}) \\ F_k &= \left.\frac{\partial f}{\partial x}\right|_{x=\hat{x}_{k-1|k-1},\, u_{k-1}} \\ P_{k|k-1} &= F_k\, P_{k-1|k-1}\, F_k^\top + Q_{k-1} \end{aligned}$

Update:

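$\begin{aligned} H_k &= \left.\frac{\partial h}{\partial x}\right|_{x=\hat{x}_{k|k-1}} \\ S_k &= H_k\, P_{k|k-1}\, H_k^\top + R_k \\ K_k &= P_{k|k-1}\, H_k^\top S_k^{-1} \\ \hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \left(z_k - h(\hat{x}_{k|k-1})\right) \\ P_{k|k} &= (I - K_k H_k)\, P_{k|k-1} \end{aligned}$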

This formulation is documented in multiple domains, including eye-gaze tracking (Thieu et al., 17 Apr 2025), quadrotor navigation (Tellex et al., 2018), and competition robotics (Kou et al., 2023). The EKF assumes all linearizations, covariances, and noise statistics are available or can be reliably estimated at each step, and the filter remains effective provided the underlying system is “sufficiently close” to linear in the local neighborhood of the state estimates.
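The prediction-update cycle can be sketched in a few lines of NumPy. This is a minimal illustration, not taken from any of the cited works: the function names `ekf_predict` and `ekf_update`, the toy range-to-beacon model, and all numerical values are assumptions made for the example. The covariance update uses the Joseph form, which is algebraically equal to the standard update for this gain and better preserves symmetry.

```python
import numpy as np

def ekf_predict(x, P, u, f, F_jac, Q):
    """Prediction: push the mean through f and the covariance through its Jacobian."""
    F = F_jac(x, u)                       # evaluated at the previous filtered estimate
    return f(x, u), F @ P @ F.T + Q

def ekf_update(x_pred, P_pred, z, h, H_jac, R):
    """Update: correct the predicted estimate with measurement z."""
    H = H_jac(x_pred)                     # evaluated at the predicted estimate
    nu = z - h(x_pred)                    # innovation
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T  # K = P H^T S^{-1} (S and P are symmetric)
    x = x_pred + K @ nu
    I_KH = np.eye(x.size) - K @ H
    # Joseph form: equals (I - K H) P for this gain, but keeps P symmetric.
    P = I_KH @ P_pred @ I_KH.T + K @ R @ K.T
    return x, P

# Toy model (hypothetical): state [position, velocity]; the sensor measures the
# range to a beacon one unit off the track, so h is nonlinear in position.
dt = 0.1
f = lambda x, u: np.array([x[0] + dt * x[1], x[1] + dt * u])
F_jac = lambda x, u: np.array([[1.0, dt], [0.0, 1.0]])
h = lambda x: np.array([np.hypot(x[0], 1.0)])
H_jac = lambda x: np.array([[x[0] / np.hypot(x[0], 1.0), 0.0]])

x, P = np.array([1.0, 0.5]), np.eye(2)
Q, R = 1e-3 * np.eye(2), np.array([[0.05]])
x_pred, P_pred = ekf_predict(x, P, 0.0, f, F_jac, Q)
x, P = ekf_update(x_pred, P_pred, np.array([1.6]), h, H_jac, R)
```

Passing `f`, `h`, and their Jacobians as callables keeps the filter logic independent of the model, so the same two functions can be reused when the model changes.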

2. Practical Implementations and Tuning

Parameter selection and initialization are critical for robust EKF deployment:

Initialization: $\hat{x}_{0|0}$ is set using the initial measurement, and $P_{0|0}$ is assigned as a large diagonal matrix to encode high initial uncertainty (Thieu et al., 17 Apr 2025, Tellex et al., 2018).
Process and Measurement Covariances: $Q$ and $R$ directly determine filter responsiveness and noise rejection. Tuning $Q$ and $R$ modulates the filter’s trust in the process model versus observed data (“filter rigidity”); a larger $R$ induces more smoothing but can underweight informative measurements, while a small $Q$ constrains the filter to the dynamical model (Thieu et al., 17 Apr 2025).
Adaptive Noise Estimation: Techniques such as the ROSE-Filter dynamically estimate the measurement noise covariance $R_k$ based on recent measurement residuals using a forgetting factor and auxiliary 1D Kalman subfilters, improving robustness under nonstationary noise conditions (Marchthaler, 2021).
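As a rough illustration of residual-based noise adaptation with a forgetting factor, the sketch below uses a generic covariance-matching rule. It is not the ROSE-Filter itself (no auxiliary subfilters are modeled), and the function name `adapt_R` and the default `alpha` are assumptions. The rule relies on the identity that, for a consistent linearized filter, the expected outer product of the post-update residual plus $H P_{k|k} H^\top$ equals $R$.

```python
import numpy as np

def adapt_R(R_prev, residual, H, P_post, alpha=0.95):
    """Blend the previous R with a one-sample estimate built from the
    post-update residual z - h(x_post); alpha is the forgetting factor."""
    eps = np.atleast_1d(residual).reshape(-1, 1)
    sample = eps @ eps.T + H @ P_post @ H.T   # one-sample estimate of R
    R_new = alpha * R_prev + (1.0 - alpha) * sample
    return 0.5 * (R_new + R_new.T)            # enforce exact symmetry

# Hypothetical scalar example: the residual is larger than R_prev suggests,
# so the estimate is nudged upward.
R_est = adapt_R(np.array([[1.0]]), np.array([2.0]),
                np.array([[1.0]]), np.array([[0.5]]), alpha=0.9)
```

Because both terms in `sample` are positive semidefinite, the estimate stays positive semidefinite as long as the starting value is, which avoids the need for a separate projection step.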

Application-specific considerations, such as inflating $R_k$ during sensor outages (e.g., eye blinks), outlier clipping for grossly erroneous observations, or covariance resetting to address numerical round-off, are critical for performance in real systems (Thieu et al., 17 Apr 2025, Tellex et al., 2018).
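These safeguards are simple to express in code. The helpers below are one possible reading of each idea, not the cited authors' implementations; the names, the inflation factor, and the diagonal floor are all assumptions chosen for illustration.

```python
import numpy as np

def measurement_covariance(R, sensor_ok, inflation=1e6):
    """Return R for valid data, or a heavily inflated copy during an outage."""
    return R if sensor_ok else inflation * R

def clip_measurement(z, z_pred, max_dev):
    """Limit how far a measurement may sit from its prediction."""
    return z_pred + np.clip(z - z_pred, -max_dev, max_dev)

def reset_covariance(P, floor=1e-9):
    """Re-symmetrize P and lift its diagonal to a small floor."""
    P = 0.5 * (P + P.T)
    idx = np.diag_indices_from(P)
    P[idx] = np.maximum(P[idx], floor)
    return P

def scalar_gain(P, R):
    """Kalman gain for a scalar state observed directly (H = 1)."""
    return P / (P + R)

R = np.array([[0.1]])
gain_ok = scalar_gain(1.0, measurement_covariance(R, True)[0, 0])
gain_outage = scalar_gain(1.0, measurement_covariance(R, False)[0, 0])
clipped = clip_measurement(np.array([10.0]), np.array([0.0]), 3.0)
```

In the scalar case the effect of inflation is easy to see: the gain drops from roughly 0.9 to about $10^{-5}$, so the filter effectively coasts on its prediction while the sensor is unavailable.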

3. Theoretical Guarantees, Extensions, and Robustness

The EKF’s foundational theory guarantees local optimality (minimum mean-squared error) only under idealized assumptions: mild nonlinearity, Gaussian noise, and sufficiently accurate linearization. Under small, state-dependent noise, convergence rates and error contraction can be rigorously quantified: in continuous time, assuming strong injectivity of the observation drift and near-linearity as the noise intensity tends to zero, the expected estimation error decays at an explicit rate in the noise intensity, with exponential forgetting of initial error contingent on exponential stability of the linearized drift (Njiasse et al., 13 Nov 2025). This quantifies practical filter accuracy in the vanishing noise regime.

Robustness to process/measurement outliers or malicious spikes can be achieved via innovation saturation mechanisms, which clamp the innovation vector using an adaptively updated threshold based on recent innovation magnitude (Fang et al., 2019). Bounded estimation error is then provable even in the presence of large, structured measurement outliers, provided certain stabilizability and detectability conditions on the system are met.
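The clamping idea can be sketched as follows. The saturation step is the standard elementwise clip; the threshold update shown here is a hypothetical first-order rule chosen for illustration and is not the adaptation law of Fang et al. The names `saturate`, `update_threshold`, `lam`, and `gamma` are assumptions.

```python
import numpy as np

def saturate(nu, sigma):
    """Clamp each innovation component to the interval [-sigma, sigma]."""
    return np.clip(nu, -sigma, sigma)

def update_threshold(sigma, nu_sat, lam=0.9, gamma=3.0, floor=1e-3):
    """Hypothetical rule: relax the threshold toward gamma * |saturated innovation|,
    never letting it fall below a small floor."""
    target = gamma * np.abs(nu_sat)
    return np.maximum(lam * sigma + (1.0 - lam) * target, floor)

# A short innovation sequence with one large spike (made-up numbers).
sigma = np.array([1.0])
history = []
for nu in [0.2, -0.1, 25.0, 0.3]:
    nu_sat = saturate(np.array([nu]), sigma)   # value actually fed to the update
    history.append(nu_sat[0])
    sigma = update_threshold(sigma, nu_sat)
```

Because the threshold is driven by the saturated innovation rather than the raw one, a single spike cannot widen the bound by much, while a persistent shift lets it grow gradually so that genuine changes are eventually tracked.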

4. Geometric, Infinite-Dimensional, and Data-Driven Generalizations

Nonlinear EKFs have been extended in several directions:

Geometric EKF on Manifolds: For systems evolving on Riemannian manifolds (e.g., configuration spaces of pose and orientation), coordinate-dependent linearizations suffer from inconsistency and poor covariance propagation. Intrinsic EKF formulations employ affine connections, exponential maps, and parallel transport to define the propagation and update in normal coordinates, ensuring consistency under chart changes and providing rigorous geometric covariance resets (Ge et al., 6 Jun 2025, Huai et al., 2023).
Infinite-Dimensional EKF: For PDE- or delay-based systems (Hilbert-space state), the EKF is defined with Fréchet derivatives, operator-valued Riccati equations, and evolution operators, with local exponential stability obtainable under detectability and Lipschitz assumptions (Afshar et al., 2022). For infinite-dimensional measurements (e.g., image-valued outputs), the measurement Jacobian at each pixel is directly linked to image gradients, with all algorithmic steps recast via integral and operator calculus (Varley et al., 23 Sep 2025).
Derivative-Free and Data-Driven EKFs: When analytic Jacobians are unavailable or unreliable (non-differentiable $f$, $h$), sample-vector-based EKFs (“derivative-free” filters) use sigma-point approximations and moment differential equations, permitting robust filtering with only calls to $f$ and $h$, with no derivatives required (Kulikova et al., 2024, Kulikova et al., 2024). Data-driven strategies such as SINDy-augmented EKFs enable the identification and real-time filtering of nonlinear dynamical systems with joint state and parameter estimation, leveraging sparse regression for minimal model representation and online uncertainty quantification (Rosafalco et al., 2024).
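To make the sigma-point idea concrete, the sketch below propagates a mean and covariance through a function using only function evaluations. It is a generic unscented transform with a single spread parameter `kappa`, shown as one common sigma-point scheme; it is not the specific sample-vector construction of the cited derivative-free EKFs, and all names are assumptions.

```python
import numpy as np

def sigma_points(x, P, kappa=1.0):
    """Return 2n+1 symmetric sigma points (as rows) and their weights."""
    n = x.size
    L = np.linalg.cholesky((n + kappa) * P)   # L @ L.T == (n + kappa) * P
    pts = np.vstack([x, x + L.T, x - L.T])    # centre, then x +/- columns of L
    w = np.full(2 * n + 1, 0.5 / (n + kappa))
    w[0] = kappa / (n + kappa)
    return pts, w

def unscented_transform(g, x, P, kappa=1.0):
    """Approximate the mean and covariance of g(X) for X ~ N(x, P)
    using evaluations of g only (no Jacobian)."""
    pts, w = sigma_points(x, P, kappa)
    Y = np.array([g(p) for p in pts])         # one call to g per sigma point
    mean = w @ Y
    D = Y - mean
    cov = (w[:, None] * D).T @ D
    return mean, cov

# Hypothetical usage: a range-bearing style map with no hand-derived Jacobian.
g = lambda p: np.array([np.hypot(p[0], p[1]), np.arctan2(p[1], p[0])])
m, C = unscented_transform(g, np.array([1.0, 1.0]), 0.01 * np.eye(2))
```

For a linear map the transform reproduces the exact mean and covariance, which gives a convenient sanity check before applying it to a nonlinear model. Process or measurement noise covariance would be added to `cov` afterwards, as in the Jacobian-based filter.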

5. Covariance Compensation, Observability, and Modern Correction Principles

Recent analyses reveal that the classical EKF tends to underestimate posterior covariance under strong nonlinearity, especially in the measurement model. Covariance compensation provides a rigorous framework for improving nonlinear KF designs:

Covariance Compensation: Any nonlinear KF can be viewed as a covariance-compensated EKF, adjusting the innovation covariance via a positive-semidefinite correction term. Three guidelines ensure robust performance: invariance under orthogonal transformations (preserve symmetry), sufficient compensation exceeding the EKF in purely quadratic scenarios, and a preference for overestimated (underconfident) covariances (Jiang et al., 24 Mar 2026).
Recalibration and Back-Out: A new framework reuses the post-update estimate to recalibrate the measurement Jacobian, forming a more accurate posterior covariance and adding a “back-out” safeguard if the newly computed covariance trace increases, yielding order-of-magnitude accuracy improvements in strongly nonlinear regimes (Jiang et al., 2024).
Observation-Centered EKF: For highly nonlinear observations with small measurement noise covariance $R_k$, linearizing not around the predicted mean but around the “observation-centered” state yields drastic gains in posterior consistency and estimation accuracy (Kent et al., 2019).
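The recalibration and back-out idea described above can be sketched loosely as follows. This is one plausible reading of the prose description, not the published algorithm: the standard update is computed first, the measurement Jacobian is then re-evaluated at the post-update estimate to recompute the covariance, and the recomputed covariance is discarded if its trace is larger. The function names and the scalar test model are assumptions.

```python
import numpy as np

def gain_and_cov(P_pred, H, R):
    """Kalman gain and posterior covariance for a given measurement Jacobian."""
    S = H @ P_pred @ H.T + R
    K = np.linalg.solve(S, H @ P_pred).T
    P = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred
    return K, 0.5 * (P + P.T)

def update_with_recalibration(x_pred, P_pred, z, h, H_jac, R):
    """Standard EKF update, then covariance recalibration with back-out.
    Returns (x, P, accepted), where accepted tells whether the recalibrated
    covariance was kept."""
    K, P = gain_and_cov(P_pred, H_jac(x_pred), R)
    x = x_pred + K @ (z - h(x_pred))
    # Recalibrate: re-evaluate the Jacobian at the post-update estimate.
    _, P_recal = gain_and_cov(P_pred, H_jac(x), R)
    if np.trace(P_recal) > np.trace(P):
        return x, P, False          # back out: keep the original covariance
    return x, P_recal, True

# Hypothetical scalar model h(x) = x^2 with unit prior variance and unit R.
h = lambda x: np.array([x[0] ** 2])
H_jac = lambda x: np.array([[2.0 * x[0]]])
x_pred, P_pred, R = np.array([1.0]), np.eye(1), np.eye(1)
x_a, P_a, ok_a = update_with_recalibration(x_pred, P_pred, np.array([4.0]), h, H_jac, R)
x_b, P_b, ok_b = update_with_recalibration(x_pred, P_pred, np.array([0.0]), h, H_jac, R)
```

In the first call the estimate moves to a region where the measurement is more sensitive to the state, so the recalibrated covariance shrinks and is accepted; in the second the opposite happens and the safeguard retains the original covariance.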

6. Empirical Performance and Applications

Contemporary studies demonstrate the utility of nonlinear EKFs:

Eye-Gaze Smoothing: EKF achieves RMSE reductions of ~2x on synthetic data and over 50x on noisy real eye-tracking data versus simple smoothers, supporting its role in trajectory estimation and denoising of human and robotic motion (Thieu et al., 17 Apr 2025).
Mobile Robotics and Navigation: In mobile robot localization, fusion of odometry, model prediction, and intermittent vision is efficiently realized in a full-state EKF, yielding substantial RMSE improvements and robust performance under changing sensor availabilities (Kou et al., 2023, Tellex et al., 2018).
Vision-Based Localization with Infinite-Dimensional Measurements: In drone state estimation from monocular imagery, the infinite-dimensional EKF (treating image measurements as random fields and leveraging image gradients) outperformed VINS-MONO by up to an order of magnitude in position MSE for multiple trajectories (Varley et al., 23 Sep 2025).
Adaptive Covariance and Outlier Rejection: ROSE-Filter adaptive EKF outperforms classical EKF in dynamic noise scenarios, while innovation-saturated EKF (IS-EKF) provides provably bounded error in the presence of strong measurement outliers in mobile robot testbeds (Fang et al., 2019, Marchthaler, 2021).

7. Algorithmic Variants and Implementation Tradeoffs

EKF flexibility is reflected in a spectrum of algorithmic variants:

Standard, Iterated, and Observation-Centered EKF: Standard EKF uses a single linearization per update; iterated EKF refines the update through repeated local linearizations, attaining superior accuracy for highly nonlinear $h$ (Huai et al., 2023, Kent et al., 2019). The OCEKF is preferable under highly nonlinear measurements and small $R_k$.
Geometric and Manifold EKF: On manifolds, proper handling of geometric structure (affine connections, covariance retraction) reduces transient and steady-state RMSE in inertial navigation by up to 44% (Ge et al., 6 Jun 2025).
Derivative-Free and Square-Root EKF: Derivative-free EKFs working via sample vectors or sigma points, especially with square-root updates (Cholesky or SVD), offer robust stability against round-off and ill-conditioning, with computational cost per update that scales polynomially in the state dimension (Kulikova et al., 2024, Kulikova et al., 2024).
Filter Tuning and Validation: Guidelines for selecting $Q$, $R$, sample sizes, and model structure are available; on-line Normalized Innovation Squared (NIS) statistics support adaptive compensation updates (Jiang et al., 24 Mar 2026).
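The difference between a single linearization and repeated relinearization can be illustrated with a Gauss-Newton style iterated update. This is a textbook form of the iterated EKF written as a sketch; the function name `iekf_update`, the stopping rule, and the scalar test model are assumptions and are not drawn from the cited papers.

```python
import numpy as np

def iekf_update(x_pred, P_pred, z, h, H_jac, R, iters=10, tol=1e-9):
    """Iterated EKF measurement update: relinearize h at the current iterate."""
    x = x_pred.copy()
    for _ in range(iters):
        H = H_jac(x)
        S = H @ P_pred @ H.T + R
        K = np.linalg.solve(S, H @ P_pred).T
        # Gauss-Newton step; the first pass (x == x_pred) is the standard EKF update.
        x_new = x_pred + K @ (z - h(x) - H @ (x_pred - x))
        converged = np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    # Covariance from the Jacobian at the final iterate.
    H = H_jac(x)
    S = H @ P_pred @ H.T + R
    K = np.linalg.solve(S, H @ P_pred).T
    P = (np.eye(x.size) - K @ H) @ P_pred
    return x, 0.5 * (P + P.T)

# Hypothetical scalar model h(x) = x^2 with unit prior variance and unit R.
h = lambda x: np.array([x[0] ** 2])
H_jac = lambda x: np.array([[2.0 * x[0]]])
x_pred, P_pred, R, z = np.array([1.0]), np.eye(1), np.eye(1), np.array([4.0])
x_once, _ = iekf_update(x_pred, P_pred, z, h, H_jac, R, iters=1)
x_iter, P_iter = iekf_update(x_pred, P_pred, z, h, H_jac, R, iters=50)
```

With `iters=1` the state estimate coincides with the standard EKF update, which makes it straightforward to compare the two on the same data. At convergence the iterate is a stationary point of the Gaussian maximum a posteriori cost, which is where the accuracy gain for strongly nonlinear measurement maps comes from.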

Correct filter selection, tuning, and adaptation to application-specific structure (e.g., outlier saturation, noise adaptation, geometric consistency) are vital for maintaining EKF accuracy, consistency, and convergence across nonlinear, high-dimensional, or non-Euclidean inference problems.
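A simple way to check tuning in practice is the Normalized Innovation Squared statistic mentioned under filter tuning and validation. The sketch below computes it and applies a basic consistency test; the function names are assumptions, and the check shown is the generic chi-square test rather than any adaptive compensation scheme from the cited work. For a consistent filter with Gaussian innovations, NIS follows a chi-square distribution with $m$ degrees of freedom, so its average should be close to $m$.

```python
import numpy as np

def nis(nu, S):
    """Normalized Innovation Squared: nu^T S^{-1} nu."""
    return float(nu @ np.linalg.solve(S, nu))

def fraction_above(nis_values, bound):
    """Share of time steps whose NIS exceeds a chi-square bound."""
    return float(np.mean(np.asarray(nis_values) > bound))

# 95% chi-square bound for m = 2 measurement dimensions.
CHI2_95_DOF2 = 5.991

# Made-up innovations and a fixed innovation covariance, for illustration only.
S = np.diag([4.0, 1.0])
innovations = [np.array([2.0, 0.0]), np.array([0.0, 1.0]), np.array([6.0, 2.0])]
values = [nis(nu, S) for nu in innovations]
rate = fraction_above(values, CHI2_95_DOF2)
```

If far more than about 5% of steps exceed the bound, the filter is overconfident and $Q$ or $R$ is likely too small; if the average NIS sits well below $m$, the covariances are probably too large.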