- The paper presents a deterministic variational EM algorithm that bypasses Monte Carlo methods to improve parameter estimation in Jump Markov Linear Systems.
- It employs a two-filter smoother and Kullback-Leibler reduction to manage the computational complexity of high-dimensional state spaces.
- The approach demonstrates significant improvements in convergence, numerical stability, and scalability compared to traditional particle filter methods.
Introduction
The paper "A Variational Expectation-Maximisation Algorithm for Learning Jump Markov Linear Systems" (arXiv:2004.08564) presents a methodology for parameter estimation in Jump Markov Linear Systems (JMLS). This class of systems switches stochastically between linear dynamic behaviors, which makes traditional parameter estimation techniques challenging. The authors propose a deterministic approach based on the Expectation-Maximization (EM) algorithm that obviates the need for Monte Carlo methods. The primary advantages of this approach are its numerical stability and computational efficiency, offering an alternative to particle filter-based methods.
JMLS combine a discrete mode, which switches according to a Markov chain, with a continuous state that evolves under the linear dynamics selected by the active mode. Estimating the parameters of such systems (the mode transition matrix, the per-mode system matrices, and the noise covariances) involves handling complex probability distributions. Traditional maximum likelihood (ML) techniques face challenges because the likelihood has no closed form: it is a mixture over a number of mode sequences that grows exponentially with the data length.
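As a concrete illustration of this hybrid structure, a JMLS can be simulated in a few lines: a Markov chain picks the active mode, and the mode selects which linear-Gaussian dynamics generate the state and output. The two-mode system and all numerical values below are hypothetical, chosen only for the sketch, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-mode JMLS: the discrete mode z_k follows a Markov chain
# and selects which linear-Gaussian dynamics produce x_k and y_k.
Pi = np.array([[0.95, 0.05],    # mode transition probabilities
               [0.10, 0.90]])
A = [np.array([[0.9]]), np.array([[0.5]])]   # per-mode state matrices
C = [np.array([[1.0]]), np.array([[1.0]])]   # per-mode output matrices
Q = [0.1, 0.5]                               # per-mode process-noise variances
R = [0.2, 0.2]                               # per-mode measurement-noise variances

def simulate(T):
    z, x = 0, np.zeros(1)
    zs, ys = [], []
    for _ in range(T):
        z = rng.choice(2, p=Pi[z])                        # Markov mode switch
        x = A[z] @ x + rng.normal(0, np.sqrt(Q[z]), 1)    # linear state update
        y = C[z] @ x + rng.normal(0, np.sqrt(R[z]), 1)    # noisy observation
        zs.append(int(z)); ys.append(y[0])
    return np.array(zs), np.array(ys)

zs, ys = simulate(200)
```

The estimation problem the paper tackles is the inverse of this generator: recover `Pi`, the per-mode matrices, and the noise variances from `ys` alone, without observing `zs`.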
The variational EM algorithm developed in this paper addresses these challenges directly. It avoids computing the full joint smoothed distribution, which is computationally prohibitive, by utilizing a two-filter smoother. The dominant computational cost comes from the exponential growth of mode-sequence hypotheses over time, which is mitigated through systematic mixture-reduction techniques.
Algorithm Description
One of the initial tasks in implementing this algorithm involves transforming the JMLS into a form suitable for applying the two-filter smoother method. This transformation aligns the noise processes and allows for the use of existing smoothing solutions. The proposed transformation involves redefining the input and noise correlation structures of the system equations.
Two-Filter Smoother
The core computational component is the two-filter smoother, which runs a forward filter and a backward filter and combines their outputs to form joint smoothed state estimates. The challenge lies in approximating the joint distribution without resorting to computationally intensive Monte Carlo sampling. The authors propose a deterministic merging approach, leveraging Kullback-Leibler Reduction (KLR) to bound the number of Gaussian components in the mixture model.
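The merging step at the heart of such a reduction can be sketched as moment matching: the single Gaussian that minimizes the Kullback-Leibler divergence from a weighted mixture has the mixture's mean and covariance. The function below is an illustrative sketch of that one operation, not the paper's full KLR algorithm, which also decides which components to merge and when.

```python
import numpy as np

def merge_gaussians(w, mu, P):
    """Moment-match weighted Gaussian components into a single Gaussian.

    The result is the KL-optimal single-Gaussian approximation of the
    mixture (it minimizes KL(mixture || Gaussian)); a reduction scheme
    applies this repeatedly to keep the component count bounded.
    """
    w = np.asarray(w, float)
    wn = w / w.sum()                     # normalized weights
    mu = np.asarray(mu, float)           # shape (m, n): component means
    P = np.asarray(P, float)             # shape (m, n, n): component covariances
    m_bar = wn @ mu                      # merged mean
    d = mu - m_bar                       # mean deviations
    # Merged covariance = weighted covariances + spread-of-means term.
    P_bar = np.einsum('i,ijk->jk', wn, P) + np.einsum('i,ij,ik->jk', wn, d, d)
    return w.sum(), m_bar, P_bar
```

For example, merging two equally weighted 1-D Gaussians with means 0 and 2 and unit variances yields mean 1 and variance 2, the extra variance coming from the spread of the means.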
EM Algorithm
The EM algorithm alternates two steps. The E-step computes sufficient statistics from the two-filter smoother output; the M-step then updates the parameter estimates using closed-form solutions built from those statistics. Notably, the paper provides explicit update formulas for each parameter type, including the mode transition matrix, the noise covariances, and the per-mode model parameters, ensuring computational stability through numerically robust matrix operations.
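As an illustration of what a closed-form M-step looks like, consider updating one mode's state-transition matrix and process-noise covariance from smoothed second-moment statistics, in the standard linear-Gaussian form. The statistic names and this helper are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def m_step_dynamics(S_xx, S_yx, S_yy, N):
    """Closed-form M-step for one mode's dynamics (illustrative sketch).

    Inputs are sufficient statistics accumulated during the E-step:
      S_xx = sum_k E[x_k x_k^T]
      S_yx = sum_k E[x_{k+1} x_k^T]
      S_yy = sum_k E[x_{k+1} x_{k+1}^T]
    over N transitions attributed to this mode.
    """
    # Solve A @ S_xx = S_yx for A rather than forming an explicit inverse,
    # which is better conditioned numerically.
    A = np.linalg.solve(S_xx.T, S_yx.T).T
    Q = (S_yy - A @ S_yx.T) / N
    return A, Q
```

With consistent statistics the update recovers the generating parameters exactly; in the full algorithm the expectations come from the two-filter smoother, weighted by the mode posteriors.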
Numerical Stability and Implementation
The paper emphasizes the importance of numerical stability in implementing the EM algorithm. Key operations use QR decompositions to avoid explicit matrix inversions and keep the calculations well conditioned. The authors also propose practical techniques for maintaining precision, such as the log-sum-exp trick and careful handling of matrix square roots.
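Two of these stabilization devices can be sketched generically. The log-sum-exp trick evaluates log-domain sums of weights without overflow, and a QR decomposition propagates a covariance square root S (with P = SᵀS) so that the covariance is never explicitly squared or inverted. Both functions below are standard textbook sketches under those conventions, not the paper's exact routines.

```python
import numpy as np

def logsumexp(a):
    """Numerically stable log(sum(exp(a))): shift by the maximum so the
    exponentials cannot overflow, then shift back."""
    a = np.asarray(a, float)
    m = a.max()
    return m + np.log(np.sum(np.exp(a - m)))

def sqrt_time_update(A, S_x, S_q):
    """Square-root covariance time update via QR.

    Given square roots with P = S_x^T S_x and Q = S_q^T S_q, returns S
    such that S^T S = A P A^T + Q, without ever forming P or Q.
    """
    M = np.vstack([S_x @ A.T, S_q])   # stacked square-root factors
    _, R = np.linalg.qr(M)            # M^T M = R^T R by construction
    return R
```

For instance, `logsumexp([1000.0, 1001.0])` is finite and exact to floating point, whereas the naive evaluation overflows both exponentials.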
The scalable nature of the algorithm makes it viable for systems with high state dimensions or long observation sequences, where traditional methods may falter. By allowing the trade-off between computational accuracy and complexity, the method is adaptable to various resource constraints.
Simulation and Results
The paper provides comprehensive simulations that benchmark the proposed algorithm against standard particle-filter-based approaches. These simulations are conducted on systems with varying complexity, including high-dimensional state spaces and long observation sequences. The results demonstrate significant improvements in both computational efficiency and parameter estimation accuracy.
In particular, the algorithm's convergence properties and robustness to initial parameter guesses are highlighted. The authors show that their approach reliably converges to reasonable solutions even from suboptimal starting points, a notable improvement over traditional methods that often require careful initialization.
Conclusion
The variational EM algorithm presented in this paper provides a robust framework for parameter estimation in JMLS. By avoiding computationally expensive Monte Carlo methods and focusing on deterministic approximations, the authors offer a solution that balances accuracy, computational cost, and implementation feasibility. This makes the method particularly valuable for practical applications in fields requiring robust, real-time system identification. Future work may explore further optimization and adaptation of the algorithm for broader classes of hybrid or nonlinear systems.