Quantum Hidden Markov Models
- Quantum Hidden Markov Models are quantum extensions of classical HMMs that replace probability vectors with density matrices and use CPTP maps to incorporate quantum coherence.
- They enable superior memory compression by leveraging non-orthogonal quantum states and Kraus operators, which reduce the number of required states compared to classical models.
- Efficient learning and inference algorithms, including retraction-based optimizations and tensor network representations, facilitate practical applications in quantum information and machine learning.
Quantum Hidden Markov Models (QHMMs) constitute a quantum-theoretic generalization of classical hidden Markov models (HMMs), in which the classical state space, transition and emission processes are replaced by quantum states and quantum channels. QHMMs are formulated in terms of completely positive trace-preserving (CPTP) maps—quantum channels—acting on density operators, with possible emission or measurement operations described through Kraus decompositions, transition operation matrices (TOMs), or more general quantum instruments. The QHMM and its variants, often called Hidden Quantum Markov Models (HQMMs), encapsulate the hidden (unobserved) memory of a stochastic process within a quantum system that interacts with an environment to produce a classical output sequence, inheriting and extending the probabilistic structure of the classical model with quantum superposition, interference, and resource-theoretic aspects of coherence.
1. Mathematical Structure and Formalism
The mathematical backbone of QHMMs is the replacement of classical probability vectors and stochastic matrices with quantum states (density matrices) and CPTP maps. The system state at time $t$ is given by a density matrix $\rho_t$ on $\mathcal{H}$, where $\mathcal{H}$ is the Hilbert space of interest. Transitions and emissions are described by sets of Kraus operators $\{K_y\}$, one for each possible output symbol $y$, satisfying the completeness relation
$$\sum_y K_y^\dagger K_y = I.$$
The state update upon emitting $y$ is
$$\rho \;\mapsto\; \frac{K_y \rho K_y^\dagger}{\operatorname{Tr}\!\left(K_y \rho K_y^\dagger\right)}.$$
The likelihood of a sequence $y_1 y_2 \cdots y_T$ is given by iteratively applying the Kraus operators and tracing:
$$P(y_1,\dots,y_T) = \operatorname{Tr}\!\left(K_{y_T} \cdots K_{y_1}\, \rho_0\, K_{y_1}^\dagger \cdots K_{y_T}^\dagger\right).$$
Alternately, QHMMs can be defined with Transition Operation Matrices (TOMs), as in (Cholewa et al., 2015): matrices whose entries are completely positive maps and whose columns sum to quantum channels.
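As a concrete illustration, the likelihood recursion above can be sketched in a few lines of NumPy. The single-qubit Kraus pair below is a hypothetical example chosen only to satisfy the completeness relation, not a model from the cited works.

```python
import numpy as np

def sequence_likelihood(rho0, kraus, symbols):
    """Likelihood of an output sequence under a QHMM.

    rho0    : initial density matrix (d x d)
    kraus   : dict mapping each symbol y to its Kraus operator K_y
    symbols : observed output sequence
    """
    rho = rho0
    for y in symbols:
        K = kraus[y]
        rho = K @ rho @ K.conj().T  # unnormalized conditional state
    return np.real(np.trace(rho))   # trace gives the sequence probability

# Hypothetical two-symbol qubit model: sum_y K_y^dagger K_y = I holds
# because X^dagger X = I for the Pauli-X flip.
p = 0.3
K = {0: np.sqrt(p) * np.eye(2),
     1: np.sqrt(1 - p) * np.array([[0.0, 1.0], [1.0, 0.0]])}
rho0 = np.diag([1.0, 0.0])
lik = sequence_likelihood(rho0, K, [0, 1, 0])  # p * (1-p) * p
```

Because each symbol-conditioned map subtracts its own trace weight, summing `lik` over all sequences of a fixed length returns exactly 1.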
Mechanistically, QHMM transitions simulate observed system–memory interactions, where the observed system (the "channel output") interacts with an unobserved quantum memory—paralleling the latent variable structure of classical HMMs but generalized to allow for quantum coherences. When the quantum model is restricted to classical (diagonal) channels, the QHMM reduces exactly to a classical HMM (Cholewa et al., 2015).
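The reduction to classical HMMs can be checked numerically: the diagonal CP map with Kraus operators proportional to |i⟩⟨j| acts on the diagonal of the density matrix exactly as the classical symbol-conditioned matrix acts on a probability vector. The transition matrices below are hypothetical, chosen so their sum is column-stochastic.

```python
import numpy as np

def classical_step(rho, Ty):
    """Apply the diagonal (classical) CP map with Kraus operators
    sqrt(Ty[i, j]) |i><j|; on a diagonal rho this reproduces Ty @ p."""
    d = Ty.shape[0]
    out = np.zeros_like(rho, dtype=complex)
    for i in range(d):
        for j in range(d):
            E = np.zeros((d, d))
            E[i, j] = 1.0
            Kij = np.sqrt(Ty[i, j]) * E
            out += Kij @ rho @ Kij.conj().T
    return out

# Hypothetical classical HMM: T0 + T1 is column-stochastic, so the two
# symbol-conditioned maps together form a quantum channel.
T0 = np.array([[0.2, 0.1], [0.3, 0.4]])
T1 = np.array([[0.5, 0.3], [0.0, 0.2]])
p = np.array([0.6, 0.4])                  # classical belief vector
rho = np.diag(p).astype(complex)          # embedded as a diagonal state
quantum = np.real(np.diag(classical_step(rho, T0)))
classical = T0 @ p                        # classical forward step
```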
2. Expressiveness, Resource Efficiency, and Spectral Bounds
QHMMs are strictly more expressive than classical HMMs and general operator models. The richer state space (mixed states in the spectraplex of density operators), the ability to use quantum interference, and the flexibility to assign multiple Kraus operators per output symbol enable QHMMs to represent process languages that classical HMMs cannot (Adhikary et al., 2019).
The minimal quantum memory dimension required to generate a process is determined by spectral invariants of the transfer operator. For a given process, the set $\Lambda$ of distinct nonzero eigenvalues is invariant across all models producing it (Zonnios et al., 17 Dec 2024). The minimal quantum memory dimension is bounded as
$$d_q \geq \left\lceil \sqrt{|\Lambda|}\,\right\rceil,$$
whereas a classical model restricted to strictly incoherent operations faces a quadratic gap,
$$d_c \geq |\Lambda|.$$
This quadratic gap is rooted in the resource theory of coherence; only models exploiting quantum coherence (i.e., non-diagonal Kraus maps) can compress the required memory below the classical bound (Zonnios et al., 17 Dec 2024). Explicit examples demonstrate processes for which a quantum model requires strictly fewer memory states than any classical model—a three-state HMM can be realized as a two-dimensional QHMM when non-orthogonal (coherent) memory states are employed (Zonnios et al., 17 Dec 2024).
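The spectral invariant can be computed directly by building the transfer operator of a candidate model and counting its distinct nonzero eigenvalues; a $d$-dimensional quantum memory acts on a $d^2$-dimensional operator space and can therefore host up to $d^2$ such eigenvalues. The qubit Kraus pair below is an illustrative assumption, not a model from the cited work.

```python
import numpy as np

# Hypothetical qubit QHMM with two output symbols.
p = 0.3
K = [np.sqrt(p) * np.eye(2),
     np.sqrt(1 - p) * np.array([[0.0, 1.0], [1.0, 0.0]])]
d = 2

# Transfer operator on vectorized operators: T = sum_y K_y (x) conj(K_y).
T = sum(np.kron(k, k.conj()) for k in K)   # d^2 x d^2 matrix
eigs = np.linalg.eigvals(T)
distinct_nonzero = {complex(round(z.real, 10), round(z.imag, 10))
                    for z in eigs if abs(z) > 1e-10}
# Spectral bound: any valid quantum model needs d >= sqrt(|Lambda|).
```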
Tensor-network representations make these memory savings operational. Non-negative tensor-train decompositions (MPS) correspond exactly to classical HMMs, while complex Born machines or locally purified states (LPS) provide strictly more expressive power with reduced rank/resource requirements (Glasser et al., 2019).
3. Learning, Inference, and Algorithms
QHMMs admit both direct maximum likelihood learning and hybrid quantum-classical algorithms for parameter estimation (Srinivasan et al., 2017, Adhikary et al., 2019, Markov et al., 2022). The primary learning task is to optimize the parameters of the channel—usually the Kraus operators or their Stinespring dilation (unitary embedding)—to best fit observed data.
The completeness constraint $\sum_y K_y^\dagger K_y = I$ induces a geometry wherein the stacked Kraus operators reside on the (complex) Stiefel manifold. Retract-and-project or geodesic update algorithms, such as the Wen–Yin Cayley retraction,
$$X_{k+1} = \left(I + \tfrac{\tau}{2}A\right)^{-1}\!\left(I - \tfrac{\tau}{2}A\right)X_k, \qquad A = G X_k^\dagger - X_k G^\dagger,$$
where $\tau$ is the step size and $G$ is the Euclidean gradient of the objective, provide efficient and convergent schemes for constrained optimization (Adhikary et al., 2019). Empirically, such retraction-based algorithms are orders of magnitude faster and more scalable than those relying on local Givens rotations (Adhikary et al., 2019).
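A minimal sketch of the Cayley-transform retraction in the Wen–Yin update form; the starting isometry and the gradient below are hypothetical placeholders for a real objective.

```python
import numpy as np

def cayley_retraction(X, G, tau):
    """One Wen-Yin update on the complex Stiefel manifold {X : X^H X = I}.

    The Cayley transform of the skew-Hermitian direction A is exactly
    unitary, so the isometry constraint is preserved at every step.
    """
    n = X.shape[0]
    A = G @ X.conj().T - X @ G.conj().T        # skew-Hermitian direction
    lhs = np.eye(n) + (tau / 2) * A
    rhs = np.eye(n) - (tau / 2) * A
    return np.linalg.solve(lhs, rhs) @ X       # (I + tA/2)^-1 (I - tA/2) X

# Hypothetical starting point: two qubit Kraus operators stacked into a
# 4 x 2 isometry, plus an arbitrary Euclidean gradient.
X = np.vstack([np.sqrt(0.3) * np.eye(2),
               np.sqrt(0.7) * np.array([[0.0, 1.0], [1.0, 0.0]])])
G = np.random.default_rng(0).standard_normal((4, 2))
X_new = cayley_retraction(X, G, tau=0.1)
```

The design point is that feasibility is maintained by construction: no re-projection onto the constraint set is needed after each gradient step.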
Learning can also be formulated as the optimization of unitary circuits implemented via Stinespring's theorem, with mid-circuit measurements yielding observation symbols (Markov et al., 2022). This approach provides both a theoretical guarantee (every QHMM can be represented and learned through quantum circuits) and practical advantages (smoothness of the parameter landscape for search).
In the context of inference, Bayesian filtering and smoothing algorithms can be adapted to quantum latent paths, including analogues of forward–backward smoothing. Nonparametric approaches such as Hilbert space embedding of HQMMs (HSE-HQMMs) deploy kernelized updates (kernel sum rule, Nadaraya–Watson regression) over vectorized density matrices for learning in the presence of continuous features and rich observation spaces (Srinivasan et al., 2018).
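A single quantum filtering step, conditioning the latent density matrix on an observed symbol, can be sketched as follows (the Kraus operator is an illustrative assumption, not tied to any cited model):

```python
import numpy as np

def filter_update(rho, K_y):
    """One quantum forward-filtering step: condition the latent density
    matrix on observing symbol y; return the normalized posterior state
    and the one-step predictive probability of y."""
    rho_y = K_y @ rho @ K_y.conj().T
    p_y = np.real(np.trace(rho_y))
    return rho_y / p_y, p_y

# Hypothetical qubit model: start in |0><0| and observe the symbol whose
# Kraus operator is sqrt(0.7) * X (a weighted bit flip).
rho0 = np.diag([1.0, 0.0]).astype(complex)
K1 = np.sqrt(0.7) * np.array([[0.0, 1.0], [1.0, 0.0]])
posterior, p_y = filter_update(rho0, K1)
```

This is the quantum analogue of the classical forward-algorithm step: the unnormalized update accumulates the likelihood, and dividing by its trace plays the role of Bayesian renormalization.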
4. Physical and Operational Connections
QHMMs provide an operator-theoretic foundation for modeling both open quantum systems and classical stochastic processes. Open quantum systems described by Lindblad master equations (with or without feedback) naturally realize HQMMs in discrete time via Kraus operator decompositions (Clark et al., 2014). Systems with instantaneous feedback—where the bath measurement triggers a rapid unitary operation—realize sophisticated HQMMs that can generate output distributions unattainable by classical models of equal size.
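A standard first-order discretization turns a Lindblad generator into a Kraus pair of exactly this kind: a "no-jump" operator and a "jump" operator whose occurrence can be read out as the emission symbol. The decaying-qubit parameters below are assumptions for illustration.

```python
import numpy as np

def lindblad_kraus(H, L, dt):
    """First-order Kraus pair for one time step dt of the Lindblad
    equation drho/dt = -i[H, rho] + L rho L^H - {L^H L, rho}/2.

    Trace preservation holds to O(dt^2)."""
    d = H.shape[0]
    K0 = np.eye(d) - (1j * H + 0.5 * L.conj().T @ L) * dt  # no-jump branch
    K1 = np.sqrt(dt) * L                                   # jump / emission
    return K0, K1

# Hypothetical qubit with decay: H = sigma_z / 2, L = sqrt(gamma) sigma_minus.
sz = np.diag([1.0, -1.0]) / 2
sm = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
K0, K1 = lindblad_kraus(sz, np.sqrt(0.1) * sm, dt=0.01)

# Completeness defect is second order in dt.
err = np.linalg.norm(K0.conj().T @ K0 + K1.conj().T @ K1 - np.eye(2))
```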
Generalizations to circular or ergodic temporal domains are available via tensor network models such as circular locally purified states (c-LPS), which, when constrained appropriately, are exactly equivalent to cyclic HQMMs (Javidian et al., 2021). This representation enables efficient learning in settings where data exhibit periodicity or where the Markov chain should be regarded as cyclic.
Physical implementations of HQMMs have been connected to experimentally accessible systems. The split HQMM (SHQMM) generalizes the basic model by partitioning the quantum memory into conditional subspaces to better match quantum transport systems (e.g., electrons tunneling through quantum dots), with evolution described by conditional master equations (Li et al., 2023).
5. Applications and Impact
QHMMs and their generalizations impact a wide spectrum of domains:
- Quantum Information Theory: They afford new perspectives on quantum error correction, where the decodability threshold maps onto a learnability transition in QHMM inference (Kim et al., 11 Apr 2025), and also clarify resource requirements for quantum simulation of classical processes.
- Machine Learning and Inference: HQMMs have been employed for sequence modeling with superior description accuracy and reduced latent dimension, for instance in probabilistic safety assessment and failure scenario generation (Zaiou et al., 2022), and for learning interpretable patterns in fluorescence data from quantum dots (Yang et al., 2 Jan 2025). Nonparametric HSE-HQMMs outperform or match LSTM and PSRNN baselines in both synthetic and real data sequencing tasks, capturing full predictive distributions (Srinivasan et al., 2018).
- Physics and Biological Systems: The HQMM framework provides a quantum-native formalization of processes in ion channel kinetics and neurological fluctuations, with energy-modulated channels yielding autocovariances and spectral properties (such as $1/f$ noise) empirically observed in nature (Paris et al., 2015).
- Financial Engineering: QHMMs provide a quadratic reduction in hidden memory requirements for stochastic volatility models, offering tighter non-asymptotic bounds for maximum likelihood estimation and filtering compared to classical HMMs, and supporting scalable quantum-inspired algorithms (Ghysels et al., 28 Jul 2025).
6. Fundamental Limits, Coherence, and Future Directions
Theoretical results emphasize the role of quantum coherence as a resource for memory compression. Only QHMMs that utilize non-orthogonal (coherent) memory states can achieve the strict reduction in memory requirements signaled by the spectral invariant bounds (Zonnios et al., 17 Dec 2024). The resource theory of coherence supplies the precise operational meaning for this advantage: strictly incoherent (SIO) (i.e., “classical”) implementations cannot surpass the classical memory lower bound.
Future research directions include systematic exploration of the trade-off between memory compression and thermal efficiency in quantum stochastic simulation, extension of QHMMs to input–output and continuous-time models, experimental realization of quantum memory advantages, and integration of QHMM representations in learning architectures for quantum machine learning, control, and thermodynamics (Elliott, 2021, Markov et al., 2022, Li et al., 2023).
7. Connections to Classical Theory and Decision Problems
Quantum Markov models admit direct generalizations of classical Markov decision processes and hidden Markov models, with new decidability properties emerging due to quantization. While certain planning or goal-reachability problems become undecidable in the quantum setting, approximate optimal policy computation with discounted reward remains efficiently computable, illustrating a rare point of tractability for quantum planning (Tamon et al., 2019). The formal equivalence of quantum Moore and Mealy machines further provides structural flexibility for analysis and design of quantum-controlled processes.
In summary, Quantum Hidden Markov Models establish a rigorous and flexible bridge connecting quantum information, stochastic modeling, operator theory, and statistical inference, bringing quantum resources to bear on modeling, simulating, and learning complex processes with significant theoretical and practical advantages over classical paradigms.