Nonparametric Hilbert-Space Embedding HQMMs
- Nonparametric Hilbert-space-embedding HQMMs are sequence models that combine quantum graphical models with kernel-based embeddings to capture complex stochastic dynamics.
- They employ kernel formulations of the sum and Bayes rules, using spectral learning to achieve efficient moment matching without explicit likelihood maximization.
- Empirical evaluations show that HSE-HQMMs match or outperform LSTMs and predictive-state models on diverse datasets while maintaining full nonparametric density evolution.
Nonparametric Hilbert-space-embedding Hidden Quantum Markov Models (HSE-HQMMs) are a class of sequence models that unite quantum graphical models (QGMs) and Hilbert space embeddings (HSEs) to provide a nonparametric framework for modeling stochastic dynamics. Unlike classical HMMs, which utilize probability vectors to encode state uncertainty, HSE-HQMMs encode state as probability distributions over embedded density matrices in a reproducing kernel Hilbert space (RKHS). This construction enables the learning and filtering of complex, multimodal distributions over continuous features using kernel-based Bayesian inference, realized through sum and Bayes rules interpreted as kernel operations. HSE-HQMMs implement these updates using spectral learning algorithms, bypassing the need for explicit likelihood maximization or constrained optimization of large unitary operators, as in fully parametric HQMMs. Empirical evaluation demonstrates that HSE-HQMMs are competitive with state-of-the-art recurrent neural architectures and kernel predictive-state models on challenging real-world datasets, while maintaining full nonparametric densities across time (Srinivasan et al., 2018).
1. Quantum Graphical Models and Hilbert Space Embeddings
Quantum Graphical Models (QGMs) generalize classical graphical models by representing uncertainty using density operators $\hat{\rho}$, which are positive semidefinite, trace-one matrices acting on a complex Hilbert space $\mathcal{H}$. Classical HMMs encode the hidden state by a probability vector $\vec{x}$, updated via linear transitions and emission likelihoods. In QGMs, transitions and observations are implemented by unitary maps and partial-trace or projection operations on appropriate tensor-product spaces.
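To make the unitary and partial-trace mechanics concrete, the following NumPy sketch evolves a product state with a joint unitary and then traces out the first subsystem. The helper names (`random_unitary`, `partial_trace_first`, `evolve_and_marginalize`), the two-level systems, and the randomly drawn unitary are assumptions made for illustration; they are not taken from Srinivasan et al. (2018).

```python
import numpy as np

def random_unitary(dim, rng):
    """Draw a unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def partial_trace_first(rho, dim_a, dim_b):
    """Trace out subsystem A from a density matrix on the product space A (x) B."""
    rho = rho.reshape(dim_a, dim_b, dim_a, dim_b)
    return np.trace(rho, axis1=0, axis2=2)

def evolve_and_marginalize(rho_a, rho_b, unitary):
    """Apply a joint unitary to rho_a (x) rho_b, then keep only subsystem B."""
    joint = np.kron(rho_a, rho_b)
    evolved = unitary @ joint @ unitary.conj().T
    return partial_trace_first(evolved, rho_a.shape[0], rho_b.shape[0])

rng = np.random.default_rng(0)
rho_a = np.diag([0.7, 0.3]).astype(complex)   # mixed state with trace one
rho_b = np.diag([1.0, 0.0]).astype(complex)   # pure ancilla state
rho_out = evolve_and_marginalize(rho_a, rho_b, random_unitary(4, rng))
print(np.trace(rho_out).real)                 # trace is preserved up to rounding
```

The output is again Hermitian, positive semidefinite, and trace-one, which is what lets a density matrix play the role that a probability vector plays in a classical HMM.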
Hilbert space embeddings (HSEs) further generalize Bayesian inference by mapping probability distributions into an RKHS via feature maps $\phi(\cdot)$. For HSE-HQMMs, the feature map is specifically chosen so that $\phi(x)$ yields the vectorization of a rank-1 density matrix. The expected state $\mu_X = \mathbb{E}[\phi(X)]$ is hence also a valid (vectorized) density matrix. Relationships between random variables, such as joint distributions, become cross-covariance operators $C_{XY} = \mathbb{E}[\phi(X) \otimes \phi(Y)]$.
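The claim that an average of vectorized rank-1 density matrices is itself a valid density matrix can be checked numerically. The sketch below uses a hypothetical feature map, `density_feature`, that normalizes a real vector and vectorizes its outer product; the feature map used in practice may differ (for example, it may first pass the input through random features), so this only illustrates the property.

```python
import numpy as np

def density_feature(x):
    """Hypothetical feature map: x -> vec(v v^T) with v = x / ||x||."""
    v = np.asarray(x, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v).ravel()

def mean_embedding(samples):
    """Empirical mean of the feature map over a set of samples."""
    return np.mean([density_feature(x) for x in samples], axis=0)

rng = np.random.default_rng(1)
samples = rng.normal(size=(50, 3))
mu = mean_embedding(samples).reshape(3, 3)      # un-vectorize the mean embedding

print(np.trace(mu))                             # close to 1
print(np.linalg.eigvalsh(mu).min() >= -1e-12)   # positive semidefinite
print(np.linalg.matrix_rank(mu))                # mixed state: rank above 1
```

Each sample contributes a trace-one, rank-1, positive semidefinite matrix, and a convex combination of such matrices keeps the trace and positivity while generally increasing the rank.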
2. Kernel Formulation of Sum Rule and Bayes Rule
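The sum rule in QGMs propagates state across time steps via transition operations. In the Hilbert space embedding framework, this translates to the kernel sum rule. If the quantum sum rule is $\hat{\rho}_Y = \mathrm{tr}_X\!\left(\hat{U}\,(\hat{\rho}_X \otimes \hat{\rho}_Y)\,\hat{U}^\dagger\right)$, then, upon embedding, $C_{Y|X}$ is the regression operator estimated by
$$C_{Y|X} = C_{YX}\,C_{XX}^{-1}$$
with updates $\mu_Y = C_{Y|X}\,\mu_X$.
Empirically, with sample-based Gram matrices $G_{XX} = \Phi^\dagger \Phi$, the kernel sum rule becomes
$$\mu_Y = \Upsilon\,(G_{XX} + \lambda I)^{-1}\,\Phi^\dagger\,\mu_X$$
where $\Phi = (\phi(x_1), \dots, \phi(x_n))$ and $\Upsilon = (\phi(y_1), \dots, \phi(y_n))$ are matrices of feature embeddings for samples $x_i$ and $y_i$, respectively, and $\lambda > 0$ is a regularization parameter.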
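A minimal numerical sketch of the Gram-matrix form of the kernel sum rule follows. It computes weights $\beta = (G_{XX} + \lambda I)^{-1}\Phi^\top\mu_X$ so that the propagated embedding is a weighted sum of the embedded $y_i$ samples. The Gaussian RBF kernel, the bandwidth, the regularizer value, and the toy relation $y = 2x$ are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_gram(a, b, bandwidth):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq_dist = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-sq_dist / (2.0 * bandwidth**2))

def kernel_sum_rule_weights(x_train, prior_points, prior_weights,
                            bandwidth=0.5, reg=1e-3):
    """Weights beta with mu_Y ~= sum_i beta_i phi(y_i), for a prior embedding
    mu_X = sum_j prior_weights[j] * phi(prior_points[j])."""
    g_xx = rbf_gram(x_train, x_train, bandwidth)
    # Phi^T mu_X reduces to kernel evaluations between training and prior points.
    phi_t_mu = rbf_gram(x_train, prior_points, bandwidth) @ prior_weights
    return np.linalg.solve(g_xx + reg * np.eye(len(x_train)), phi_t_mu)

x_train = np.linspace(-2.0, 2.0, 100)[:, None]
y_train = 2.0 * x_train                          # paired samples y_i = 2 x_i
beta = kernel_sum_rule_weights(x_train, np.array([[0.5]]), np.array([1.0]))
y_mean = (beta @ y_train).item()                 # estimate of E[Y] after propagation
print(y_mean)                                    # near 2 * 0.5 = 1
```

With a point-mass prior at $x = 0.5$, the weights coincide with kernel ridge regression weights, so the implied mean of $Y$ lands close to 1.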
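The Bayes rule is realized within this framework as Nadaraya–Watson kernel regression. Given embeddings $\Phi = (\phi(x_1), \dots, \phi(x_n))$ of training pairs $(x_i, y_i)$, the posterior update for hidden states conditioned on evidence $y$ is
$$\mu_{X|y} = \frac{\Phi\,K_{:y}}{\mathbf{1}^\top K_{:y}}, \qquad K_{:y} = \big(k(y_1, y), \dots, k(y_n, y)\big)^\top$$
Alternatively, the standard (but more costly) kernel Bayes rule is
$$\mu_{X|y} = C^{\pi}_{XY}\,\big(C^{\pi}_{YY}\big)^{-1}\,\phi(y)$$
where the superscript $\pi$ marks covariance operators computed under the prior, with increased computational expense due to the large Gram matrices involved.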
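The Nadaraya–Watson update amounts to reweighting training samples by their kernel similarity to the observed evidence. The sketch below, which assumes a Gaussian kernel on scalar observations and a toy relation $y = x^2$, shows how the resulting posterior over the hidden variable can be bimodal; the function name and all numeric settings are illustrative choices.

```python
import numpy as np

def nadaraya_watson_weights(y_train, y_obs, bandwidth):
    """Normalized kernel weights w_i = k(y_i, y_obs) / sum_j k(y_j, y_obs)."""
    k = np.exp(-(y_train - y_obs) ** 2 / (2.0 * bandwidth**2))
    return k / k.sum()

x_train = np.linspace(-2.0, 2.0, 201)
y_train = x_train**2                   # two hidden values explain each observation
w = nadaraya_watson_weights(y_train, y_obs=1.0, bandwidth=0.1)

posterior_mean = w @ x_train           # pulled toward 0 by two symmetric modes
posterior_abs = w @ np.abs(x_train)    # mass actually sits near |x| = 1
print(posterior_mean, posterior_abs)
```

The posterior embedding is the same weighted sum applied to $\phi(x_i)$ instead of $x_i$. The mean near 0 together with mass near $|x| = 1$ shows why a single point estimate would misrepresent this posterior.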
3. Structure and Filtering in HSE-HQMMs
In HSE-HQMMs, hidden states evolve via kernelized analogs of quantum circuit operations. The state at each step is a density embedding, updated by sequential application of sum and Bayes rules in kernel space.
Filtering proceeds as follows:
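- The prediction step uses the sum rule twice via cross-covariance embeddings, first to propagate the state and then to form the joint embedding of state and observation:
$$\mu'_{X_t} = C_{X_t|X_{t-1}}\,\mu_{X_{t-1}}, \qquad \mu_{X_t Y_t} = C_{X_t Y_t|X_t}\,\mu'_{X_t}$$
- The update step uses the Nadaraya–Watson rule, conditioning on the observed $y_t$ and renormalizing by the trace:
$$\mu_{X_t} = \frac{\mu_{X_t Y_t}\,\phi(y_t)}{\mathrm{tr}\!\left(\mu_{X_t Y_t}\,\phi(y_t)\right)}$$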
In Gram-matrix form, prediction and correction involve sequential matrix operations with kernel matrices and normalization by trace.
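The predict-then-update recursion is easiest to verify in a discrete special case. With finite state and observation spaces and one-hot feature maps, embeddings reduce to probability vectors, the state density matrix is diagonal, and the recursion reduces to the familiar HMM forward filter. The sketch below covers only that toy case; `filter_step` and the transition and emission tables are hypothetical, and the full model operates on kernel embeddings instead.

```python
import numpy as np

def filter_step(belief, transition, emission, obs_feature):
    """One predict-then-update step for a discrete toy model.

    belief:      (k,) diagonal of the current state density matrix
    transition:  (k, k) column-stochastic, transition[i, j] = P(next=i | current=j)
    emission:    (m, k), emission[o, s] = P(obs=o | state=s)
    obs_feature: (m,) one-hot feature of the received observation
    """
    predicted = transition @ belief              # sum rule 1: propagate the state
    joint = emission * predicted[None, :]        # sum rule 2: joint over (obs, state)
    unnormalized = obs_feature @ joint           # condition on the observation
    return unnormalized / unnormalized.sum()     # renormalize (trace of diagonal state)

transition = np.array([[0.9, 0.2], [0.1, 0.8]])
emission = np.array([[0.8, 0.3], [0.2, 0.7]])
belief = np.array([0.5, 0.5])
posterior = filter_step(belief, transition, emission, np.array([1.0, 0.0]))
print(posterior)
```

Here the predicted state is (0.55, 0.45), and observing symbol 0 yields unnormalized weights (0.44, 0.135), which are then divided by their sum 0.575.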
4. Spectral Learning via Two-Stage Regression
Parameter estimation in HSE-HQMMs employs the two-stage regression (2SR) algorithm, which enables moment-matching-based estimation without likelihood maximization:
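- Given an observed sequence $y_{1:T}$, construct feature representations for "history" $h_t$, "future" $f_t$, and "shifted-future" $s_t$ via random Fourier features.
- In stage 1, regress future and extended future+observation features on history, where $o_t$ denotes the observation features at time $t$ and $\lambda$ is a ridge parameter:
$$W_{f|h} = \Big(\sum_t f_t \otimes h_t\Big)\Big(\sum_t h_t \otimes h_t + \lambda I\Big)^{-1}$$
$$W_{s,o|h} = \Big(\sum_t s_t \otimes o_t \otimes h_t\Big)\Big(\sum_t h_t \otimes h_t + \lambda I\Big)^{-1}$$
- Compute denoised predictive state representations via these regressed operators.
- In stage 2, regress the extended future+obs on the denoised future:
$$W = \Big(\sum_t W_{s,o|h}h_t \otimes W_{f|h}h_t\Big)\Big(\sum_t W_{f|h}h_t \otimes W_{f|h}h_t + \lambda I\Big)^{-1}$$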
This yields the necessary embedding operator tensor for HSE-HQMM inference and filtering (Srinivasan et al., 2018).
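The two-stage procedure can be sketched with plain ridge regression. In the code below the extended future+observation features are assumed to be flattened into a single vector per time step, and the data are synthetic, generated so that the stage-2 operator has a known target `b`. The function names and the synthetic setup are illustrative assumptions and do not reproduce the original experiments.

```python
import numpy as np

def ridge(targets, inputs, reg):
    """Ridge operator W with targets ~= W @ inputs; columns are time steps."""
    gram = inputs @ inputs.T + reg * np.eye(inputs.shape[0])
    return np.linalg.solve(gram, inputs @ targets.T).T

def two_stage_regression(history, future, extended, reg=1e-3):
    w_future = ridge(future, history, reg)        # stage 1: future on history
    w_extended = ridge(extended, history, reg)    # stage 1: extended on history
    future_hat = w_future @ history               # denoised future features
    extended_hat = w_extended @ history           # denoised extended features
    return ridge(extended_hat, future_hat, reg)   # stage 2: extended on future

# Synthetic data: noisy futures driven by history, extended features driven
# by the clean future through a known operator b.
rng = np.random.default_rng(0)
a = np.array([[1.0, 0.2, 0.0], [0.0, 1.0, 0.3], [0.1, 0.0, 1.0]])
b = np.array([[0.5, -1.0, 0.0], [0.3, 0.0, 2.0]])
history = rng.normal(size=(3, 5000))
future = a @ history + 0.1 * rng.normal(size=(3, 5000))
extended = b @ (a @ history) + 0.1 * rng.normal(size=(2, 5000))

w = two_stage_regression(history, future, extended)
print(np.abs(w - b).max())                        # small recovery error
```

Regressing both sides on history first removes noise that is uncorrelated with the past, which is why the stage-2 regression recovers `b` without any likelihood optimization.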
5. Computational Complexity and Scalability
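HSE-HQMM filtering updates require operations involving Gram matrices over the $n$ training samples, with computational complexity as follows:
- General RKHS: precomputation is dominated by inverting regularized $n \times n$ Gram matrices, which is $O(n^3)$ with dense linear algebra; per-step cost and storage also grow with $n$, since the Gram matrices alone occupy $O(n^2)$ memory.
- With random Fourier features: precomputation, per-step cost, and storage depend on the feature dimension $d$ instead of $n$, where $d \ll n$.
- Classical HMMs: with $k$ hidden states and $m$ observation symbols, store $k^2$ transition and $km$ emission parameters, with $O(k^2)$ updates per step.
- Parametric HQMMs: learning requires constrained optimization of unitaries, whose parameter count grows quadratically with the dimension of the space they act on.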
A plausible implication is that the nonparametric HSE-HQMM accommodates increasing data size flexibly by adjusting RKHS (or RFF) dimensionality, providing a practical tradeoff between computation and approximation quality.
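The tradeoff between feature dimension and approximation quality can be seen directly with random Fourier features for a Gaussian RBF kernel, following the standard construction of Rahimi and Recht. The sketch below is illustrative only; the bandwidth, sample sizes, and feature count are arbitrary assumptions.

```python
import numpy as np

def rff_features(x, num_features, bandwidth, rng):
    """Random Fourier features whose inner products approximate a Gaussian RBF kernel."""
    w = rng.normal(scale=1.0 / bandwidth, size=(x.shape[1], num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(x @ w + b)

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
z = rff_features(x, num_features=5000, bandwidth=1.0, rng=rng)

sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
exact = np.exp(-sq_dist / 2.0)              # RBF kernel with bandwidth 1
max_err = np.abs(z @ z.T - exact).max()
print(max_err)                              # shrinks as num_features grows
```

Once the data are mapped to `z`, Gram-matrix operations can be replaced by operations on matrices whose size is set by the number of features, not by the number of samples.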
6. Empirical Performance and Applications
HSE-HQMMs have been evaluated on the Penn Treebank (character-level language modeling, with perplexity metric), OpenAI Gym Swimmer (5D continuous state, 10-step MSE), and Human Mocap (22D continuous state, 10-step MSE). Baselines include LSTM recurrent neural networks and PSRNNs (kernelized Predictive-State RNNs). HSE-HQMMs demonstrate comparable or superior performance: on the Swimmer dataset, they substantially outperform the baselines; on the Human Mocap and PTB datasets, they match the best performance.
A distinctive advantage of HSE-HQMMs is their maintenance of a nonparametric density over continuous observations through time, as illustrated by the evolution of marginal densities (see Figure 1 in Srinivasan et al., 2018). This allows the model to represent multimodal uncertainty explicitly, which LSTMs and parametric HQMMs do not do in their standard form.
| Dataset | Metric | HSE-HQMM Result | Baseline Comparison |
|---|---|---|---|
| Penn Treebank | 1-step perplexity | Comparable to best | Matches LSTM, PSRNN |
| Swimmer | 10-step MSE | Substantially better | Outperforms baselines |
| Human Mocap | 10-step MSE | Matches best | Matches LSTM, PSRNN |
7. Summary and Context
HSE-HQMMs extend the descriptive power of HMMs and HQMMs to nonparametric domains, providing principled learning and inference of distributions over continuous-valued features. They are constructed via kernelized Bayesian operations in RKHS, learned efficiently via two-stage spectral regression, and support practical inference with scalable complexity using random feature approximations. HSE-HQMMs offer a unique capability to track full nonparametric probability densities throughout time-series evolution, positioning them as a compelling model for sequential phenomena with complex, multimodal structure (Srinivasan et al., 2018).