Hidden Markov Models with Circular-Linear Observations

Updated 28 May 2026

These models use regime-specific joint distributions that capture the dependence between circular variables (e.g., wind direction) and linear variables (e.g., wind speed).
Methodologies such as the General Projected Normal, Joint Projected Skew-Normal, and Riemannian product models are employed to address diverse data characteristics and dependence structures.
Bayesian inference techniques, including Gibbs sampling with latent variable augmentation and FFBS, facilitate efficient estimation of hidden states and robust model diagnostics.

Hidden Markov Models (HMMs) with circular-linear observations provide a probabilistically rigorous approach to modeling time series data where each observation comprises both a circular component (typically an angle on the unit circle, such as wind direction) and a linear component (a real-valued or count-valued quantity, such as wind speed). This modeling framework accommodates serial dependence via latent regimes and captures complex dependence structures within and between circular and linear variables, as well as between observations across time.

1. Model Specification and Families

The defining feature of HMMs for circular-linear data is their ability to encode Markov switching between heterogeneous joint distributions on the product space $\mathcal{M} = \mathbb{S}^1 \times \mathbb{R}$ or its discrete/count analogs. The latent state process $\{Z_t\}$ is a first-order (often time-homogeneous) Markov chain over $K$ regimes. Conditional on $Z_t=k$, the emission law for $(\Theta_t, X_t)$ is a regime-specific circular-linear joint distribution.

Several families exist:

General Projected Normal Emissions: Each regime $k$ induces a bivariate normal latent variable $\mathbf{U}_t \sim N_2(\boldsymbol{m}_k, \Sigma_k)$, projected onto polar coordinates $(R_t,\Theta_t)$, with $X_t$ governed by a linear regression on $(R_t\cos\Theta_t, R_t\sin\Theta_t)$ plus Gaussian noise. This structure allows for bimodal or skewed circular marginals and nontrivial circular-linear correlation (Mastrantonio et al., 2014).
Joint Projected and Skew-Normal (JPSN): The emission is a jointly specified distribution on $(\Theta_t, X_t)$ constructed from a higher-dimensional skew-normal variable. This allows for arbitrary (co-)variance amongst circular and linear components, including skewness in the linear marginal (Mastrantonio, 2015).
Riemannian Product Model: For $(\Theta_t, X_t)$ lying on $\mathbb{S}^1 \times \mathbb{R}$, classical choices use independent regime-specific von Mises (or Riemannian-Gaussian) for the angle and Gaussian for the linear variable. Here, density factorization is along the product manifold (Said et al., 2021).
Discrete/Count Models with Invariant Wrapped Poisson: For integer-valued $X_t$ and discretized $\Theta_t$, emissions may include hurdle Poisson for the linear and invariant wrapped Poisson (IWP) for the circular part, embedding informative missingness (Mastrantonio et al., 2017).

The structure of each emission law critically determines the types of marginal and joint behaviors (unimodality, multimodality, independence, asymmetry, and marginal support) permitted within each regime.
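To make the generative structure concrete, the following is a minimal simulation sketch of a $K$-state HMM with general projected normal emissions, as described in the first family above. The function name `simulate_gpn_hmm`, the argument layout, and all numerical values are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def simulate_gpn_hmm(T, trans, init, means, covs, betas, sigmas, seed=0):
    """Simulate (z_t, theta_t, x_t) from a K-state HMM with projected normal emissions.

    betas[k] = (intercept, coefficient on R cos(theta), coefficient on R sin(theta)).
    """
    rng = np.random.default_rng(seed)
    K = len(init)
    z = np.empty(T, dtype=int)
    theta = np.empty(T)
    x = np.empty(T)
    for t in range(T):
        probs = init if t == 0 else trans[z[t - 1]]
        z[t] = rng.choice(K, p=probs)
        k = z[t]
        # Latent bivariate normal U_t = (R cos(theta), R sin(theta)).
        u = rng.multivariate_normal(means[k], covs[k])
        # Projection: keep only the angle, mapped to [0, 2*pi).
        theta[t] = np.arctan2(u[1], u[0]) % (2 * np.pi)
        # Linear part: regression on U_t plus Gaussian noise.
        x[t] = betas[k][0] + betas[k][1:] @ u + sigmas[k] * rng.normal()
    return z, theta, x

# Hypothetical two-regime parameter values, for illustration only.
trans = np.array([[0.95, 0.05], [0.10, 0.90]])
init = np.array([0.5, 0.5])
means = np.array([[2.0, 0.0], [0.0, -1.5]])
covs = np.array([[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.6], [0.6, 2.0]]])
betas = np.array([[1.0, 0.5, 0.0], [3.0, 0.0, -0.8]])
sigmas = np.array([0.3, 0.5])
z, theta, x = simulate_gpn_hmm(500, trans, init, means, covs, betas, sigmas)
```

Because $X_t$ depends on the same latent vector that determines $\Theta_t$, the simulated angle and linear value are dependent within a regime whenever the regression coefficients are nonzero.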

2. Circular-Linear Joint Distributions in HMM Emissions

Analytical tractability and inferential flexibility for HMMs with circular-linear observations depend on the choice of regime-specific joint distributions.

General Projected Normal Model: The latent bivariate Gaussian $\mathbf{U}_t = (U_{t1}, U_{t2})^\top$ is transformed to the radius-angle pair $(R_t, \Theta_t)$ via $U_{t1} = R_t\cos\Theta_t$, $U_{t2} = R_t\sin\Theta_t$, with $R_t > 0$, $\Theta_t \in [0, 2\pi)$. The Jacobian yields a marginal density for $\Theta_t$ that can be symmetric or bimodal depending on $\boldsymbol{m}_k$ and $\Sigma_k$, while regression of $X_t$ on $(R_t\cos\Theta_t, R_t\sin\Theta_t)$ allows for state-dependent circular-linear correlation (Mastrantonio et al., 2014).
JPSN Model: The $(2p+q)$-dimensional Sahu-Dey-Branco skew-normal (for $p$ circular and $q$ linear variables) is partitioned into $\mathbf{U}_t \in \mathbb{R}^{2p}$ (which projects to circular components) and $\mathbf{X}_t \in \mathbb{R}^{q}$ (linear). Circular-linear covariance enters via the off-diagonal blocks of $\Sigma_k$; skewness in $\mathbf{X}_t$ is controlled by a latent half-normal auxiliary variable. The induced marginal for $\Theta_t$ is always a projected normal, for $X_t$ a skew-normal, and correlations among all trigonometric and linear terms are regime-dependent (Mastrantonio, 2015).
Conditional Independence Models: Simpler models, such as regime-specific von Mises and Gaussian, assume conditional independence of $\Theta_t$ and $X_t$ given $Z_t=k$, leading to computational tractability but failing to capture empirically observed dependence (Said et al., 2021).
Discrete Circular-Linear Models: Emissions for count/time series with discrete support may combine hurdle and wrapped distributions, accommodating features like zero-inflation and missingness dependent on the linear variable (Mastrantonio et al., 2017).

A key methodological advance is relaxation of circular-linear independence at the regime level, as this assumption hinders interpretability and fit in practical domains where the circular component often modulates the linear one (Mastrantonio et al., 2014).
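The claim that the projected normal marginal for the angle can be bimodal can be checked numerically. The sketch below integrates the polar-coordinate joint density $r\,\phi_2(r\cos\theta, r\sin\theta;\ \boldsymbol{m}, \Sigma)$ over the radius with a trapezoidal rule; the function name `projected_normal_pdf`, the truncation `r_max`, and the grid sizes are assumptions chosen for illustration.

```python
import numpy as np

def projected_normal_pdf(theta, mean, cov, r_max=30.0, n_r=2000):
    """Marginal density of the angle, integrating the radius out numerically."""
    theta = np.atleast_1d(theta)
    r = np.linspace(0.0, r_max, n_r)
    prec = np.linalg.inv(cov)
    norm_const = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    # Cartesian points r * (cos(theta), sin(theta)), shape (n_theta, n_r, 2).
    dirs = np.stack([np.cos(theta), np.sin(theta)], axis=-1)
    pts = r[None, :, None] * dirs[:, None, :] - mean
    quad = np.einsum("tri,ij,trj->tr", pts, prec, pts)
    # The factor r is the Jacobian of the polar transformation.
    joint = r[None, :] * norm_const * np.exp(-0.5 * quad)
    dr = r[1] - r[0]
    # Trapezoidal rule over the radius.
    return (joint[:, 1:] + joint[:, :-1]).sum(axis=1) * 0.5 * dr

# Zero mean with unequal variances gives two antipodal modes (at pi/2 and 3*pi/2).
grid = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
dens = projected_normal_pdf(grid, mean=np.zeros(2), cov=np.diag([1.0, 10.0]))
total = dens.sum() * (2.0 * np.pi / grid.size)  # numerical check: close to 1
```

Moving the mean away from the origin progressively suppresses one of the two modes, which is one way the same family covers both unimodal and bimodal regimes.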

3. Bayesian Inference and Latent Variable Methods

Inference for these models generally follows a Bayesian paradigm utilizing Gibbs sampling or Metropolis-within-Gibbs for latent and model parameters.

Latent Radii Augmentation: For projected normal or JPSN models, introducing the latent radii $R_t$ (and possible skewness auxiliary variables, as in JPSN) converts awkward marginal likelihoods over $\Theta_t$ into standard conditional distributions over latent Gaussian (plus half-normal) variables, facilitating closed-form conjugate updates for covariance, mean, and regression parameters (Mastrantonio et al., 2014, Mastrantonio, 2015).
Forward-Filtering/Backward-Sampling: State sequences $Z_{1:T}$ are sampled with standard FFBS given current parameters and latent variables, leveraging the explicit emission density with the latent variables imputed (Mastrantonio et al., 2014, Mastrantonio, 2015).
Conjugate Updates: Parameters such as the rows of the transition matrix $\boldsymbol{\Pi}$ and the initial state vector $\boldsymbol{\pi}_0$ are sampled from Dirichlet posteriors. Regression and covariance parameters adopt normal-inverse-Wishart or similar conjugate priors (Mastrantonio et al., 2014, Mastrantonio, 2015).
Handling Label-Switching: Label-switching, which is inherent to mixture/HMM posteriors, is addressed either by imposing identifiability constraints (e.g., ordering the regimes by circular mean or by linear mean) or by pivotal reordering as a post-processing step (Mastrantonio et al., 2014, Mastrantonio, 2015).
Missing and Censored Data: Auxiliary imputation steps integrate unobserved or interval-censored observations within the Gibbs cycle, ensuring the full-conditional structure is preserved (Mastrantonio et al., 2017).
Posterior Predictive Assessment: Out-of-sample log-predictive density, WAIC, DIC, CRPS, and posterior summaries for state occupation are used for model comparison and selection (Mastrantonio et al., 2014, Mastrantonio, 2015).
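Two of the Gibbs steps listed above, FFBS for the state sequence and the Dirichlet update of the transition matrix, can be sketched generically as follows. This is a minimal illustration assuming the per-time, per-regime log emission densities have already been evaluated with the latent variables imputed; the function names and the symmetric prior concentration `prior_conc` are hypothetical.

```python
import numpy as np

def ffbs(log_emis, trans, init, rng):
    """Draw z_{1:T} from its full conditional, given a (T, K) array of log emission densities."""
    T, K = log_emis.shape
    alpha = np.empty((T, K))
    # Forward filtering, normalised at every step for numerical stability.
    a = init * np.exp(log_emis[0] - log_emis[0].max())
    alpha[0] = a / a.sum()
    for t in range(1, T):
        a = (alpha[t - 1] @ trans) * np.exp(log_emis[t] - log_emis[t].max())
        alpha[t] = a / a.sum()
    # Backward sampling: z_T from the last filter, then z_t given z_{t+1}.
    z = np.empty(T, dtype=int)
    z[T - 1] = rng.choice(K, p=alpha[T - 1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * trans[:, z[t + 1]]
        z[t] = rng.choice(K, p=w / w.sum())
    return z

def sample_transition_matrix(z, K, rng, prior_conc=1.0):
    """Conjugate update: row k ~ Dirichlet(prior_conc + transition counts out of state k)."""
    counts = np.zeros((K, K))
    np.add.at(counts, (z[:-1], z[1:]), 1.0)
    return np.vstack([rng.dirichlet(prior_conc + counts[k]) for k in range(K)])

rng = np.random.default_rng(0)
log_emis = rng.normal(size=(100, 3))          # placeholder log densities
trans = np.full((3, 3), 1.0 / 3.0)
init = np.full(3, 1.0 / 3.0)
z = ffbs(log_emis, trans, init, rng)
trans_new = sample_transition_matrix(z, 3, rng)
```

In a full sampler these two calls would alternate with the updates of the latent radii and the emission parameters within each Gibbs sweep.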

4. Computational Methods and Software Considerations

Computational efficiency is enhanced by vectorizing sufficient-statistic calculations and parallelizing updates for auxiliary variables. Custom code in C++ (with RcppArmadillo) or Python (NumPy/Cython) is recommended due to the need for latent variable sampling, state sequence imputation, and handling of multi-modal posteriors. Existing probabilistic programming platforms may handle some components (e.g., hidden states in JAGS/BUGS), but due to discrete latent variables or nonparametric Dirichlet process priors, custom or specialized packages (e.g., ‘hdphmm’ or ‘nimble’) are often needed (Mastrantonio, 2015). Stan cannot sample discrete latent sequences directly; using it requires marginalizing the hidden states out of the likelihood (e.g., with the forward algorithm).

Initialization via $k$-means on transformed trigonometric and linear data, coupled with effective convergence diagnostics (trace-plots, $\hat{R}$, effective sample sizes), is essential for robust posterior inference. Label-switching resolution and multimodality in the posterior demand careful identifiability handling (Mastrantonio, 2015, Mastrantonio et al., 2014).
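A minimal sketch of the $k$-means initialization mentioned above, using plain NumPy and Lloyd's algorithm. The feature choice (cosine, sine, standardized linear value), the function name `kmeans_init`, and the toy data are assumptions; mapping angles to (cos, sin) is what keeps directions on either side of the $0/2\pi$ cut in the same cluster.

```python
import numpy as np

def kmeans_init(theta, x, K, n_iter=50, seed=0):
    """Initial regime labels from Lloyd's algorithm on (cos theta, sin theta, standardised x)."""
    rng = np.random.default_rng(seed)
    feats = np.column_stack([np.cos(theta), np.sin(theta), (x - x.mean()) / x.std()])
    # Start from K distinct observations chosen at random.
    centers = feats[rng.choice(len(feats), size=K, replace=False)]
    for _ in range(n_iter):
        dist = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):  # keep the old centre if a cluster empties
                centers[k] = feats[labels == k].mean(axis=0)
    return labels

# Toy data: the first group straddles the 0 / 2*pi boundary.
theta = np.array([6.2, 0.05, 0.1, 6.25, 3.1, 3.2, 3.0, 3.15])
x = np.array([1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1])
labels = kmeans_init(theta, x, K=2)
```

The resulting labels would only seed the state sequence; the sampler is still expected to move away from them, and several seeds are worth trying when the posterior is multimodal.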

5. Application Domains and Model Interpretation

Circular-linear HMMs have been applied to environmental data (wind direction and speed), animal movement (direction and velocity), and other domains where such mixed-support time series arise.

Wind Data: Time series of wind direction (circular) and wind speed (log-linear) serve as canonical testbeds for these models. State-specific circular means, concentrations, linear means, variances, and correlation coefficients provide interpretable summaries corresponding to meteorologically meaningful regimes (e.g., Maestral, Sirocco, Bora) (Mastrantonio et al., 2014, Mastrantonio et al., 2017).
Animal Movement: The JPSN model captures idiosyncratic dependence (direction, speed, skewness) in multi-animal or multi-segment data, yielding fewer occupied states and better out-of-sample predictive performance relative to independent von Mises/Gamma or von Mises/Weibull HMMs (Mastrantonio, 2015).

Interpretation of posterior regime summaries focuses on the role of circular-linear correlation and the cluster-specific behavior of trigonometric components. Skewness effects in linear variables, regime-specific occupancy, and transition persistence are frequently analyzed.
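For a given draw (or point estimate) of the state sequence, regime-level descriptive summaries can be computed directly from the data. The sketch below reports the circular mean direction, the mean resultant length (a concentration summary in $[0,1]$), and the classical sample circular-linear correlation built from the correlations of $x$ with $\cos\theta$ and $\sin\theta$. These are empirical statistics used here for illustration, not the model-based correlation parameters of the cited papers, and the function name is hypothetical.

```python
import numpy as np

def regime_summaries(theta, x, z, K):
    """Per-regime circular mean, mean resultant length, linear mean and circular-linear R^2."""
    out = {}
    for k in range(K):
        th, xk = theta[z == k], x[z == k]
        c, s = np.cos(th), np.sin(th)
        r_xc = np.corrcoef(xk, c)[0, 1]
        r_xs = np.corrcoef(xk, s)[0, 1]
        r_cs = np.corrcoef(c, s)[0, 1]
        out[k] = {
            "circ_mean": np.arctan2(s.mean(), c.mean()) % (2 * np.pi),
            "resultant_length": np.hypot(c.mean(), s.mean()),
            "linear_mean": xk.mean(),
            # Squared multiple correlation of x on (cos theta, sin theta).
            "circ_lin_r2": (r_xc**2 + r_xs**2 - 2 * r_xc * r_xs * r_cs) / (1 - r_cs**2),
        }
    return out

rng = np.random.default_rng(0)
theta = rng.vonmises(0.0, 4.0, size=300) % (2 * np.pi)
x = 1.0 + 2.0 * np.cos(theta)            # linear value fully driven by the angle
z = np.zeros(300, dtype=int)
summ = regime_summaries(theta, x, z, K=1)
```

Averaging such summaries over posterior draws of the state sequence, after label-switching has been resolved, gives the regime descriptions discussed above.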

6. Extensions and Alternative Methodologies

Several axes of generalization have been pursued in recent work:

Nonparametric HMMs: Sticky HDP-HMMs handle an unknown, potentially infinite, number of regimes via Bayesian nonparametrics and stick-breaking priors. The beam sampler enables efficient state sequence updates in the nonparametric setting (Mastrantonio et al., 2017, Mastrantonio, 2015).
Models on Riemannian Manifolds: The generalization of HMMs to product manifolds (e.g., $\mathbb{S}^1 \times \mathbb{R}$), with manifold-adapted EM algorithms and explicit Fréchet mean, exponential/logarithm maps, and wrapped densities, extends applicability to data types beyond Euclidean spaces (Said et al., 2021).
Discrete and Zero-inflated Observations: Hurdle and wrapped Poisson distributions address zero-inflation and instrument-induced missingness in count/circular data, especially relevant in finely gridded wind data and similar applications (Mastrantonio et al., 2017).
Regime-dependent Linear/Circular Dependence: Relaxing the independence assumption between circular and linear variables at the state level is critical for fit and interpretability; models with only independent emissions are of limited practical use (Mastrantonio et al., 2014).

Analyses have shown empirical superiority of joint emission approaches (e.g., JPSN, projected normal regression) in both predictive performance and parsimony (fewer distinct regimes required) compared to HMMs with independent or marginal emission laws (Mastrantonio, 2015).
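The stick-breaking construction behind the nonparametric extension can be illustrated with a truncated approximation. The sketch below draws global regime weights by stick-breaking and then one transition row with extra mass on self-transition, in the spirit of a finite (weak-limit) approximation to the sticky HDP-HMM. The truncation level `L`, the hyperparameter names `gamma`, `alpha`, `kappa`, and the function names are assumptions for illustration; this is not the beam sampler used in the cited work.

```python
import numpy as np

def stick_breaking(gamma, L, rng):
    """Truncated stick-breaking weights: beta_k = v_k * prod_{j<k} (1 - v_j), v_k ~ Beta(1, gamma)."""
    v = rng.beta(1.0, gamma, size=L)
    v[-1] = 1.0  # truncation: the last stick takes all remaining mass
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v[:-1])])
    return v * remaining

def sticky_row(beta, alpha, kappa, j, rng):
    """Row j of the transition matrix: Dirichlet(alpha * beta + kappa * e_j)."""
    conc = alpha * beta
    conc[j] += kappa  # extra self-transition mass encourages regime persistence
    return rng.dirichlet(conc)

rng = np.random.default_rng(0)
beta = stick_breaking(gamma=2.0, L=10, rng=rng)
row0 = sticky_row(beta, alpha=5.0, kappa=20.0, j=0, rng=rng)
```

Larger values of `kappa` relative to `alpha` push more probability onto staying in the current regime, which is the mechanism that discourages rapid switching between redundant states.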

7. Model Evaluation and Practical Considerations

Predictive metrics such as log-predictive density (LPD) and continuous ranked probability score (CRPS) for circular and linear components are standard for model comparison. WAIC and DIC, computed on posterior deviance draws, support model selection. Out-of-sample or leave-some-out validation protocols are routine. Posterior visualization includes marginal densities of angles (interpret unimodality/bimodality), marginal distributions of linear components (assess skewness/heavy tails), and inspection of regime-specific correlation structures for biological or physical interpretability (Mastrantonio et al., 2014, Mastrantonio, 2015).
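As an illustration of how WAIC is obtained from posterior draws, the sketch below uses the standard formula based on the log pointwise predictive density and the posterior variance of the pointwise log-likelihood. It assumes an $(S, n)$ array of pointwise log-likelihood values (one row per posterior draw) is already available; how those terms are defined for an HMM (for example, conditional on sampled states or via one-step-ahead predictive densities) is a modeling choice not fixed here.

```python
import numpy as np

def waic(log_lik):
    """Return (WAIC on the deviance scale, lppd, p_waic) from an (S, n) log-likelihood array."""
    S = log_lik.shape[0]
    # Log pointwise predictive density, computed with a log-sum-exp shift.
    m = log_lik.max(axis=0)
    lppd = (m + np.log(np.exp(log_lik - m).sum(axis=0) / S)).sum()
    # Effective number of parameters: posterior variance of each pointwise term.
    p_waic = log_lik.var(axis=0, ddof=1).sum()
    return -2.0 * (lppd - p_waic), lppd, p_waic

rng = np.random.default_rng(0)
log_lik = -1.0 + 0.1 * rng.normal(size=(1000, 50))  # placeholder draws
waic_value, lppd, p_waic = waic(log_lik)
```

Lower WAIC values indicate better estimated out-of-sample fit; comparing models requires that the pointwise terms be defined identically across them.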

Convergence, effective sample size, and label-switching resolution are primary diagnostics for robust Bayesian learning. Cross-model comparisons should align complexity of emission families with data requirements: more complex (nonparametric, joint, skewed) models generally outperform their simpler, independence-assuming, or marginal alternatives in realistic settings (Mastrantonio, 2015, Mastrantonio et al., 2017).

References:

Bayesian Hidden Markov Modelling Using Circular-Linear General Projected Normal Distribution (Mastrantonio et al., 2014)
The Joint Projected and Skew Normal (Mastrantonio, 2015)
Hidden Markov model for discrete circular-linear wind data time series (Mastrantonio et al., 2017)
Hidden Markov chains and fields with observations in Riemannian manifolds (Said et al., 2021)
