
Joint Projected Normal & Skew-Normal (JPSN)

Updated 28 May 2026
  • JPSN is a flexible multivariate distribution that unifies circular data via the projected normal and linear data via the skew-normal frameworks.
  • It employs latent variable augmentation to facilitate Bayesian inference and efficiently capture complex interdependencies between angular and linear components.
  • Its effectiveness is demonstrated in animal movement studies, where it improves state detection and predictive accuracy compared to simpler independent models.

The Joint Projected Normal and Skew-Normal (JPSN) distribution is a class of flexible multivariate models for joint circular-linear or poly-cylindrical data. It combines the projected normal (PN) distribution for circular components and the skew-normal (SN) distribution for linear components, extending multivariate modeling in applications such as animal movement studies where angular (directional) and linear (e.g., speed or length) variables co-occur. The construction leverages latent variable augmentations to facilitate Bayesian inference and Gibbs sampling, addresses non-identifiability in the projected normal component via post-processing, and naturally supports quantification of arbitrary multivariate dependence structures (Mastrantonio, 2015, Mastrantonio, 2017).

1. Constructive Definition

Let $p$ denote the number of circular variables, $\Theta = (\Theta_1, \ldots, \Theta_p)^\top \in [0,2\pi)^p$, and $q$ the number of linear variables, $Y = (Y_1, \ldots, Y_q)^\top \in \mathbb{R}^q$. The latent variable construction introduces $W_i = (W_{i1}, W_{i2})^\top \in \mathbb{R}^2$ for each angular component, with $R_i = \|W_i\| > 0$ and $\Theta_i = \text{atan2}(W_{i2}, W_{i1})$ (taken modulo $2\pi$). The linear variables incorporate a skew-normal structure via a latent half-normal random effect $d \sim \mathcal{HN}_q(0, I_q)$, so that conditional on $d$, $Y \mid d \sim N_q(\mu_y + \Lambda d, \Sigma_y)$, where $\Lambda = \operatorname{diag}(\lambda)$ for a skewness vector $\lambda \in \mathbb{R}^q$.

Stacking $W = (W_1^\top, \ldots, W_p^\top)^\top \in \mathbb{R}^{2p}$, the augmented vector $(W^\top, Y^\top)^\top$ is modeled jointly as:

$$\begin{pmatrix} W \\ Y \end{pmatrix} \Bigm| d \sim N_{2p+q}\left( \begin{pmatrix} \mu_w \\ \mu_y + \Lambda d \end{pmatrix}, \Sigma \right),$$

where

$$\Sigma = \begin{pmatrix} \Sigma_w & \Sigma_{wy} \\ \Sigma_{wy}^\top & \Sigma_y \end{pmatrix}.$$

The full (augmented) likelihood is:

$$f(\theta, r, y, d) = 2^q \, \phi_{2p+q}\left( \begin{pmatrix} w \\ y \end{pmatrix} \Bigm| \begin{pmatrix} \mu_w \\ \mu_y + \Lambda d \end{pmatrix}, \Sigma \right) \phi_q(d \mid 0, I_q) \prod_{i=1}^{p} r_i, \qquad d \in \mathbb{R}_+^q,$$

where $w = (w_1^\top, \ldots, w_p^\top)^\top$, with $w_i = r_i (\cos\theta_i, \sin\theta_i)^\top$, is determined by $(\theta, r)$. The marginal joint density $f(\theta, y)$, which integrates out the latent $(r, d)$, is intractable; hence inference proceeds using the augmented representation (Mastrantonio, 2015, Mastrantonio, 2017).
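
The latent construction can be simulated directly, which is a convenient way to check one's reading of the definition. The following is a minimal sketch, not the authors' code: the function name `sample_jpsn`, its argument layout, and the assumption that the skew-normal loading is $\operatorname{diag}(\lambda)$ are illustrative choices. It draws the half-normal effect $d$, then the joint normal for $(W, Y)$ given $d$, and finally projects each $W_i$ to an angle in $[0, 2\pi)$.

```python
import numpy as np

def sample_jpsn(n, mu_w, mu_y, Sigma, lam, rng=None):
    """Draw n pairs (theta, y) from the augmented JPSN construction.

    mu_w : length-2p mean of the stacked latent W (ordered W_11, W_12, W_21, ...)
    mu_y : length-q location of the linear block
    Sigma: (2p+q) x (2p+q) joint covariance of (W, Y) given d
    lam  : length-q skewness vector (assumed loading: diag(lam))
    """
    rng = np.random.default_rng(rng)
    mu_w = np.asarray(mu_w, dtype=float)
    mu_y = np.asarray(mu_y, dtype=float)
    lam = np.asarray(lam, dtype=float)
    two_p, q = mu_w.size, mu_y.size

    # d ~ HN_q(0, I_q): absolute value of a standard normal vector
    d = np.abs(rng.standard_normal((n, q)))

    # Conditional mean of (W, Y) given d is (mu_w, mu_y + diag(lam) d)
    mean = np.hstack([np.tile(mu_w, (n, 1)), mu_y + d * lam])

    # Add correlated Gaussian noise with covariance Sigma
    chol = np.linalg.cholesky(np.asarray(Sigma, dtype=float))
    x = mean + rng.standard_normal((n, two_p + q)) @ chol.T

    # Project each bivariate W_i onto the circle
    w = x[:, :two_p].reshape(n, two_p // 2, 2)
    theta = np.mod(np.arctan2(w[..., 1], w[..., 0]), 2 * np.pi)
    y = x[:, two_p:]
    return theta, y
```

Discarding the radii and $d$ after the draw corresponds to the marginalization described above; the simulated pairs can then be compared with empirical circular-linear data.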

2. Relation to Projected Normal and Skew-Normal Distributions

The JPSN construction unifies two well-established frameworks:

  • Projected Normal (PN): For $W \sim N_{2p}(\mu_w, \Sigma_w)$, the circular component $\Theta$ arises by projection: $\Theta_i = \text{atan2}(W_{i2}, W_{i1})$ for $i = 1, \ldots, p$. Marginalization over the radius recovers the PN density for angles (Mastrantonio, 2017).
  • Skew-Normal of Sahu, Dey, and Branco (SSN): If $d \sim \mathcal{HN}_q(0, I_q)$ and $Y \mid d \sim N_q(\mu_y + \Lambda d, \Sigma_y)$, the linear component $Y$ has a multivariate skew-normal distribution, with augmented density $f(y, d) = 2^q \, \phi_q(y \mid \mu_y + \Lambda d, \Sigma_y) \, \phi_q(d \mid 0, I_q)$ for $d \in \mathbb{R}_+^q$.

The JPSN replaces the product of separate circular and linear densities for $(W, Y)$ with a single $(2p+q)$-variate normal, coupling circular and linear components via the cross-covariance block $\Sigma_{wy}$, thus encoding full multivariate dependence (Mastrantonio, 2017).

3. Parameters, Identifiability, and Posterior Representation

The parameters consist of:

  • $\mu_w \in \mathbb{R}^{2p}$ and $\Sigma_w$ ($2p \times 2p$) for the circular latent normals;
  • $\mu_y \in \mathbb{R}^q$ and $\Sigma_y$ ($q \times q$) for the linear component;
  • $\Sigma_{wy}$ ($2p \times q$), capturing the cross-covariance between the circular and linear blocks;
  • $\lambda \in \mathbb{R}^q$, the skew-normal parameter vector for the linear part.

A structural non-identifiability arises from the PN: scaling $W_i$ by any $c_i > 0$ leaves $\Theta_i$ invariant. Identifiability is often enforced post hoc by fixing the scale of each $W_i$, typically through a unit-variance constraint on its second coordinate, $[\Sigma_w]_{2i,2i} = 1$ for each $i = 1, \ldots, p$. After sampling the unconstrained $(\mu, \Sigma)$, with $\mu = (\mu_w^\top, \mu_y^\top)^\top$, draws are mapped to an identifiable space:

$$\tilde{\mu} = C \mu, \qquad \tilde{\Sigma} = C \Sigma C, \qquad C = \operatorname{diag}(c_1 I_2, \ldots, c_p I_2, I_q), \qquad c_i = [\Sigma_w]_{2i,2i}^{-1/2}.$$

The linear block is not rescaled, so $\lambda$ is unchanged. All posterior summaries are then reported for $(\tilde{\mu}, \tilde{\Sigma}, \lambda)$ (Mastrantonio, 2017, Mastrantonio, 2015).
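
The post-processing map can be written in a few lines. The sketch below assumes the common convention that the variance of the second coordinate of each $W_i$ is fixed to one, and that the latent coordinates are ordered $(W_{11}, W_{12}, W_{21}, W_{22}, \ldots)$ ahead of the linear block; the function name `to_identifiable` is hypothetical.

```python
import numpy as np

def to_identifiable(mu, Sigma, p):
    """Rescale one unconstrained draw (mu, Sigma) to the identifiable space.

    Assumed constraint: Var(W_i2) = 1 for each circular component i.
    The first 2p coordinates belong to W; the remaining ones to Y.
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    c = np.ones(mu.size)          # linear coordinates keep scale 1
    for i in range(p):
        # One common scale per bivariate W_i, so the angle is unaffected
        c[2 * i:2 * i + 2] = 1.0 / np.sqrt(Sigma[2 * i + 1, 2 * i + 1])
    mu_tilde = c * mu                        # C mu
    Sigma_tilde = Sigma * np.outer(c, c)     # C Sigma C for diagonal C
    return mu_tilde, Sigma_tilde
```

Applied to every retained MCMC draw, the map sends any two parameter sets that differ only by positive rescalings of the $W_i$ to the same point, which is what makes posterior summaries comparable across iterations.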

4. Bayesian Inference and Augmented Gibbs Sampling

The Bayesian model employs conjugate priors: $(\mu, \Sigma)$ follows a normal-inverse-Wishart prior, and $\lambda$ is Gaussian a priori. Inference leverages the tractability of the augmented framework, enabling Gibbs updates as if for standard multivariate normal models:

  • Given the latent radii $r$ and the latent half-normal effects $d$, updating $(\mu, \Sigma)$ reduces to standard normal-inverse-Wishart conjugate forms.
  • The full conditional for $\lambda$ is Gaussian, because $\lambda$ enters as the coefficient vector in a linear regression of $y$ on $d$.
  • Each latent $d$ is sampled from a truncated normal with mean and covariance determined by the linear block of $\Sigma$ and $\lambda$.
  • Each radius $r_i$ is sampled by slice sampling or Metropolis-Hastings, exploiting the bivariate structure of $W_i$.

Posterior draws are mapped to the identifiable space. Convergence and mixing are evaluated via trace plots, effective sample size (ESS), and Geweke’s diagnostic (Mastrantonio, 2015, Mastrantonio, 2017).
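
As an illustration of the first Gibbs step, the sketch below computes the normal-inverse-Wishart posterior hyperparameters from the completed data $x_t = (w_t, y_t - \lambda \circ d_t)$, which is Gaussian with mean $\mu$ and covariance $\Sigma$ once $r$ and $d$ are given. The prior parametrization NIW$(m_0, \kappa_0, \nu_0, S_0)$, the function name `niw_posterior`, and the diagonal loading $\operatorname{diag}(\lambda)$ are assumptions made for this example; drawing $(\mu, \Sigma)$ from the resulting distribution is left out.

```python
import numpy as np

def niw_posterior(theta, r, y, d, lam, m0, k0, nu0, S0):
    """Posterior NIW hyperparameters for (mu, Sigma) given augmented data.

    theta, r : (n, p) angles and latent radii
    y, d     : (n, q) linear observations and latent half-normal effects
    lam      : length-q skewness vector (assumed loading: diag(lam))
    """
    theta, r = np.asarray(theta, float), np.asarray(r, float)
    y, d = np.asarray(y, float), np.asarray(d, float)
    n, p = theta.shape

    # Rebuild w_i = r_i (cos theta_i, sin theta_i), ordered W_11, W_12, W_21, ...
    w = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1).reshape(n, 2 * p)
    # Remove the skewing term so that x_t ~ N(mu, Sigma)
    x = np.hstack([w, y - d * np.asarray(lam, float)])

    xbar = x.mean(axis=0)
    resid = x - xbar
    diff = xbar - np.asarray(m0, float)

    kn = k0 + n
    nun = nu0 + n
    mn = (k0 * np.asarray(m0, float) + n * xbar) / kn
    Sn = np.asarray(S0, float) + resid.T @ resid + (k0 * n / kn) * np.outer(diff, diff)
    return mn, kn, nun, Sn
```

In a full sampler this step would alternate with the updates for $\lambda$, $d$, and the radii listed above, each conditioning on the most recent values of the others.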

5. Dependence, Closure, and Interpretability

The JPSN enjoys closure under marginalization: any subset of $(\Theta, Y)$ is again JPSN-distributed, with the corresponding parameter sub-blocks. This property is inherited from the stability of both the projected normal and skew-normal distributions under subsetting.

Multivariate dependence can be quantified as follows:

  • Circular–circular correlation (Jammalamadaka–Sarma):

$$\rho_c(\Theta_i, \Theta_j) = \frac{E\left[\sin(\Theta_i - \Theta_i') \sin(\Theta_j - \Theta_j')\right]}{\sqrt{E\left[\sin^2(\Theta_i - \Theta_i')\right] E\left[\sin^2(\Theta_j - \Theta_j')\right]}}$$

where $(\Theta_i', \Theta_j')$ is an i.i.d. copy of $(\Theta_i, \Theta_j)$.

  • Circular–linear dependence (Mardia’s measure):

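$$\rho^2(\Theta_i, Y_j) = \frac{r_{yc}^2 + r_{ys}^2 - 2\, r_{yc}\, r_{ys}\, r_{cs}}{1 - r_{cs}^2}$$

where $r_{yc} = \operatorname{Corr}(Y_j, \cos\Theta_i)$, $r_{ys} = \operatorname{Corr}(Y_j, \sin\Theta_i)$, and $r_{cs} = \operatorname{Corr}(\cos\Theta_i, \sin\Theta_i)$.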

  • Linear–linear dependence is characterized by Pearson correlation.

This allows comprehensive interpretability of joint circular-linear structures (Mastrantonio, 2017).
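
These measures have no closed form under the JPSN, so in practice they would be approximated from simulated or posterior predictive draws. The sketch below shows one possible Monte Carlo estimate; it assumes the draws are independent, uses the two halves of the sample as the i.i.d. copies in a sine-difference form of the circular–circular coefficient, and takes Mardia's coefficient in its usual form built from Pearson correlations of $Y$ with $\cos\Theta$ and $\sin\Theta$. Both function names are hypothetical.

```python
import numpy as np

def circ_circ_corr(a, b):
    """Circular-circular correlation from paired angle draws (a_t, b_t).

    The second half of the sample plays the role of the independent copy,
    so the draws are assumed to be independent of one another.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    h = len(a) // 2
    da = np.sin(a[:h] - a[h:2 * h])
    db = np.sin(b[:h] - b[h:2 * h])
    return np.mean(da * db) / np.sqrt(np.mean(da ** 2) * np.mean(db ** 2))

def mardia_circ_lin(theta, y):
    """Mardia's squared circular-linear correlation, a value in [0, 1]."""
    theta, y = np.asarray(theta, float), np.asarray(y, float)
    c, s = np.cos(theta), np.sin(theta)
    r_yc = np.corrcoef(y, c)[0, 1]
    r_ys = np.corrcoef(y, s)[0, 1]
    r_cs = np.corrcoef(c, s)[0, 1]
    return (r_yc ** 2 + r_ys ** 2 - 2 * r_yc * r_ys * r_cs) / (1 - r_cs ** 2)
```

Evaluating these functions on draws generated at each retained posterior sample gives a posterior distribution for every pairwise dependence measure, alongside the Pearson correlations for the linear pairs.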

6. Extensions to Time Series: Hidden Markov Models

Time-dependent extensions model sequences $\{(\Theta_t, Y_t)\}_{t=1}^{T}$ as emissions of a $K$-state hidden Markov model (HMM), where each state $k$ admits a state-specific JPSN$(\mu_k, \Sigma_k, \lambda_k)$ emission law. The latent discrete state sequence $\{z_t\}_{t=1}^{T}$ evolves by a Markov chain with transition matrix $\Pi$ (finite $K$) or as a stick-breaking process (infinite-state sticky HDP-HMM; sHDP-HMM).

Inference employs beam sampling (slice sampling for the state sequence), combined with the JPSN Gibbs steps for the within-state parameters. This enables state-dependent modeling of heterogeneous and dependent circular and linear behavior in, e.g., animal movement data, and supports automatic (Bayesian nonparametric) inference on the effective number of states (Mastrantonio, 2015).

7. Applications and Empirical Performance

Applications include modeling the movement of free-ranging Maremma Sheepdogs (six animals, with turning angles as the circular variables and log-step lengths as the linear variables) with the sHDP–JPSN–HMM. The posterior mode for the number of behavioral states was three, corresponding to interpretable behaviors: (1) low-speed, tortuous motion (rest or livestock attending), (2) intermediate speed with moderate alignment, and (3) high-speed, nearly straight boundary patrol. The JPSN–HMM reveals intra- and inter-individual dependence in both turning angle and step length, whereas simpler HMMs with independent marginals tend to over-segment the behavior and obscure social structure.

Out-of-sample predictive scores (Continuous Ranked Probability Score, CRPS; lower is better) showed that the JPSN–HMM outperformed alternative models using independent von Mises or wrapped Cauchy distributions (for angles) and Gamma or Weibull distributions (for lengths): both its mean circular CRPS and its mean linear CRPS were lower than the corresponding ranges obtained by the alternatives (Mastrantonio, 2015).
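
For readers unfamiliar with the score, a sample-based CRPS can be computed from predictive draws using the energy form $E|X - y| - \tfrac{1}{2} E|X - X'|$, and a circular analogue is obtained by replacing the absolute difference with the angular distance. The sketch below is a generic estimator under that assumption, not the evaluation code used in the cited studies; the function names are illustrative.

```python
import numpy as np

def crps_linear(samples, y):
    """Sample-based CRPS: E|X - y| - 0.5 * E|X - X'| over predictive draws."""
    s = np.asarray(samples, dtype=float)
    return np.mean(np.abs(s - y)) - 0.5 * np.mean(np.abs(s[:, None] - s[None, :]))

def angular_distance(a, b):
    """Shortest arc length between angles a and b, a value in [0, pi]."""
    diff = np.abs(np.asarray(a, dtype=float) - b) % (2 * np.pi)
    return np.minimum(diff, 2 * np.pi - diff)

def crps_circular(samples, theta):
    """Circular CRPS: same energy form with the angular distance."""
    s = np.asarray(samples, dtype=float)
    return (np.mean(angular_distance(s, theta))
            - 0.5 * np.mean(angular_distance(s[:, None], s[None, :])))
```

Averaging these scores over held-out observations gives the mean circular and mean linear CRPS used to rank competing models.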

A comparable analysis on zebra movement data confirmed improved recovery of empirical dependence and predictive accuracy, with the JPSN attaining the lowest CRPS across held-out circular and linear segments (Mastrantonio, 2017).

8. Computational and Diagnostic Considerations

The principal computational bottleneck is the per-iteration cost of the normal-inverse-Wishart updates in HMM extensions, which is driven by the size of the joint covariance matrix. Metropolis steps for the latent radii $r_i$ require tuning. The model's structure allows for block-wise sparse or diagonal approximations in high dimension. Mixing and convergence are assessed via trace plots and ESS; identifiability is reliably enforced via post-processing (Mastrantonio, 2015, Mastrantonio, 2017).

