Joint Projected Normal & Skew-Normal (JPSN)
- JPSN is a flexible multivariate distribution that models circular variables through the projected normal and linear variables through the skew-normal within a single joint framework.
- It employs latent variable augmentation to facilitate Bayesian inference and efficiently capture complex interdependencies between angular and linear components.
- Its effectiveness is demonstrated in animal movement studies, where it improves state detection and predictive accuracy compared to simpler independent models.
The Joint Projected Normal and Skew-Normal (JPSN) distribution is a class of flexible multivariate models for joint circular-linear or poly-cylindrical data. It combines the projected normal (PN) distribution for circular components and the skew-normal (SN) distribution for linear components, extending multivariate modeling in applications such as animal movement studies where angular (directional) and linear (e.g., speed or length) variables co-occur. The construction leverages latent variable augmentations to facilitate Bayesian inference and Gibbs sampling, addresses non-identifiability in the projected normal component via post-processing, and naturally supports quantification of arbitrary multivariate dependence structures (Mastrantonio, 2015, Mastrantonio, 2017).
1. Constructive Definition
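Let $p$ denote the number of circular variables, $\Theta = (\Theta_1, \dots, \Theta_p)'$, and $q$ the number of linear variables, $Y = (Y_1, \dots, Y_q)'$. The latent variable construction introduces a bivariate vector $W_i = (W_{i1}, W_{i2})'$ for each angular component, with $\Theta_i = \operatorname{atan2}(W_{i2}, W_{i1})$ and $R_i = \lVert W_i \rVert$. The linear variables incorporate a skew-normal structure via a latent random effect $D \sim HN_q(0, I_q)$ (a $q$-variate half-normal), so that conditional on $D = d$, $Y \mid d \sim N_q(\mu_y + \Lambda d, \Sigma_y)$, where $\Lambda = \operatorname{diag}(\lambda)$.
Stacking $W = (W_1', \dots, W_p')'$, the augmented vector $(W', Y')'$ is modeled jointly as:
$$(W', Y')' \mid d \sim N_{2p+q}\big(\mu + (0_{2p}', (\Lambda d)')',\ \Sigma\big)$$
where
$$\mu = (\mu_w', \mu_y')', \qquad \Sigma = \begin{pmatrix} \Sigma_w & \Sigma_{wy} \\ \Sigma_{wy}' & \Sigma_y \end{pmatrix}.$$
The full (augmented) likelihood is:
$$f(\theta, r, y, d) = 2^q\, \phi_{2p+q}\big((w', y')' \mid \mu + (0_{2p}', (\Lambda d)')', \Sigma\big)\, \phi_q(d \mid 0, I_q) \prod_{i=1}^{p} r_i, \qquad d \in \mathbb{R}_+^q,$$
where $w$ is determined by $(\theta, r)$ through $w_i = r_i(\cos\theta_i, \sin\theta_i)'$. The marginal joint density $f(\theta, y)$, which integrates out the latent $(r, d)$, is intractable; hence inference proceeds using the augmented representation (Mastrantonio, 2015, Mastrantonio, 2017).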
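The latent construction above doubles as a simulation recipe. The following is a minimal sketch, assuming the half-normal plus Gaussian representation described in this section; the function name `sample_jpsn` and its argument layout (circular block first, linear block last) are illustrative choices, not taken from the cited papers.

```python
import numpy as np

def sample_jpsn(n, mu, Sigma, lam, p, rng=None):
    """Draw n samples via the latent construction (illustrative sketch).

    mu: mean vector of length 2p+q, Sigma: (2p+q, 2p+q) covariance,
    lam: skew vector of length q, p: number of circular variables.
    Returns angles theta (n, p), radii r (n, p), linear y (n, q), latent d (n, q).
    """
    rng = np.random.default_rng(rng)
    mu = np.asarray(mu, dtype=float)
    lam = np.asarray(lam, dtype=float)
    q = lam.shape[0]
    # Latent skewing variables: absolute values of standard normals (half-normal).
    d = np.abs(rng.standard_normal((n, q)))
    # Zero-mean Gaussian noise with the full (2p+q)-dimensional covariance.
    eps = rng.multivariate_normal(np.zeros(2 * p + q), Sigma, size=n)
    # Only the linear block is shifted by diag(lam) d.
    shift = np.hstack([np.zeros((n, 2 * p)), d * lam])
    x = mu + shift + eps
    w = x[:, : 2 * p].reshape(n, p, 2)
    y = x[:, 2 * p:]
    # Project each bivariate W_i to an angle in [0, 2*pi) and keep its radius.
    theta = np.mod(np.arctan2(w[:, :, 1], w[:, :, 0]), 2 * np.pi)
    r = np.linalg.norm(w, axis=2)
    return theta, r, y, d

if __name__ == "__main__":
    Sigma = 0.7 * np.eye(5) + 0.3  # positive-definite toy covariance, p=2, q=1
    theta, r, y, d = sample_jpsn(5, [1.0, 0.0, 0.0, 1.0, 0.5], Sigma, [2.0], p=2, rng=0)
    print(theta.shape, y.shape)
```

Discarding `r` and `d` from the output gives draws of the observable pair, which is the sampling counterpart of integrating the latent variables out of the augmented density.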
2. Relation to Projected Normal and Skew-Normal Distributions
The JPSN construction unifies two well-established frameworks:
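- Projected Normal (PN): For $W_i \sim N_2(\mu_{w_i}, \Sigma_{w_i})$, the circular component $\Theta_i$ arises by projection: $W_i = R_i(\cos\Theta_i, \sin\Theta_i)'$ for $R_i = \lVert W_i \rVert > 0$. Marginalization over the radius recovers the PN density for angles (Mastrantonio, 2017).
- Sahu-Sahu Skew-Normal (SSN): If $D \sim HN_q(0, I_q)$ and $Y \mid D = d \sim N_q(\mu_y + \operatorname{diag}(\lambda) d, \Sigma_y)$, the linear component $Y$ has a multivariate skew-normal distribution, with augmented density $f(y, d) = 2^q\, \phi_q(y \mid \mu_y + \operatorname{diag}(\lambda) d, \Sigma_y)\, \phi_q(d \mid 0, I_q)$ for $d \in \mathbb{R}_+^q$.
The JPSN replaces the product of separate circular and linear densities for $(W, Y)$ with a single $(2p+q)$-variate normal, coupling circular and linear components via the cross-covariance block $\Sigma_{wy}$, thus encoding full multivariate dependence (Mastrantonio, 2017).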
3. Parameters, Identifiability, and Posterior Representation
The parameters consist of:
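- $\mu_w \in \mathbb{R}^{2p}$ and $\Sigma_w$ ($2p \times 2p$) for the circular latent normals;
- $\mu_y \in \mathbb{R}^{q}$ and $\Sigma_y$ ($q \times q$) for the linear component;
- $\Sigma_{wy}$ ($2p \times q$), capturing the cross-block covariance;
- $\lambda \in \mathbb{R}^{q}$, the skew-normal parameter vector for the linear part.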
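A structural non-identifiability arises from the PN: scaling $W_i$ by any $c_i > 0$ leaves $\Theta_i$ invariant. Identifiability is often enforced post hoc by fixing the scale of each $W_i$, for example by setting $\operatorname{Var}(W_{i2}) = 1$ for each $i = 1, \dots, p$, so after sampling unconstrained $(\mu, \Sigma)$, draws are mapped to an identifiable space:
$$\tilde{\mu} = C\mu, \qquad \tilde{\Sigma} = C\,\Sigma\,C$$
$$C = \operatorname{diag}(c_1, c_1, \dots, c_p, c_p, 1_q'), \qquad c_i = \operatorname{Var}(W_{i2})^{-1/2}$$
All posterior summaries are then reported for $(\tilde{\mu}, \tilde{\Sigma}, \lambda)$ (Mastrantonio, 2017, Mastrantonio, 2015).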
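The post-processing step can be sketched in a few lines. This is a minimal illustration, assuming the constraint fixes the variance of the second coordinate of each latent pair at one (the choice of coordinate is an assumption here) and that parameters are stored with the $2p$ circular coordinates first; `to_identifiable` is a hypothetical helper name.

```python
import numpy as np

def to_identifiable(mu, Sigma, p):
    """Rescale one unconstrained draw (mu, Sigma) to the constrained space.

    Assumes coordinates 2i and 2i+1 hold the latent pair W_i and that the
    constraint is unit variance for the second coordinate of each pair.
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    c = np.ones(mu.shape[0])  # linear coordinates keep scale 1
    for i in range(p):
        sd = np.sqrt(Sigma[2 * i + 1, 2 * i + 1])
        c[2 * i: 2 * i + 2] = 1.0 / sd  # same factor for both coordinates of W_i
    mu_tilde = c * mu                       # equals C @ mu with C = diag(c)
    Sigma_tilde = Sigma * np.outer(c, c)    # equals C @ Sigma @ C
    return mu_tilde, Sigma_tilde
```

Because both coordinates of a pair share one positive factor, the implied angle of each latent mean is unchanged, and the skew vector and the linear block are left untouched.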
4. Bayesian Inference and Augmented Gibbs Sampling
The Bayesian model employs conjugate priors: $(\mu, \Sigma)$ follows a normal-inverse-Wishart prior, and $\lambda$ is Gaussian a priori. Inference leverages the tractability of the augmented framework, enabling Gibbs updates as if for standard multivariate normal models:
- Given the latent radii $r$ and the latent skewing variables $d$, updating $(\mu, \Sigma)$ reduces to standard normal-inverse-Wishart conjugate forms.
- The full conditional for $\lambda$ is Gaussian, because $\lambda$ enters as the coefficient vector in a linear regression of $y$ on $d$.
- Each latent vector $d$ is sampled from a normal truncated to the positive orthant, with mean and covariance determined by the linear block of $\Sigma$ and $\lambda$.
- The radii $r = (r_1, \dots, r_p)$ are sampled by slice sampling or Metropolis-Hastings, exploiting their $p$-dimensional structure.
Posterior draws are mapped to the identifiable space. Convergence and mixing are evaluated via trace plots, effective sample size (ESS), and Geweke’s diagnostic (Mastrantonio, 2015, Mastrantonio, 2017).
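As an illustration of the first Gibbs step, the sketch below performs one normal-inverse-Wishart update for a generic Gaussian sample. In the JPSN setting the rows of `x` would be the completed observations, that is, the latent Cartesian coordinates stacked with the linear variables after subtracting the skew term; the hyperparameter names (`mu0`, `kappa0`, `nu0`, `Psi0`) and the Bartlett-based inverse-Wishart sampler are assumptions of this sketch, not details from the cited papers.

```python
import numpy as np

def sample_inv_wishart(nu, Psi, rng):
    """Inverse-Wishart(nu, Psi) draw via the Bartlett decomposition."""
    m = Psi.shape[0]
    L = np.linalg.cholesky(np.linalg.inv(Psi))       # Wishart scale is Psi^{-1}
    A = np.tril(rng.standard_normal((m, m)), k=-1)   # strictly lower normals
    A[np.diag_indices(m)] = np.sqrt(rng.chisquare(nu - np.arange(m)))
    LA = L @ A
    Sigma = np.linalg.inv(LA @ LA.T)                 # invert the Wishart draw
    return 0.5 * (Sigma + Sigma.T)                   # enforce exact symmetry

def niw_update_and_draw(x, mu0, kappa0, nu0, Psi0, rng=None):
    """One conjugate update and draw of (mu, Sigma) given Gaussian rows x."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    n = x.shape[0]
    xbar = x.mean(axis=0)
    resid = x - xbar
    S = resid.T @ resid                              # scatter about the sample mean
    kappa_n, nu_n = kappa0 + n, nu0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    diff = (xbar - mu0)[:, None]
    Psi_n = Psi0 + S + (kappa0 * n / kappa_n) * (diff @ diff.T)
    Sigma = sample_inv_wishart(nu_n, Psi_n, rng)
    mu = rng.multivariate_normal(mu_n, Sigma / kappa_n)
    return mu, Sigma, (mu_n, kappa_n, nu_n, Psi_n)
```

The returned tuple of posterior hyperparameters is included only to make the update easy to inspect; a sampler would normally keep just the draws.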
5. Dependence, Closure, and Interpretability
The JPSN enjoys closure under marginalization: any subset of $(\Theta, Y)$ is again JPSN-distributed for corresponding parameter sub-blocks. This property is inherited from the stability of both the projected normal and skew-normal distributions under subsetting.
Multivariate dependence can be quantified as follows:
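- Circular–circular correlation (Jammalamadaka–Sarma):
$$\rho_{cc}(\Theta_i, \Theta_j) = \frac{E\big[\sin(\Theta_i - \Theta_i^*)\,\sin(\Theta_j - \Theta_j^*)\big]}{\sqrt{E\big[\sin^2(\Theta_i - \Theta_i^*)\big]\, E\big[\sin^2(\Theta_j - \Theta_j^*)\big]}}$$
where $(\Theta_i^*, \Theta_j^*)$ is an i.i.d. copy of $(\Theta_i, \Theta_j)$.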
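- Circular–linear dependence (Mardia’s measure):
$$R^2(\Theta_i, Y_j) = \frac{r_{yc}^2 + r_{ys}^2 - 2\, r_{yc}\, r_{ys}\, r_{cs}}{1 - r_{cs}^2}, \qquad r_{yc} = \operatorname{Corr}(Y_j, \cos\Theta_i),\ r_{ys} = \operatorname{Corr}(Y_j, \sin\Theta_i),\ r_{cs} = \operatorname{Corr}(\cos\Theta_i, \sin\Theta_i)$$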
- Linear–linear dependence is characterized by Pearson correlation.
This allows comprehensive interpretability of joint circular-linear structures (Mastrantonio, 2017).
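These measures are straightforward to estimate from posterior predictive or observed samples. The sketch below is illustrative only: it uses a pairwise-difference estimator for the circular–circular coefficient, which matches the i.i.d.-copy description in the text, and the usual correlation-based form of Mardia's circular–linear measure. The function names are hypothetical.

```python
import numpy as np

def circ_circ_corr(a, b):
    """Pairwise-difference estimate of circular-circular correlation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Differences between all pairs of observations stand in for the i.i.d. copy.
    da = np.sin(a[:, None] - a[None, :])
    db = np.sin(b[:, None] - b[None, :])
    return np.sum(da * db) / np.sqrt(np.sum(da ** 2) * np.sum(db ** 2))

def circ_lin_r2(theta, y):
    """Mardia-type squared circular-linear correlation, in [0, 1]."""
    c, s = np.cos(theta), np.sin(theta)
    r_yc = np.corrcoef(y, c)[0, 1]
    r_ys = np.corrcoef(y, s)[0, 1]
    r_cs = np.corrcoef(c, s)[0, 1]
    return (r_yc ** 2 + r_ys ** 2 - 2 * r_yc * r_ys * r_cs) / (1 - r_cs ** 2)
```

The pairwise estimator needs memory quadratic in the sample size, so for long chains it is reasonable to thin the samples before calling it.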
6. Extensions to Time Series: Hidden Markov Models
Time-dependent extensions model sequences $\{(\Theta_t, Y_t)\}_{t=1}^{T}$ as emissions of a $K$-state hidden Markov model (HMM), where each state $k$ admits a state-specific JPSN$(\mu_k, \Sigma_k, \lambda_k)$ emission law. The latent discrete state sequence $\{z_t\}_{t=1}^{T}$ evolves by a Markov chain with transition matrix $\Pi$ (finite $K$) or as a stick-breaking process (infinite-state sticky HDP-HMM; sHDP-HMM).
Inference employs beam sampling (slice sampling for the state sequence), combined with the JPSN Gibbs steps for within-state parameters. This enables state-dependent modeling of heterogeneous and dependent circular and linear behavior in, e.g., animal movement data, supporting automatic (Bayesian nonparametric) inference for the effective number of states (Mastrantonio, 2015).
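For the finite-state case, the generative side of the HMM can be sketched as follows. This only simulates the hidden state path from a given transition matrix; emissions would then be drawn from the JPSN law of the active state. The function name and the uniform default initial distribution are assumptions of this sketch, and the beam sampler itself is not shown.

```python
import numpy as np

def simulate_states(T, Pi, init=None, rng=None):
    """Simulate a length-T state path from a K-state Markov chain.

    Pi: (K, K) row-stochastic transition matrix.
    init: initial state probabilities (uniform if omitted).
    """
    rng = np.random.default_rng(rng)
    Pi = np.asarray(Pi, dtype=float)
    K = Pi.shape[0]
    init = np.full(K, 1.0 / K) if init is None else np.asarray(init, dtype=float)
    z = np.empty(T, dtype=int)
    z[0] = rng.choice(K, p=init)
    for t in range(1, T):
        # The next state depends only on the current one (Markov property).
        z[t] = rng.choice(K, p=Pi[z[t - 1]])
    return z
```

Large diagonal entries in `Pi` produce long runs in a single state, which is the persistence that a sticky prior is designed to encourage.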
7. Applications and Empirical Performance
Applications include modeling the movement of free-ranging Maremma Sheepdogs (six animals, with $p$ angles and $q$ log-step lengths recorded at $T$ time points) with the sHDP–JPSN–HMM. The posterior mode for the number of behavioral states was three, corresponding to interpretable behaviors: (1) low-speed, tortuous motion (rest or livestock attending), (2) intermediate speed with moderate alignment, and (3) high-speed, nearly straight boundary patrol. The JPSN–HMM reveals intra- and inter-individual dependence in both turning and step length, whereas simpler HMMs with independent marginals tend to over-segment the behavior and obscure social structure.
Out-of-sample predictive scores (Continuous Ranked Probability Score, CRPS) showed that the JPSN–HMM outperformed alternative models using independent von Mises or wrapped Cauchy distributions (for angles) and Gamma or Weibull distributions (for lengths), attaining a lower mean CRPS for both the circular and the linear components (Mastrantonio, 2015).
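A sample-based CRPS can be computed directly from posterior predictive draws. The sketch below uses the standard energy form for linear variables and, as an assumption, an analogous form for angles in which absolute differences are replaced by angular distance on the circle; the function names are illustrative and lower scores indicate better forecasts.

```python
import numpy as np

def crps_linear(samples, y):
    """Sample-based CRPS for a linear variable: E|X - y| - 0.5 E|X - X'|."""
    s = np.asarray(samples, dtype=float)
    return np.mean(np.abs(s - y)) - 0.5 * np.mean(np.abs(s[:, None] - s[None, :]))

def angular_distance(a, b):
    """Shortest arc length between angles, in [0, pi]."""
    return np.arccos(np.clip(np.cos(a - b), -1.0, 1.0))

def crps_circular(samples, theta):
    """Circular analogue of the CRPS, with angular distance replacing |.|."""
    s = np.asarray(samples, dtype=float)
    return (np.mean(angular_distance(s, theta))
            - 0.5 * np.mean(angular_distance(s[:, None], s[None, :])))
```

Averaging these scores over held-out time points gives the mean CRPS values used to compare models.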
A comparable analysis on zebra movement data confirmed improved recovery of empirical dependence and predictive accuracy, with the JPSN attaining the lowest CRPS across held-out circular and linear segments (Mastrantonio, 2017).
8. Computational and Diagnostic Considerations
The principal computational bottleneck is the per-iteration cost of the normal-inverse-Wishart updates in HMM extensions, which is driven by the $(2p+q)$-dimensional covariance structure. Metropolis steps for the radii $r$ require tuning. The model’s structure allows for block-wise sparse or diagonal approximations in high dimension. Mixing and convergence are assessed via trace plots and ESS; identifiability is reliably enforced via post-processing (Mastrantonio, 2015, Mastrantonio, 2017).
Primary sources:
- (Mastrantonio, 2015) The Joint Projected and Skew Normal
- (Mastrantonio, 2017) The joint projected normal and skew-normal: a distribution for poly-cylindrical data