Bayesian inference for a covariance matrix (1408.4050v2)

Published 18 Aug 2014 in stat.ME

Abstract: Covariance matrix estimation arises in multivariate problems including multivariate normal sampling models and regression models where random effects are jointly modeled, e.g. random-intercept, random-slope models. A Bayesian analysis of these problems requires a prior on the covariance matrix. Here we assess, through a simulation study and a real data set, the impact this prior choice has on posterior inference of the covariance matrix. The inverse Wishart distribution is the natural choice for a covariance matrix prior because of its conjugacy with the normal model and its simplicity, and it is usually available in Bayesian statistical software. However, the inverse Wishart distribution has some undesirable properties from a modeling point of view. It can be too restrictive because it assumes the same amount of prior information about every variance parameter and, more importantly, it induces a prior relationship between the variances and correlations. Some alternative distributions have been proposed. The scaled inverse Wishart distribution gives more flexibility on the variance priors while preserving the conjugacy property, but it does not eliminate the prior relationship between variances and correlations. Secondly, it is possible to fit separate priors for individual correlations and standard deviations. This strategy eliminates any prior relationship within the covariance matrix parameters, but it is not conjugate and is therefore computationally slow.

Citations (176)


Summary

The paper shows that the choice of prior strongly influences Bayesian covariance matrix estimation, and that the inverse Wishart prior can introduce notable bias.
It contrasts several alternatives, including scaled and hierarchical inverse Wishart priors and a separation strategy, that allow more flexibility in modeling variances and correlations.
A simulation study and a real-data analysis give practical guidance on prior selection, highlighting computational trade-offs and the potential of Hamiltonian Monte Carlo.

Bayesian Inference for a Covariance Matrix

The paper by Ignacio Alvarez, Jarad Niemi, and Matt Simpson from Iowa State University addresses the complexities of covariance matrix estimation within Bayesian frameworks. Covariance matrices are central to multivariate analyses, impacting models such as multivariate normal sampling and random-effect regression. The principal focus of this paper is to assess various prior distributions for covariance matrices, which significantly influence posterior inferences. The research contrasts several priors—namely, the inverse Wishart, scaled inverse Wishart, hierarchical inverse Wishart, and a separation strategy—highlighting their implications and potential biases in Bayesian analysis.

Prior Distributions Explored

Inverse Wishart (IW) Prior: Commonly used because it is conjugate to the normal model, the IW prior is criticized for its lack of flexibility. Its single degrees-of-freedom parameter sets the same amount of prior information for every variance component, so it cannot express different levels of uncertainty across them. Moreover, the implied marginal prior on each variance places very little mass near zero, and the prior ties the correlations to the variances, which can bias posterior estimates (especially towards inflated variances and near-zero correlations when the true variances are small).
Scaled Inverse Wishart (SIW) Prior: This approach adds scale parameters to the IW prior for greater flexibility, in particular allowing each variance component to receive its own log-normal scale prior. Despite this improvement, correlations are modeled as in the IW prior, retaining its uniform marginal distribution on each correlation, and the prior relationship between variances and correlations is not eliminated.
Hierarchical Half-t (HIW$_{ht}$) Prior: This more recent proposal implies half-t distributed standard deviations, which helps with the lack of prior mass near zero seen with the IW prior. The marginal prior on the correlations remains similar to that of the IW prior, while the prior on the standard deviations can be chosen more flexibly.
Separation Strategy (SS): By decomposing the covariance matrix into standard deviations and a correlation matrix and assigning them independent priors, SS offers considerable modeling flexibility and decouples the correlations from the variances a priori. This approach is computationally intensive because it is not conjugate, but its correlation estimates are not distorted by the scale of the variances.
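The prior dependence between variances and correlations described above can be checked by simulation. The following is a minimal sketch, not the paper's code: it assumes a 2x2 covariance matrix, an IW prior with 3 degrees of freedom and identity scale, and, for the separation strategy, a standard log-normal prior on the standard deviation. The function names and all settings are illustrative assumptions. It compares the average absolute correlation among draws with small versus large first variance.

```python
import math
import random
from statistics import fmean


def iw_draw_2x2(rng, nu=3):
    """One draw from IW(nu, I) in 2 dimensions, returned as (s11, s12, s22)."""
    # Wishart(nu, I) draw: sum of nu outer products of standard normal vectors.
    w11 = w12 = w22 = 0.0
    for _ in range(nu):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        w11 += z1 * z1
        w12 += z1 * z2
        w22 += z2 * z2
    det = w11 * w22 - w12 * w12
    # Inverting the 2x2 Wishart draw gives the inverse Wishart draw.
    return w22 / det, -w12 / det, w11 / det


def prior_draws(prior, n_draws=20000, seed=1):
    """Return (first variance, |correlation|) pairs drawn from the chosen prior."""
    if prior not in ("iw", "ss"):
        raise ValueError(f"unknown prior: {prior}")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_draws):
        s11, s12, s22 = iw_draw_2x2(rng)
        rho = s12 / math.sqrt(s11 * s22)
        if prior == "iw":
            var1 = s11
        else:
            # Separation strategy (assumed settings): keep only the correlation
            # from the IW draw and give the standard deviation its own
            # independent log-normal prior.
            var1 = math.exp(rng.gauss(0.0, 1.0)) ** 2
        pairs.append((var1, abs(rho)))
    return pairs


def abs_corr_by_variance_half(pairs):
    """Mean |correlation| in the smaller and in the larger half of the variances."""
    ordered = sorted(pairs)
    half = len(ordered) // 2
    low = fmean([r for _, r in ordered[:half]])
    high = fmean([r for _, r in ordered[half:]])
    return low, high


if __name__ == "__main__":
    for prior in ("iw", "ss"):
        low, high = abs_corr_by_variance_half(prior_draws(prior))
        print(prior, "small-variance half:", round(low, 3),
              "large-variance half:", round(high, 3))
```

Under the IW prior the large-variance half should show clearly larger absolute correlations, while under the separation strategy the two halves should be nearly identical, since the standard deviation is drawn independently of the correlation.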

Results from Simulations and Practical Analysis

The paper presents simulation studies highlighting the impact of these priors on posterior inference, including the biases induced by particular prior choices. Notably, under an IW prior with small sample variances, posterior estimates can be pulled towards larger variances and towards zero correlations. Larger sample sizes reduce this bias, but it remains pronounced when the variances are small.
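The mechanism behind this bias can be seen from the conjugate update. For zero-mean normal data with an IW(nu, Lambda) prior, the posterior is IW(nu + n, Lambda + S), where S is the sum of outer products of the observations, and its mean is (Lambda + S) / (nu + n - d - 1). The sketch below is an illustration under assumed settings (bivariate data, true correlation 0.8, nu = 3, Lambda = I), not a reproduction of the paper's simulation design. It reports the posterior-mean variance and the correlation implied by the posterior-mean matrix, which is a convenient summary rather than the posterior mean of the correlation itself.

```python
import math
import random


def simulate_bivariate_normal(n, sd, rho, seed=0):
    """Draw n zero-mean bivariate normal pairs with equal sd and correlation rho."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1 = sd * z1
        y2 = sd * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        data.append((y1, y2))
    return data


def iw_posterior_mean(data, nu=3.0, lam=1.0):
    """Posterior mean of Sigma for y_i ~ N(0, Sigma) with Sigma ~ IW(nu, lam * I)."""
    n, d = len(data), 2
    s11 = sum(y1 * y1 for y1, _ in data)
    s12 = sum(y1 * y2 for y1, y2 in data)
    s22 = sum(y2 * y2 for _, y2 in data)
    denom = nu + n - d - 1  # mean of IW(nu + n, .) divides by nu + n - d - 1
    return (lam + s11) / denom, s12 / denom, (lam + s22) / denom


def summarize(n, sd, rho=0.8, seed=0):
    """Return (posterior-mean variance, correlation implied by the posterior mean)."""
    m11, m12, m22 = iw_posterior_mean(simulate_bivariate_normal(n, sd, rho, seed))
    return m11, m12 / math.sqrt(m11 * m22)


if __name__ == "__main__":
    for n, sd in [(10, 0.01), (250, 0.01), (10, 1.0), (250, 1.0)]:
        var, corr = summarize(n, sd)
        print(f"n={n:3d} sd={sd:5.2f} true var={sd**2:.4f} "
              f"post var={var:.4f} implied corr={corr:.3f}")
```

When the true standard deviation is 0.01, the identity scale matrix dominates the tiny sums of squares, so the posterior-mean variance is far above the true value and the implied correlation is close to zero. With unit standard deviation and n = 250 the data dominate and the implied correlation is close to 0.8.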

In a practical application to bird count data from the Superior National Forest, the estimated correlations depended noticeably on the prior. The paper found that analyzing total counts rather than mean counts mitigates the IW prior's bias, allowing variances and correlations to be recovered more reliably.

Implications and Future Directions

This research underscores the importance of prior choice in Bayesian covariance matrix estimation. Each prior carries computational and theoretical trade-offs, so the choice should reflect the characteristics of the dataset and the available computational budget.

For future developments, leveraging Hamiltonian Monte Carlo methods, such as those implemented in Stan, may ease some of the computational constraints traditionally associated with non-conjugate priors like the SS. Exploring new priors or adapting current ones could provide better bias control, particularly when the scale of the variances is troublesome for existing methods.

The insights from this paper contribute to improving Bayesian estimation techniques, supporting more accurate and less biased inference in multivariate statistical modeling.