Entropic covariance models (2306.03590v3)
Abstract: In covariance matrix estimation, one of the challenges lies in finding a suitable model and an efficient estimation method. Two modelling approaches commonly used in the literature impose linear restrictions on the covariance matrix or on its inverse. Another approach considers linear restrictions on the matrix logarithm of the covariance matrix. In this paper, we present a general framework for linear restrictions on different transformations of the covariance matrix, including the examples just mentioned. Our proposed estimation method solves a convex problem and yields an $M$-estimator, allowing for relatively straightforward asymptotic analysis (in general) and finite-sample analysis (in the Gaussian case). In particular, we recover standard $\sqrt{n/d}$ rates, where $d$ is the dimension of the underlying model. Our geometric insights allow us to extend various recent results in covariance matrix modelling. This includes providing unrestricted parametrizations of the space of correlation matrices, which offer an alternative to a recent result that uses the matrix logarithm.