On Sufficient Graphical Models (2307.04353v1)

Published 10 Jul 2023 in stat.ML and cs.LG

Abstract: We introduce a sufficient graphical model by applying recently developed nonlinear sufficient dimension reduction techniques to the evaluation of conditional independence. The graphical model is nonparametric in nature, as it does not make distributional assumptions such as the Gaussian or copula Gaussian assumptions. However, unlike a fully nonparametric graphical model, which relies on a high-dimensional kernel to characterize conditional independence, our graphical model is based on conditional independence given a set of sufficient predictors of substantially reduced dimension. In this way we avoid the curse of dimensionality that comes with a high-dimensional kernel. We develop the population-level properties, convergence rate, and variable selection consistency of our estimate. Through simulation comparisons and an analysis of the DREAM 4 Challenge data set, we demonstrate that our method outperforms existing methods when the Gaussian or copula Gaussian assumptions are violated, and that its performance remains excellent in the high-dimensional setting.
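
The pipeline the abstract describes can be summarized as: for each pair of variables (i, j), reduce the remaining variables to a small set of sufficient predictors via nonlinear sufficient dimension reduction, then place an edge between i and j only if conditional dependence persists given those predictors. The sketch below illustrates that pipeline shape only; it is not the paper's estimator. Gaussian-kernel PCA stands in for the kernel-based nonlinear sufficient dimension reduction step, a partial-correlation statistic stands in for the kernel conditional-independence measure, and the bandwidth heuristic, reduced dimension d, and threshold are arbitrary illustrative choices.

    # Minimal sketch of a sufficient-graphical-model pipeline: for each pair
    # (i, j), reduce the conditioning set X_{-(i,j)} to d "sufficient
    # predictors" and test whether dependence between X_i and X_j remains.
    # Kernel PCA and partial correlation are illustrative stand-ins for the
    # paper's nonlinear SDR estimator and kernel conditional-independence
    # measure; d, gamma, and the threshold are not values from the paper.
    import numpy as np

    def kernel_pca(X, d, gamma=None):
        """Reduce X (n x p) to d nonlinear feature scores via Gaussian-kernel PCA."""
        if gamma is None:
            gamma = 1.0 / X.shape[1]                  # crude bandwidth heuristic (assumption)
        sq = np.sum(X ** 2, axis=1)
        K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
        n = K.shape[0]
        H = np.eye(n) - np.ones((n, n)) / n           # centering matrix
        Kc = H @ K @ H
        vals, vecs = np.linalg.eigh(Kc)               # eigenpairs in ascending order
        top = np.argsort(vals)[::-1][:d]              # keep the d leading components
        return Kc @ vecs[:, top]                      # n x d predictor scores

    def partial_corr(x, y, Z):
        """Correlation of x and y after regressing both on Z (with intercept)."""
        Z1 = np.column_stack([np.ones(len(x)), Z])
        rx = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]
        ry = y - Z1 @ np.linalg.lstsq(Z1, y, rcond=None)[0]
        return rx @ ry / np.sqrt((rx @ rx) * (ry @ ry))

    def sufficient_graph(X, d=2, threshold=0.15):
        """Estimate an undirected graph over the p columns of X (n x p)."""
        n, p = X.shape
        A = np.zeros((p, p), dtype=bool)
        for i in range(p):
            for j in range(i + 1, p):
                rest = np.delete(X, [i, j], axis=1)   # conditioning variables
                U = kernel_pca(rest, d)               # stand-in sufficient predictors
                stat = abs(partial_corr(X[:, i], X[:, j], U))
                A[i, j] = A[j, i] = stat > threshold  # edge iff dependence remains
        return A

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 6))
    X[:, 1] += np.sin(X[:, 0])                        # plant one nonlinear edge: 0 -- 1
    print(sufficient_graph(X).astype(int))

The point of the reduction step is visible in the loop: the conditional independence statistic is computed given only d predictor scores rather than all p - 2 remaining variables, which is how the method sidesteps the curse of dimensionality that a fully nonparametric kernel test would face.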

