Practical Kernel Tests of Conditional Independence (2402.13196v1)
Published 20 Feb 2024 in cs.LG
Abstract: We describe a data-efficient, kernel-based approach to statistical testing of conditional independence. A major challenge of conditional independence testing, absent in tests of unconditional independence, is to obtain the correct test level (the specified upper bound on the rate of false positives), while still attaining competitive test power. Excess false positives arise due to bias in the test statistic, which is obtained using nonparametric kernel ridge regression. We propose three methods for bias control to correct the test level, based on data splitting, auxiliary data, and (where possible) simpler function classes. We show these combined strategies are effective both for synthetic and real-world data.
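To make the data-splitting idea concrete, here is a minimal sketch, not the paper's test statistic. It assumes scalar X and Y, a Gaussian kernel on Z, and a fixed ridge parameter, and it swaps in a simpler residual-covariance statistic (in the style of the generalised covariance measure) in place of the kernel statistic the abstract refers to. The names `rbf_kernel`, `krr_residuals`, `split_residual_statistic`, `lengthscale` and `lam` are all hypothetical. The regressions of X and Y on Z are fitted on one half of the data, and the statistic is computed from residuals on the other half, so the regression fit is not evaluated on the samples used to train it.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * lengthscale**2))

def krr_residuals(z_train, t_train, z_test, t_test, lam=1e-3):
    """Fit kernel ridge regression of t on z using the training split,
    then return the residuals on the held-out split."""
    n = len(z_train)
    K = rbf_kernel(z_train, z_train)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), t_train)
    pred = rbf_kernel(z_test, z_train) @ alpha
    return t_test - pred

def split_residual_statistic(x, y, z, rng):
    """Normalised covariance of held-out residuals of x and y given z.

    Assumption: this is an illustrative stand-in statistic, roughly standard
    normal under conditional independence when the regressions are accurate.
    It is not the kernel statistic proposed in the paper.
    """
    n = len(z)
    idx = rng.permutation(n)
    train, test = idx[: n // 2], idx[n // 2 :]
    rx = krr_residuals(z[train], x[train], z[test], x[test])
    ry = krr_residuals(z[train], y[train], z[test], y[test])
    prod = rx * ry
    return np.sqrt(len(test)) * prod.mean() / prod.std()
```

Calling `split_residual_statistic(x, y, z, np.random.default_rng(0))` with `x` and `y` as 1-D arrays and `z` as an `(n, 1)` array returns a single number; large absolute values suggest dependence between X and Y that Z does not explain. Halving the sample costs power, which is the trade-off the abstract points to between correct level and competitive power.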