Differentially Private Conditional Independence Testing (2306.06721v3)
Abstract: Conditional independence (CI) tests are widely used in statistical data analysis, e.g., they are the building block of many algorithms for causal graph discovery. The goal of a CI test is to accept or reject the null hypothesis that $X \perp\!\!\!\perp Y \mid Z$, where $X \in \mathbb{R}$, $Y \in \mathbb{R}$, $Z \in \mathbb{R}^d$. In this work, we investigate conditional independence testing under the constraint of differential privacy. We design two private CI testing procedures: one based on the generalized covariance measure of Shah and Peters (2020) and another based on the conditional randomization test of Candès et al. (2016) (under the model-X assumption). We provide theoretical guarantees on the performance of our tests and validate them empirically. These are the first private CI tests with rigorous theoretical guarantees that work for the general case when $Z$ is continuous.
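For orientation, the following is a minimal sketch of the standard, non-private generalized covariance measure statistic that the first procedure builds on. It is not the private test proposed in the paper: no noise is added and no privacy guarantee holds. The function name `gcm_test` and the choice of ordinary least squares as the regression method are assumptions made for illustration only.

```python
import math
import numpy as np


def gcm_test(x, y, z):
    """Non-private GCM statistic (illustrative; OLS regression is an assumption)."""
    n = len(x)
    design = np.column_stack([np.ones(n), z])  # intercept column plus Z
    # Residuals from regressing X on Z and Y on Z.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    products = res_x * res_y
    # Normalized mean of residual products; approximately N(0, 1) under the null
    # when the regressions estimate the conditional means well enough.
    stat = math.sqrt(n) * products.mean() / products.std()
    p_value = math.erfc(abs(stat) / math.sqrt(2))  # two-sided normal p-value
    return stat, p_value
```

A small p-value is evidence against $X \perp\!\!\!\perp Y \mid Z$. A private variant would additionally have to bound the sensitivity of the statistic and add calibrated noise, which this sketch does not attempt.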
- Differentially private uniformly most powerful tests for binomial data. In Advances in Neural Information Processing Systems (NeurIPS), pages 4212–4222, 2018.
- Differentially private significance tests for regression coefficients. Journal of Computational and Graphical Statistics, 28:440–453, 2017.
- The conditional permutation test for independence while controlling for confounders. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 82, 2019.
- Impossibility of differentially private universally optimal mechanisms. SIAM Journal on Computing (SICOMP), 43(5):1513–1540, 2014.
- Differentially private ANOVA testing. In International Conference on Data Intelligence and Security, pages 281–285, 2018.
- Panning for gold: ‘model‐X’ knockoffs for high dimensional controlled variable selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80, 2016.
- Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21:C1–C68, 2018.
- Differentially private nonparametric hypothesis testing. In Proceedings of the ACM Conference on Computer and Communications Security, CCS, pages 737–751, 2019.
- A Philip Dawid. Conditional independence in statistical theory. Journal of the Royal Statistical Society: Series B (Methodological), 41(1):1–15, 1979.
- The Permute-and-Flip mechanism is identical to Report-Noisy-Max with exponential noise. arXiv 2105.07260, 2021.
- The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014.
- On the complexity of differentially private data release: efficient algorithms and hardness results. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 381–390, 2009.
- Calibrating noise to sensitivity in private data analysis. Journal of Privacy and Confidentiality, 7(3):17–51, 2016.
- Differential privacy and the risk-utility tradeoff for multi-dimensional contingency tables. In Privacy in Statistical Databases, volume 6344 of Lecture Notes in Computer Science, pages 187–199. Springer, 2010.
- Kernel measures of conditional dependence. Advances in Neural Information Processing Systems (NeurIPS), 20, 2007.
- Differentially private Chi-Squared hypothesis testing: Goodness of fit and independence testing. In Proceedings, International Conference on Machine Learning (ICML), volume 48, 2016.
- Privacy-preserving data exploration in genome-wide association studies. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1079–1087, 2013.
- The test of tests: A framework for differentially private hypothesis testing. In Proceedings, International Conference on Machine Learning (ICML), volume 202, pages 16131–16151, 2023.
- Differentially private permutation tests: Applications to kernel methods. CoRR, abs/2310.19043, 2023. doi: 10.48550/ARXIV.2310.19043. URL https://doi.org/10.48550/arXiv.2310.19043.
- Probabilistic graphical models: principles and techniques. MIT Press, 2009.
- Private causal inference. In Proceedings, International Conference on Artificial Intelligence and Statistics (AISTATS), volume 51, pages 1308–1317, 2016.
- Just interpolate: Kernel “ridgeless” regression can generalize. The Annals of Statistics, 48(3):1329–1347, 2020.
- Private selection from private candidates. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 298–309. ACM, 2019.
- Permute-and-flip: A new mechanism for differentially private selection. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- Smooth sensitivity and sampling in private data analysis. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 75–84. ACM, 2007.
- Judea Pearl. Models, reasoning and inference. Cambridge, UK: Cambridge University Press, 19(2), 2000.
- Differentially private hypothesis testing with the subsampled and aggregated randomized response mechanism. Statistica Sinica, 2022.
- A new class of private chi-square hypothesis tests. In Proceedings, International Conference on Artificial Intelligence and Statistics (AISTATS), volume 54, pages 991–1000, 2017.
- The weighted generalised covariance measure. Journal of Machine Learning Research, 23(273):1–68, 2022.
- The hardness of conditional independence testing and the generalised covariance measure. The Annals of Statistics, 48(3), 2020.
- Adam Smith. Privacy-preserving statistical estimation with optimal convergence rates. In Proceedings, ACM Symposium on Theory of Computing (STOC), pages 813–822, 2011.
- Approximate kernel-based conditional independence tests for fast non-parametric causal discovery. Journal of Causal Inference, 7(1), 2019.
- Improved differentially private analysis of variance. Proc. Priv. Enhancing Technol., 2019(3):310–330, 2019.
- Privacy-preserving data sharing for genome-wide association studies. J. Priv. Confidentiality, 5(1), 2013.
- Private independence testing across two parties. CoRR, abs/2207.03652, 2022.
- Differential privacy for clinical trial data: Preliminary evaluations. In ICDM Workshops, pages 138–143, 2009.
- Towards practical differentially private causal graph discovery. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- Revisiting differentially private hypothesis tests for categorical data. arXiv 1511.03376, 2015.
- Statistical approximating distributions under differential privacy. J. Priv. Confidentiality, 8(1), 2018.
- I-Cheng Yeh. Concrete Compressive Strength. UCI Machine Learning Repository, 2007. DOI: https://doi.org/10.24432/C5PK67.
- Scalable privacy-preserving data sharing methodology for genome-wide association studies. J. Biomed. Informatics, 50:133–141, 2014.
- Kernel-based conditional independence test and application in causal discovery. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), pages 804–813, 2011.
- Iden Kalemaj
- Shiva Prasad Kasiviswanathan
- Aaditya Ramdas