On the quality of randomized approximations of Tukey's depth (2309.05657v2)
Abstract: Tukey's depth (or halfspace depth) is a widely used measure of centrality for multivariate data. However, exact computation of Tukey's depth is known to be a hard problem in high dimensions. As a remedy, randomized approximations of Tukey's depth have been proposed. In this paper we explore when such randomized algorithms return a good approximation of Tukey's depth. We study the case when the data are sampled from a log-concave isotropic distribution. We prove that, if one requires that the algorithm runs in polynomial time in the dimension, the randomized algorithm correctly approximates the maximal depth $1/2$ and depths close to zero. On the other hand, for any point of intermediate depth, any good approximation requires exponential complexity.
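For concreteness, a randomized approximation of this kind can be sketched as follows. This is a minimal illustration and not code from the paper: it assumes the common "random Tukey depth" scheme, in which one draws $k$ directions uniformly on the unit sphere and takes the minimum, over those directions, of the fraction of data points in the closed halfspace through the query point. The function names `random_direction` and `random_tukey_depth` and the parameter `k` are hypothetical choices made for this sketch.

```python
import math
import random


def random_direction(d, rng):
    """Draw a direction uniformly on the unit sphere in R^d
    by normalizing a standard Gaussian vector."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:  # guard against the (measure-zero) zero vector
            return [v / norm for v in g]


def random_tukey_depth(x, data, k, rng):
    """Approximate the empirical Tukey depth of x with respect to data.

    For each of k random directions u, count the fraction of data points y
    with <u, y - x> >= 0, and return the smallest fraction seen.
    Because the true depth is an infimum over all directions, this
    value can only overestimate the empirical depth, never underestimate it.
    """
    n = len(data)
    d = len(x)
    best = 1.0
    for _ in range(k):
        u = random_direction(d, rng)
        count = sum(
            1
            for y in data
            if sum(ui * (yi - xi) for ui, yi, xi in zip(u, y, x)) >= 0.0
        )
        best = min(best, count / n)
    return best
```

Each evaluation costs on the order of $k \cdot n \cdot d$ operations, so the running time is polynomial in the dimension whenever $k$ is. Under the assumptions of this sketch, the abstract's result can be read as a statement about how large $k$ must be: polynomially many directions suffice near depth $1/2$ and near depth zero, while points of intermediate depth require exponentially many.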