Optimal subsample size for subsampling-based critical value estimation
Determine a principled method for choosing the subsample size n_B in the subsampling algorithm used to generate null-distribution critical values for kernel-based quadratic distance two-sample and k-sample tests, specifying how n_B should depend on sample size, dimensionality, and test settings to provide clear, reproducible guidance for practitioners.
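To make the role of n_B concrete, the following is a minimal sketch of subsampling-based critical value estimation for a two-sample statistic. It is not the QuadratiK implementation. The Gaussian kernel, the U-statistic form of the distance, the pooled-sample splitting scheme, and the names `quadratic_distance_stat`, `subsampling_critical_value`, `frac`, and `n_rep` are all assumptions made for illustration. No rescaling of the subsample statistics is applied, which is one reason the choice of n_B can affect the resulting critical value.

```python
import numpy as np


def gaussian_kernel(x, y, h):
    """Gaussian kernel matrix between rows of x and rows of y (bandwidth h)."""
    d2 = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * h**2))


def quadratic_distance_stat(x, y, h):
    """Illustrative kernel-based two-sample statistic (U-statistic form).

    Assumption: a stand-in for a kernel-based quadratic distance, not the
    exact statistic used by any particular package.
    """
    n, m = len(x), len(y)
    kxx = gaussian_kernel(x, x, h)
    kyy = gaussian_kernel(y, y, h)
    kxy = gaussian_kernel(x, y, h)
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * kxy.mean()


def subsampling_critical_value(x, y, h, frac=0.8, n_rep=200, alpha=0.05, seed=0):
    """Estimate the (1 - alpha) critical value from subsamples of size n_B.

    frac is a hypothetical tuning parameter: n_B = round(frac * (n + m)).
    Each subsample is drawn without replacement from the pooled data and
    split in proportion to the original group sizes.
    """
    if not 0.0 < frac <= 1.0:
        raise ValueError("frac must lie in (0, 1]")
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n, m = len(x), len(y)
    n_b = int(round(frac * (n + m)))
    n_b_x = max(2, int(round(n_b * n / (n + m))))
    n_b_y = max(2, n_b - n_b_x)
    stats = np.empty(n_rep)
    for b in range(n_rep):
        # Indices come back in random order, so slicing gives a random split.
        idx = rng.choice(n + m, size=n_b_x + n_b_y, replace=False)
        stats[b] = quadratic_distance_stat(
            pooled[idx[:n_b_x]], pooled[idx[n_b_x:]], h
        )
    return np.quantile(stats, 1.0 - alpha), n_b_x + n_b_y


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(size=(60, 2))
    y = rng.normal(size=(40, 2))
    # Compare critical values across a few arbitrary subsample fractions.
    for frac in (0.5, 0.7, 0.9):
        cv, n_b = subsampling_critical_value(x, y, h=1.0, frac=frac, n_rep=100)
        print(f"frac={frac:.1f}  n_B={n_b}  critical value={cv:.5f}")
```

Running the loop at the bottom for several values of `frac` shows how the estimated critical value moves with n_B on a fixed dataset. A principled rule would replace this ad hoc choice of fraction with one tied to sample size, dimension, and test settings.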
References
There is no clear guidance for the choice of the "optimal" subsample size $n_B$ and, the literature investigates this aspect according to optimal subsampling probabilities formulated by minimizing some function of the asymptotic distribution.
— Goodness-of-Fit and Clustering of Spherical Data: the QuadratiK package in R and Python
(2402.02290 - Saraceno et al., 3 Feb 2024) in Subsection k-Sample Tests, Section 3 (Multivariate KBQD tests)