Detecting Correlated Gaussian Databases (2206.12011v1)
Abstract: This paper considers the problem of detecting whether two databases, each consisting of $n$ users with $d$ Gaussian features, are correlated. Under the null hypothesis, the databases are independent. Under the alternate hypothesis, the features are correlated across databases, under an unknown row permutation. A simple test is developed to show that detection is achievable above $\rho^2 \approx \frac{1}{d}$. For the converse, the truncated second moment method is used to establish that detection is impossible below roughly $\rho^2 \approx \frac{1}{d\sqrt{n}}$. These results are compared to the corresponding recovery problem, where the goal is to decode the row permutation, and a converse bound of roughly $\rho^2 \approx 1 - n^{-4/d}$ has been previously shown. For certain choices of parameters, the detection achievability bound outperforms this recovery converse bound, demonstrating that detection can be easier than recovery in this scenario.
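
To make the two hypotheses concrete, the following is a minimal illustrative sketch and not necessarily the test developed in the paper. It assumes i.i.d. standard Gaussian entries, a per-entry correlation coefficient $\rho$ under the alternate hypothesis, and a known $\rho$ for setting the threshold. The function names `sample_databases`, `sum_inner_product_statistic`, and `detect` are hypothetical. The statistic sums the inner products over all row pairs, so it does not depend on the unknown row permutation. Under these assumptions it has mean $0$ and standard deviation $n\sqrt{d}$ under the null hypothesis and mean $nd\rho$ under the alternate hypothesis, which is consistent, as a rough heuristic, with a detection threshold around $\rho^2 \approx \frac{1}{d}$.

```python
import math
import random


def sample_databases(n, d, rho, correlated, rng):
    """Return two n x d databases (lists of rows) with standard Gaussian entries.

    If correlated is False, the databases are independent (null hypothesis).
    Otherwise each entry of Y has correlation rho with the matching entry of X,
    and the rows of Y are then shuffled by a hidden permutation.
    """
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    Z = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    if not correlated:
        return X, Z
    scale = math.sqrt(1.0 - rho * rho)
    Y = [[rho * x + scale * z for x, z in zip(x_row, z_row)]
         for x_row, z_row in zip(X, Z)]
    rng.shuffle(Y)  # hidden row permutation
    return X, Y


def sum_inner_product_statistic(X, Y):
    """Sum of <X_i, Y_j> over all row pairs (i, j), computed via column sums."""
    col_x = [sum(col) for col in zip(*X)]
    col_y = [sum(col) for col in zip(*Y)]
    return sum(a * b for a, b in zip(col_x, col_y))


def detect(X, Y, rho):
    """Declare 'correlated' when the statistic exceeds half its mean under
    the alternate hypothesis (assumed threshold; requires knowing rho)."""
    n, d = len(X), len(X[0])
    return sum_inner_product_statistic(X, Y) > 0.5 * n * d * rho


if __name__ == "__main__":
    rng = random.Random(0)
    X, Y = sample_databases(n=100, d=400, rho=0.3, correlated=True, rng=rng)
    print("declared correlated:", detect(X, Y, rho=0.3))
```

Because the statistic depends only on the column sums of each database, reordering the rows of either database leaves it unchanged, which is why no estimate of the permutation is needed for this kind of detection.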