
A faster subquadratic algorithm for finding outlier correlations (1510.03895v2)

Published 13 Oct 2015 in cs.DS

Abstract: We study the problem of detecting outlier pairs of strongly correlated variables among a collection of $n$ variables with otherwise weak pairwise correlations. After normalization, this task amounts to the geometric task where we are given as input a set of $n$ vectors with unit Euclidean norm and dimension $d$, and for some constants $0<\tau<\rho<1$, we are asked to find all the outlier pairs of vectors whose inner product is at least $\rho$ in absolute value, subject to the promise that all but at most $q$ pairs of vectors have inner product at most $\tau$ in absolute value. Improving on an algorithm of G. Valiant [FOCS 2012; J. ACM 2015], we present a randomized algorithm that for Boolean inputs ($\{-1,1\}$-valued data normalized to unit Euclidean length) runs in time \[ \tilde O\bigl(n^{\max\{1-\gamma+M(\Delta\gamma,\gamma),\,M(1-\gamma,2\Delta\gamma)\}}+qdn^{2\gamma}\bigr)\,, \] where $0<\gamma<1$ is a constant tradeoff parameter and $M(\mu,\nu)$ is the exponent to multiply an $\lfloor n^\mu\rfloor\times\lfloor n^\nu\rfloor$ matrix with an $\lfloor n^\nu\rfloor\times \lfloor n^\mu\rfloor$ matrix and $\Delta=1/(1-\log_\tau\rho)$. As corollaries we obtain randomized algorithms that run in time \[ \tilde O\bigl(n^{\frac{2\omega}{3-\log_\tau\rho}}+qdn^{\frac{2(1-\log_\tau\rho)}{3-\log_\tau\rho}}\bigr) \] and in time \[ \tilde O\bigl(n^{\frac{4}{2+\alpha(1-\log_\tau\rho)}}+qdn^{\frac{2\alpha(1-\log_\tau\rho)}{2+\alpha(1-\log_\tau\rho)}}\bigr)\,, \] where $2\leq\omega<2.38$ is the exponent for square matrix multiplication and $0.3<\alpha\leq 1$ is the exponent for rectangular matrix multiplication. The notation $\tilde O(\cdot)$ hides polylogarithmic factors in $n$ and $d$ whose degree may depend on $\rho$ and $\tau$. We present further corollaries for the light bulb problem and for learning sparse Boolean functions.
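
To make the problem statement and the first corollary concrete, here is a minimal Python sketch. It is not the algorithm of the paper: `brute_force_outliers` is the naive quadratic baseline that the subquadratic algorithm improves on, and `corollary_exponents` only evaluates the two exponents of $n$ in the first corollary bound. Both function names are hypothetical, and the default `omega=2.38` is just the upper bound quoted in the abstract, used as a placeholder value.

```python
import math
from itertools import combinations


def brute_force_outliers(vectors, rho):
    """Return index pairs (i, j), i < j, with |<x_i, x_j>| >= rho.

    Naive O(n^2 d) baseline, NOT the subquadratic algorithm of the paper.
    Assumes every input vector already has unit Euclidean norm.
    """
    pairs = []
    for (i, x), (j, y) in combinations(enumerate(vectors), 2):
        inner = sum(a * b for a, b in zip(x, y))
        if abs(inner) >= rho:
            pairs.append((i, j))
    return pairs


def corollary_exponents(rho, tau, omega=2.38):
    """Exponents of n in the first corollary bound of the abstract.

    Returns (main, outlier), where the bound reads
    O~(n^main + q * d * n^outlier).
    omega=2.38 is a placeholder (the upper bound stated in the abstract).
    """
    if not 0 < tau < rho < 1:
        raise ValueError("need 0 < tau < rho < 1")
    log_ratio = math.log(rho) / math.log(tau)  # log_tau(rho), lies in (0, 1)
    main = 2 * omega / (3 - log_ratio)
    outlier = 2 * (1 - log_ratio) / (3 - log_ratio)
    return main, outlier


# Four {-1,1}-valued vectors in dimension 4, scaled to unit length.
example_vectors = [
    (0.5, 0.5, 0.5, 0.5),
    (0.5, 0.5, -0.5, -0.5),
    (0.5, -0.5, 0.5, -0.5),
    (-0.5, -0.5, -0.5, -0.5),
]
example_pairs = brute_force_outliers(example_vectors, rho=0.9)
example_main, example_outlier = corollary_exponents(rho=0.5, tau=0.25)
```

In this toy input, vectors 0 and 3 are negations of each other (inner product $-1$), while every other pair is orthogonal, so only that pair meets the threshold. Under the formula above, the main exponent drops below 2 exactly when $\log_\tau\rho < 3-\omega$, which is one way to see when the first corollary is subquadratic for a given value of $\omega$.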

Citations (37)
