A limit process for partial match queries in random quadtrees and $2$-d trees (1202.1342v5)
Abstract: We consider the problem of recovering items matching a partially specified pattern in multidimensional trees (quadtrees and $k$-d trees). We assume the traditional model where the data consist of independent and uniform points in the unit square. For this model, in a structure on $n$ points, it is known that the number of nodes $C_n(\xi)$ to visit in order to report the items matching a random query $\xi$, independent and uniformly distributed on $[0,1]$, satisfies $\mathbf{E}[C_n(\xi)]\sim\kappa n^{\beta}$, where $\kappa$ and $\beta$ are explicit constants. We develop an approach based on the analysis of the cost $C_n(s)$ of any fixed query $s\in[0,1]$, and give precise estimates for the variance and limit distribution of the cost $C_n(x)$. Our results permit us to describe a limit process for the costs $C_n(x)$ as $x$ varies in $[0,1]$; one of the consequences is that $\mathbf{E}[\max_{x\in[0,1]}C_n(x)]\sim\gamma n^{\beta}$; this settles a question of Devroye [Pers. Comm., 2000].
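To make the cost $C_n(s)$ concrete, the following is a minimal simulation sketch rather than the authors' construction. It assumes a standard point quadtree built by inserting the points one at a time, and a partial match query that fixes the first coordinate to $s$, so that the search visits every node whose region meets the vertical line at $s$. The names `Node`, `build_quadtree` and `partial_match_cost` are hypothetical and chosen only for illustration.

```python
import random


class Node:
    """A point-quadtree node: one data point and four quadrant subtrees."""
    __slots__ = ("x", "y", "children")

    def __init__(self, x, y):
        self.x = x
        self.y = y
        # Child index = 2 * (east of node) + (north of node).
        self.children = [None, None, None, None]


def build_quadtree(points):
    """Insert the points one at a time (assumed insertion model)."""
    root = None
    for x, y in points:
        if root is None:
            root = Node(x, y)
            continue
        node = root
        while True:
            q = 2 * (x >= node.x) + (y >= node.y)
            if node.children[q] is None:
                node.children[q] = Node(x, y)
                break
            node = node.children[q]
    return root


def partial_match_cost(root, s):
    """Count the nodes visited by a query that fixes the first coordinate to s."""
    cost = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        cost += 1
        # The line at s lies on one side of the node's point, so only the
        # two quadrants on that side (south and north) can contain matches.
        side = 2 if s >= node.x else 0
        stack.append(node.children[side])
        stack.append(node.children[side + 1])
    return cost


# Illustration: cost of a few fixed queries on one random tree.
rng = random.Random(0)  # fixed seed so the run is reproducible
n = 2000
tree = build_quadtree([(rng.random(), rng.random()) for _ in range(n)])
for s in (0.1, 0.5, 0.9):
    print(s, partial_match_cost(tree, s))
```

Averaging `partial_match_cost` over many independent trees for a fixed `s`, or over a grid of `s` values on one tree, gives an empirical view of how the cost varies with the query position, which is the quantity the limit process describes.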