Selection on $X_1+X_2+\cdots + X_m$ with layer-ordered heaps (1910.11993v2)
Abstract: Selection on $X_1+X_2+\cdots + X_m$ is an important problem with many applications in areas such as max-convolution, max-product Bayesian inference, calculating most probable isotopes, and computing non-parametric test statistics, among others. Faster-than-naïve approaches exist for $m=2$: Frederickson (1993) published the optimal algorithm with runtime $O(k)$, and Kaplan et al. (2018) have since published a much simpler algorithm which makes use of Chazelle's soft heaps (2003). No fast methods exist for $m>2$. Johnson & Mizoguchi (1978) introduced a method to compute the single $k^{\text{th}}$ value when $m>2$, but that method runs in $O(m\cdot n^{\lceil\frac{m}{2}\rceil} \log(n))$ time and is inefficient when $m \gg 1$ and $k \ll n^{\lceil\frac{m}{2}\rceil}$. In this paper, we introduce the first efficient methods, both in theory and practice, for problems with $m>2$. We introduce the "layer-ordered heap," a simple special class of heap with which we produce a new, fast selection algorithm on the Cartesian product. With this new algorithm, $k$-selection on the Cartesian product of $m$ arrays of length $n$ runs in $o(k\cdot m)$ time. We also provide implementations of the proposed algorithms and evaluate their performance in practice.
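To make the problem statement concrete, the following is a minimal brute-force sketch of selection on $X_1+X_2+\cdots+X_m$. It is not the layer-ordered heap algorithm from the paper; it is only the naïve baseline that enumerates all $n^m$ sums, and it assumes the convention that selection returns the $k$ smallest sums. The function name `naive_select_sums` and the example arrays are hypothetical, chosen for illustration.

```python
import heapq
from itertools import product


def naive_select_sums(arrays, k):
    """Return the k smallest values of X_1 + X_2 + ... + X_m in ascending order.

    Brute-force baseline (an illustrative assumption, not the paper's method):
    it visits every element of the Cartesian product, so it touches n^m sums
    when each of the m arrays has length n.
    """
    # Lazily generate one sum per tuple in the Cartesian product.
    all_sums = (sum(combo) for combo in product(*arrays))
    # Keep only the k smallest sums seen so far.
    return heapq.nsmallest(k, all_sums)


if __name__ == "__main__":
    x1 = [1, 4, 7]
    x2 = [2, 3, 9]
    x3 = [0, 5, 6]
    # The four smallest sums are 1+2+0, 1+3+0, 4+2+0 and 4+3+0.
    print(naive_select_sums([x1, x2, x3], 4))  # prints [3, 4, 6, 7]
```

Because this baseline enumerates the whole Cartesian product, its cost grows exponentially in $m$, which is the behavior the methods described in the abstract are designed to avoid when $k \ll n^m$.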