A Communication-Efficient Distributed Data Structure for Top-k and k-Select Queries

Published 21 Sep 2017 in cs.DS | (1709.07259v1)

Abstract: We consider the scenario of $n$ sensor nodes observing streams of data. The nodes are connected to a central server whose task it is to compute some function over all data items observed by the nodes. In our case, there exists a total order on the data items observed by the nodes. Our goal is to compute the $k$ currently lowest observed values or a value with rank in $[(1-\varepsilon)k,(1+\varepsilon)k]$ with probability $(1-\delta)$. We propose solutions for these problems in an extension of the distributed monitoring model where the server can send broadcast messages to all nodes for unit cost. We want to minimize communication over multiple time steps where there are $m$ updates to a node's value in between queries. The result is composed of two main parts, each of which may be of independent interest: (1) Protocols which answer Top-$k$ and $k$-Select queries. These protocols are memoryless in the sense that they gather all information at the time of the request. (2) A dynamic data structure which tracks for every $k$ an element close to $k$. We describe how to combine the two parts to obtain a protocol answering the stated queries over multiple time steps. Overall, for Top-$k$ queries we use $O(k + \log m + \log \log n)$ and for $k$-Select queries $O(\frac{1}{\varepsilon^2} \log \frac{1}{\delta} + \log m + \log^2 \log n)$ messages in expectation. These results are shown to be asymptotically tight if $m$ is not too small.
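
To make the two query types concrete, here is a minimal Python sketch of what a correct answer looks like. It is only a centralized reference check, not the paper's protocol: it assumes the server already sees every node's current value, which is exactly the communication the paper aims to avoid. The function names (`top_k`, `rank`, `is_valid_k_select`) and the convention that rank counts values less than or equal to the candidate are assumptions made for illustration.

```python
def top_k(values, k):
    """Return the k lowest current values, in ascending order."""
    return sorted(values)[:k]


def rank(values, x):
    """Rank of x: how many current values are <= x (assumed convention)."""
    return sum(1 for v in values if v <= x)


def is_valid_k_select(values, x, k, eps):
    """Check that x is an observed value with rank in [(1-eps)k, (1+eps)k]."""
    if x not in values:
        return False
    return (1 - eps) * k <= rank(values, x) <= (1 + eps) * k


# Hypothetical snapshot: one current value per node (n = 8).
values = [7, 3, 9, 1, 5, 8, 2, 6]

lowest = top_k(values, 3)
# With k = 4 and eps = 0.25, any value of rank 3, 4, or 5 is acceptable.
accepted = [x for x in sorted(values) if is_valid_k_select(values, x, 4, 0.25)]
```

A randomized $k$-Select protocol is allowed to return any one of the accepted values, and may miss the range with probability at most $\delta$; the sketch above only captures the rank condition, not the probabilistic guarantee or the message cost.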
