Lessons from the Congested Clique Applied to MapReduce (1405.4356v2)
Abstract: The main results of this paper are (I) a simulation algorithm which, under quite general constraints, transforms algorithms running on the Congested Clique into algorithms running in the MapReduce model, and (II) a distributed $O(\Delta)$-coloring algorithm running on the Congested Clique which has an expected running time of (i) $O(1)$ rounds, if $\Delta \geq \Theta(\log^4 n)$; and (ii) $O(\log \log n)$ rounds otherwise. Applying the simulation theorem to the Congested-Clique $O(\Delta)$-coloring algorithm yields an $O(1)$-round $O(\Delta)$-coloring algorithm in the MapReduce model. Our simulation algorithm illustrates a natural correspondence between per-node bandwidth in the Congested Clique model and memory per machine in the MapReduce model. In the Congested Clique (and more generally, any network in the $\mathcal{CONGEST}$ model), the major impediment to constructing fast algorithms is the $O(\log n)$ restriction on message sizes. Similarly, in the MapReduce model, the combined restrictions on memory per machine and total system memory have a dominant effect on algorithm design. In showing a fairly general simulation algorithm, we highlight the similarities and differences between these models.