A Pliable Index Coding Approach to Data Shuffling (1701.05540v3)
Abstract: A promising research area that has recently emerged studies how to use index coding to improve communication efficiency in distributed computing systems, especially for data shuffling in iterative computations. In this paper, we posit that pliable index coding can offer a more efficient framework for data shuffling, as it can better leverage the many possible shuffling choices to reduce the number of transmissions. We theoretically analyze pliable index coding under data shuffling constraints, and design a hierarchical data-shuffling scheme that uses pliable coding as a component. We find benefits of up to $O(ns/m)$ over index coding, where $ns/m$ is the average number of workers caching a message, and $m$, $n$, and $s$ are the number of messages, the number of workers, and the cache size, respectively.
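
The quantity $ns/m$ follows from a counting argument, sketched below under the assumption that $s$ is the cache size of each worker and that every worker caches exactly $s$ distinct messages. The $n$ workers then hold $ns$ cached copies in total, spread over $m$ messages. The uniform random placement and the names `random_caches` and `average_replication` are hypothetical, chosen only for illustration; they are not the scheme from the paper.

```python
import random


def random_caches(m, n, s, seed=0):
    """Give each of n workers a cache of s distinct messages out of m.

    Uniform random placement is an assumption made for this sketch only.
    """
    rng = random.Random(seed)
    return [set(rng.sample(range(m), s)) for _ in range(n)]


def average_replication(caches, m):
    """Return the average number of workers that cache each of the m messages."""
    counts = [0] * m
    for cache in caches:
        for msg in cache:
            counts[msg] += 1
    return sum(counts) / m


# Hypothetical parameters: 100 messages, 20 workers, 10 cache slots per worker.
m, n, s = 100, 20, 10
caches = random_caches(m, n, s)
avg = average_replication(caches, m)
print(avg == n * s / m)
```

Because every cache holds exactly $s$ distinct messages, the average equals $ns/m$ for any placement, not only a random one; only the per-message counts vary.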