Controlling Data Access Load in Distributed Systems (2312.10360v1)
Abstract: Distributed systems store data objects redundantly to balance the data access load over multiple nodes. Load balancing performance depends mainly on 1) the level of storage redundancy and 2) the assignment of data objects to storage nodes. We analyze the performance implications of these design choices by considering four practical storage schemes that we refer to as clustering, cyclic, block, and random design. We formulate the problem of load balancing as maintaining the load on any node below a given threshold. Regarding the level of redundancy, we find that the desired load balance can be achieved in a system of $n$ nodes only if the replication factor $d = \Omega(\log(n)^{1/3})$, which is a necessary condition for any storage design. For clustering and cyclic designs, $d = \Omega(\log(n))$ is necessary and sufficient. For block and random designs, $d = \Omega(\log(n))$ is sufficient but not necessary. Whether $d = \Omega(\log(n)^{1/3})$ is sufficient remains open. The assignment of objects to nodes essentially determines which objects share the access capacity on each node. We refer to the number of nodes jointly shared by a set of objects as the "overlap" between those objects. We find that many consistently slight overlaps between the objects (block, random) are better than few but occasionally significant overlaps (clustering, cyclic). However, when the demand is "skewed beyond a level", the impact of overlaps becomes the opposite. We derive our results by connecting the load-balancing problem to mathematical constructs that have been used to study other problems. For a class of storage designs containing the clustering and cyclic design, we express load balance in terms of the maximum of moving sums of i.i.d. random variables, which is known as the scan statistic. For random design, we express load balance by using the occupancy metric for random allocation with complexes.
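The notions of overlap and scan statistic mentioned in the abstract can be made concrete with a small sketch. The abstract does not define the designs, so the code below assumes the following: $n$ objects on $n$ nodes; in the clustering design the nodes are split into disjoint groups of $d$ and each object is stored on every node of one group; in the cyclic design object $i$ is stored on nodes $i, i+1, \dots, i+d-1$ modulo $n$. The function names are hypothetical, and the scan statistic is computed over circular windows, which is an assumption made to match the cyclic layout. How the window length relates to the load threshold is left to the paper and is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def clustering_placement(n, d):
    """Assumed clustering design: object i lives on all d nodes of group i // d."""
    if d < 1 or n % d != 0:
        raise ValueError("d must be positive and divide n")
    return [set(range((i // d) * d, (i // d) * d + d)) for i in range(n)]


def cyclic_placement(n, d):
    """Assumed cyclic design: object i lives on nodes i, i+1, ..., i+d-1 (mod n)."""
    if not 1 <= d <= n:
        raise ValueError("d must satisfy 1 <= d <= n")
    return [{(i + j) % n for j in range(d)} for i in range(n)]


def overlap_histogram(placement):
    """Map each pairwise overlap size (number of shared nodes) to its pair count."""
    return Counter(len(a & b) for a, b in combinations(placement, 2))


def scan_statistic(values, window):
    """Maximum sum of `window` consecutive values, with circular wrap-around."""
    n = len(values)
    if not 1 <= window <= n:
        raise ValueError("window must satisfy 1 <= window <= len(values)")
    current = sum(values[:window])
    best = current
    for start in range(1, n):
        # Slide the window one step: add the entering value, drop the leaving one.
        current += values[(start + window - 1) % n] - values[start - 1]
        best = max(best, current)
    return best


if __name__ == "__main__":
    n, d = 12, 3
    print("clustering overlaps:", dict(overlap_histogram(clustering_placement(n, d))))
    print("cyclic overlaps:    ", dict(overlap_histogram(cyclic_placement(n, d))))
    demands = [1, 0, 2, 5, 1, 0, 3, 1, 0, 0, 2, 1]  # illustrative object demands
    print("scan statistic, window 3:", scan_statistic(demands, 3))
```

Under these assumed definitions, every pair of objects in the clustering design shares either all $d$ nodes or none, while in the cyclic design the overlap shrinks by one node per step of circular distance between the two objects.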