How Hard is Asynchronous Weight Reassignment? (Extended Version) (2306.03185v2)
Abstract: Because nodes in wide-area networks perform heterogeneously, distributed storage systems deployed on such networks can perform better with weighted (majority) quorum systems than with their regular variants. A significant limitation of weighted majority quorum systems is their dependence on static weights, which are ill-suited to the dynamic nature of networked environments. To overcome this limitation, such quorum systems require mechanisms for reassigning weights over time according to performance variations. We study the problem of node weight reassignment in asynchronous systems with a static set of servers and a static fault threshold. We prove that solving this problem is as hard as solving consensus, i.e., it cannot be solved in asynchronous failure-prone distributed systems. This result is somewhat counter-intuitive, given recent results showing that two related problems (replica set reconfiguration and asset transfer) can be solved in asynchronous systems. Inspired by these problems, we present two restricted versions of the problem that constrain the weights of servers and the way they are reassigned. We propose a protocol that solves one of the restricted problems in asynchronous systems. As a case study, we construct a dynamic-weighted atomic storage based on this protocol. We also discuss the relationship between the weight reassignment and asset transfer problems and compare our dynamic-weighted atomic storage with reconfigurable atomic storage.
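To make the notion of a weighted majority quorum concrete, the following is a minimal sketch of the standard weighted-voting rule: a set of servers forms a quorum when it holds strictly more than half of the total weight. The server names, the weight values, and the helper names `is_weighted_quorum` and `smallest_quorum_size` are illustrative assumptions, not taken from the paper, whose formal definitions (including the fault threshold) may differ.

```python
def is_weighted_quorum(weights, members):
    """Return True if `members` hold strictly more than half of the total weight.

    `weights` maps a (hypothetical) server name to its weight.
    """
    total = sum(weights.values())
    return 2 * sum(weights[s] for s in members) > total


def smallest_quorum_size(weights):
    """Size of the smallest quorum, found by taking the heaviest servers first."""
    total = sum(weights.values())
    accumulated = 0
    for size, w in enumerate(sorted(weights.values(), reverse=True), start=1):
        accumulated += w
        if 2 * accumulated > total:
            return size
    return len(weights)


# Illustrative weights (assumed): five servers, uniform versus skewed.
uniform = {"s1": 1, "s2": 1, "s3": 1, "s4": 1, "s5": 1}
weighted = {"s1": 3, "s2": 3, "s3": 1, "s4": 1, "s5": 1}

print(smallest_quorum_size(uniform))
print(smallest_quorum_size(weighted))
print(is_weighted_quorum(weighted, {"s1", "s2"}))
print(is_weighted_quorum(weighted, {"s3", "s4", "s5"}))
```

With uniform weights a quorum needs three of the five servers, whereas under the skewed assignment the two heaviest servers already form one. This is the kind of benefit that depends on weights tracking which servers are currently fast, hence the need for reassignment. Because every quorum holds more than half of the total weight, any two quorums still intersect.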