- The paper systematizes a wide range of techniques for scaling BFT consensus by categorizing methods such as optimized communication topologies, pipelining, and cryptographic enhancements.
- It details how approaches like gossip, tree, and star-based networks mitigate quadratic message overhead and alleviate leader bottlenecks in blockchain systems.
- The study emphasizes challenges in practical scalability, advocating for standardized evaluation frameworks to compare performance under diverse network conditions.
The paper provides a comprehensive systematization of recent techniques aimed at improving the scalability of Byzantine Fault Tolerant (BFT) consensus protocols, with a particular focus on blockchain infrastructures. It methodically reviews and categorizes a broad spectrum of approaches that address fundamental performance bottlenecks when moving from small networks, where classical BFT protocols such as PBFT perform well, to large-scale, geographically dispersed environments.
The work begins by outlining the motivation behind scalable BFT consensus. Traditional BFT protocols (exemplified by PBFT and its variants) can achieve high throughput in small systems, up to the order of 10⁵ transactions per second, but their O(n²) message exchanges and centralized leader designs impede performance when hundreds or thousands of nodes participate. Modern blockchain systems, by contrast, demand protocols that not only provide consensus finality at low energy cost but also remain efficient under open membership and dynamic network conditions.
A systematic literature review was conducted using automated search tools with dual sorting strategies (by relevance and recency) to gather works from both highly cited and recent contributions. Fifty-two research papers were selected based on rigorous inclusion and exclusion criteria to ensure that each contribution presented either a novel scalability technique or a meaningful combination of established methods that improved beyond classic protocols.
The paper categorizes scalability-enhancing techniques into several interrelated domains:
- Communication Topologies and Strategies:
The survey details approaches that modify the network topology to reduce communication overhead. In contrast to the all-to-all (clique) communications of legacy protocols, strategies such as star-based (as in HotStuff), tree-based (as seen in Kauri and ByzCoin), and gossip-based or randomized overlay networks (employed by protocols like Gosig and Avalanche) are investigated. These techniques focus on alleviating leader bottlenecks and distributing load more evenly across nodes, although trade-offs such as increased latency from hierarchical dissemination are also examined.
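To make the topology trade-off concrete, the following is a minimal sketch that counts messages for a single voting phase under idealized assumptions: no faults, no retransmissions, and one request plus one reply per link. The function names and the cost model are illustrative assumptions, not formulas taken from the surveyed protocols.

```python
def clique_messages(n: int) -> int:
    """All-to-all phase: every node sends one message to every other node."""
    return n * (n - 1)


def star_messages(n: int) -> int:
    """Star phase: the leader sends to n-1 replicas and collects n-1 replies."""
    return 2 * (n - 1)


def tree_height(n: int, fanout: int) -> int:
    """Height of the shallowest tree with the given fanout that holds n nodes."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    height, capacity, level_width = 0, 1, 1
    while capacity < n:
        level_width *= fanout      # nodes on the next level
        capacity += level_width    # total nodes the tree can now hold
        height += 1
    return height


def tree_root_load(n: int, fanout: int) -> int:
    """Messages handled by the root: one down and one up per child."""
    return 2 * min(fanout, n - 1)


if __name__ == "__main__":
    for n in (4, 100, 1000):
        print(n, clique_messages(n), star_messages(n),
              tree_root_load(n, fanout=10), tree_height(n, fanout=10))
```

In this simplified model a star and a tree exchange the same total number of messages, but the tree caps the load on the root at twice its fanout, at the price of a dissemination path whose length grows with the tree height.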
- Pipelining:
- Out-of-Order Processing, as in PBFT, where instances do not have to complete strictly sequentially.
- Chain-Based Pipelining, such as in HotStuff, where quorum certificates can be reused to certify incremental protocol stages, effectively “stretching” the pipeline.
- Multiplexing Consensus Instances, as exemplified by Kauri, which decouples the number of concurrently running instances from the rigid number of protocol rounds by introducing a configurable stretch factor.
- Additionally, the work discusses the challenges and merits of pipelining in multi-leader and leaderless protocols, emphasizing that careful coordination is required to avoid conflicts while mitigating the inherent bottlenecks of centralized leaders.
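The benefit of pipelining can be illustrated with a deliberately idealized model: each block needs a fixed number of protocol phases, one phase completes per round, and a new block may be proposed every round. This is a sketch of the general pipelining idea only; it ignores faults, view changes, and the actual commit rules of PBFT, HotStuff, or Kauri.

```python
def rounds_sequential(blocks: int, phases: int) -> int:
    """Each block runs all of its phases before the next block starts."""
    return blocks * phases


def rounds_pipelined(blocks: int, phases: int) -> int:
    """One new block enters per round; every in-flight block advances one phase."""
    in_flight = []            # remaining phases for each uncommitted block
    proposed = committed = rounds = 0
    while committed < blocks:
        rounds += 1
        if proposed < blocks:
            in_flight.append(phases)   # propose a new block this round
            proposed += 1
        in_flight = [remaining - 1 for remaining in in_flight]
        committed += sum(1 for remaining in in_flight if remaining == 0)
        in_flight = [remaining for remaining in in_flight if remaining > 0]
    return rounds


if __name__ == "__main__":
    print(rounds_sequential(10, 3), rounds_pipelined(10, 3))
```

Under these assumptions the pipelined schedule needs `blocks + phases - 1` rounds instead of `blocks * phases`, so after the initial warm-up one block commits per round.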
- Cryptographic Primitives:
- Multi-Signatures: The use of BLS-based non-interactive signatures allows aggregation of votes, thereby reducing both transmission and storage costs.
- Threshold Signatures: These further reduce the aggregation cost because any sufficiently large subset of signature shares (a threshold t out of n) can be combined into a single signature that serves as a valid quorum certificate.
- Secret Sharing and Erasure Coding: Techniques like those in FastBFT leverage hardware-based trusted compartments to securely distribute one-time secrets, while erasure coding is used to tolerate packet loss and lower leader loads during reliable broadcast.
- Verifiable Random Functions (VRF): Employed in protocols such as Algorand, VRFs enable non-interactive committee selection (cryptographic sortition) that maintains unpredictability of committee membership, thereby thwarting targeted adversarial attacks.
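The "any sufficiently large subset suffices" property shared by threshold schemes and secret sharing can be sketched with Shamir secret sharing over a prime field. This is an illustration of the threshold idea only: it is not a BLS threshold signature and not the FastBFT construction, and the field modulus and function names are assumptions chosen for the example.

```python
import secrets

# Mersenne prime 2^127 - 1, used as the field modulus (illustrative choice).
PRIME = 2**127 - 1


def split(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for coeff in reversed(coeffs):   # Horner's rule
            acc = (acc * x + coeff) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def reconstruct(points: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    parts = split(123456789, threshold=3, shares=5)
    print(reconstruct(parts[:3]) == 123456789)
```

Any three of the five shares reconstruct the secret, while fewer than three reveal nothing about it. The modular inverse via `pow(den, -1, PRIME)` requires Python 3.8 or later.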
- Independent Groups:
The analysis distinguishes between sharding and hierarchical consensus mechanisms. Sharding partitions the entire network into independent subsets (or shards) so that transactions that pertain only to one shard can be processed in parallel without global coordination. Hierarchical consensus, on the other hand, introduces layers (with “representative” nodes) that coordinate among lower-level groups. Both approaches mitigate the quadratic communication overhead by reducing the number of nodes involved in any given consensus instance, though they introduce new challenges related to cross-shard transaction ordering and inter-group consistency.
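As a rough illustration of why cross-shard transactions are the hard case, the sketch below places accounts on shards by hashing and separates transfers that stay inside one shard from those that span two. Hash-based placement and the `(sender, receiver)` transaction shape are assumptions made for this example, not the scheme of any particular surveyed protocol.

```python
import hashlib


def shard_of(account: str, num_shards: int) -> int:
    """Map an account to a shard using the first 8 bytes of its SHA-256 hash."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(transactions: list[tuple[str, str]], num_shards: int):
    """Separate transfers handled by one shard from cross-shard transfers."""
    intra = {shard: [] for shard in range(num_shards)}
    cross = []
    for sender, receiver in transactions:
        src, dst = shard_of(sender, num_shards), shard_of(receiver, num_shards)
        if src == dst:
            intra[src].append((sender, receiver))   # can run in parallel
        else:
            cross.append((sender, receiver))        # needs inter-shard coordination
    return intra, cross


if __name__ == "__main__":
    txs = [("alice", "bob"), ("carol", "dave"), ("erin", "erin")]
    intra, cross = route(txs, num_shards=4)
    print(sum(len(v) for v in intra.values()), len(cross))
```

Transfers in the `intra` buckets touch a single shard and can be ordered independently, whereas every entry in `cross` requires some form of coordination between the two shards involved.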
- Consensus Committee Selection:
Recognizing that having all nodes participate in every consensus instance is often unnecessary and inefficient, the paper surveys methods to form smaller committees that perform the core consensus work. Randomized sampling (often secured via VRFs) is central to many committee-based designs (e.g., Algorand and RapidChain), where only a randomly determined subset of nodes actively participates in the protocol, while the remainder functions in an observer role. Such schemes effectively balance scalability, safety, and liveness, though attention must be paid to adaptive adversary models and the frequency with which committees are reformed.
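Committee sampling and its safety trade-off can be sketched as follows. The `sortition` function uses a plain SHA-256 hash of a public seed and the node identifier as a stand-in for a VRF; this is a loud simplification, since a real VRF is keyed with each node's secret key and yields a proof that others can verify. The `prob_unsafe_committee` function assumes uniform sampling without replacement and a static adversary, and treats a committee as unsafe once at least one third of its members are Byzantine.

```python
import hashlib
from math import comb


def sortition(seed: bytes, node_ids: list[str], expected_size: int) -> list[str]:
    """Select each node independently with probability expected_size / n.

    Stand-in for cryptographic sortition: a hash replaces the VRF output.
    """
    threshold = expected_size / len(node_ids)
    chosen = []
    for node in node_ids:
        digest = hashlib.sha256(seed + node.encode()).digest()
        draw = int.from_bytes(digest, "big") / 2**256   # value in [0, 1)
        if draw < threshold:
            chosen.append(node)
    return chosen


def prob_unsafe_committee(n: int, byzantine: int, size: int) -> float:
    """Hypergeometric tail: P(at least size/3 committee members are Byzantine)."""
    unsafe = sum(
        comb(byzantine, k) * comb(n - byzantine, size - k)
        for k in range(size + 1)
        if 3 * k >= size
    )
    return unsafe / comb(n, size)


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(1000)]
    committee = sortition(b"round-42", nodes, expected_size=100)
    print(len(committee), prob_unsafe_committee(1000, 200, 100))
```

Larger committees lower the probability of an unsafe sample but raise the communication cost per consensus instance, which is the balance the surveyed designs have to strike.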
Finally, the survey examines the role of trusted execution environments (TEEs), such as Intel’s Software Guard Extensions (SGX), in enhancing BFT scalability. TEEs can guarantee counter monotonicity and thereby prevent equivocation, which reduces the required replica count from 3f+1 to 2f+1, and they can also enable efficient aggregation of message signatures. Though promising, the incorporation of TEEs remains less explored than purely algorithmic improvements, presenting an open avenue for future work.
The paper concludes by discussing open challenges. While a broad array of techniques has been applied, there is no common evaluation framework for comparing these methods across diverse network settings and adversarial models. In addition, combining these techniques is complex, both in terms of theoretical guarantees and in terms of practical implementation and resource requirements, and remains an active research area. The survey underlines the need for further systematic evaluations to provide guidelines for selecting appropriate protocols based on application-specific requirements.
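The link between a monotonic counter and equivocation can be sketched with a toy trusted component: every outgoing message is bound to a fresh counter value, so a faulty replica cannot obtain two valid certificates for different messages under the same counter. The class below is a hypothetical illustration that uses an HMAC with a shared key; a real TEE would rely on hardware isolation and attestation rather than a key known to the verifiers.

```python
import hashlib
import hmac


class TrustedCounter:
    """Toy stand-in for a TEE-backed monotonic counter (hypothetical API)."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._value = 0

    def certify(self, message: bytes) -> tuple[int, bytes]:
        """Bind `message` to the next counter value; values are never reused."""
        self._value += 1
        payload = self._value.to_bytes(8, "big") + message
        tag = hmac.new(self._key, payload, hashlib.sha256).digest()
        return self._value, tag


def verify(key: bytes, counter: int, message: bytes, tag: bytes) -> bool:
    """Check that `tag` certifies exactly this (counter, message) pair."""
    payload = counter.to_bytes(8, "big") + message
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def replicas_needed(f: int, trusted_component: bool) -> int:
    """Replica count for tolerating f faults, as stated in the survey summary."""
    return 2 * f + 1 if trusted_component else 3 * f + 1


if __name__ == "__main__":
    key = b"demo-key"
    tee = TrustedCounter(key)
    counter, tag = tee.certify(b"PREPARE block A")
    print(verify(key, counter, b"PREPARE block A", tag))   # certificate matches
    print(verify(key, counter, b"PREPARE block B", tag))   # conflicting message fails
    print(replicas_needed(1, False), replicas_needed(1, True))
```

Because the counter only moves forward, a conflicting message can only be certified under a later counter value, which honest replicas can detect when they process certificates in counter order.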
Overall, by presenting a detailed taxonomy and critical comparison of the design space for scalable BFT consensus protocols, the paper serves as an authoritative resource for researchers seeking to understand and further advance the state of scalable consensus in blockchain systems.