Carbide: Highly Reliable Networks Through Real-Time Multiple Control Plane Composition

Published 22 May 2020 in cs.NI | (2005.11425v2)

Abstract: Achieving highly reliable networks is essential for network operators to ensure proper packet delivery in the event of software errors or hardware failures. Networks must ensure reachability and routing correctness, such as subnet isolation and waypoint traversal. Existing work in network verification relies on centralized computation at the cost of fault tolerance, while other approaches either build an over-engineered, complex control plane, or compose multiple control planes without providing any guarantee on correctness. This paper presents Carbide, a novel system to achieve high reliability in networks through distributed verification and multiple control plane composition. The core of Carbide is a simple, generic, efficient distributed verification framework that transforms a generic network verification problem to a reachability verification problem on a directed acyclic graph (DAG), and solves the latter via an efficient distributed verification protocol (DV-protocol). Equipped with verification results, Carbide allows the systematic composition of multiple control planes and realization of operator-specified consistency. Carbide is fully implemented. Extensive experiments show that (1) Carbide reduces downtime by 43% over the most reliable individual underlying control plane, while enforcing correctness requirements on all traffic; and (2) by systematically decomposing computation to devices and pruning unnecessary messaging between devices during verification, Carbide scales to a production data center network.

Summary

  • The paper introduces a novel method that uses real-time multiple control plane composition with the CPCheck framework to ensure high network reliability.
  • It employs a flexible CPSpec grammar and a robust DV-protocol to dynamically assign control responsibilities while solving complex reachability verifications.
  • Rigorous evaluations on backbone and data center networks reveal that Carbide scales efficiently with minimal overhead and outperforms centralized verification approaches.

Carbide is a system for achieving high network reliability by composing multiple control planes in real time. It uses distributed verification to check each control plane against operator-specified correctness requirements, so that traffic is handled only by control planes that satisfy them, which reduces downtime.

Introduction and Background

Demand for network reliability has grown with the number of business-critical applications, making downtime increasingly costly. Existing verification approaches tend to be centralized, which introduces a single point of failure and can become a bottleneck that slows reaction to network changes. Carbide avoids these limitations with a distributed architecture.

Architecture and Key Components

Carbide combines distributed verification with multiple control plane composition:

  • CPCheck Framework: At the core of Carbide is the CPCheck distributed verification framework. It converts general network verification requirements into reachability verification problems on a directed acyclic graph (DAG) and solves them with a distributed verification protocol (DV-protocol).

    Figure 1: Carbide architecture.

  • Control Planes: Multiple control plane (CP) instances run concurrently, both centralized (e.g., SDN) and distributed (e.g., OSPF). Through CPCheck, Carbide verifies which packet spaces each CP can safely manage without violating the specified requirements.
  • CPSpec Grammar: Operators use CPSpec to specify correctness and consistency requirements for control planes over different packet spaces. CPComposer then uses these specifications to assign control responsibilities dynamically among the control planes.
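
The composition step can be sketched as follows. This is a minimal illustration, not Carbide's implementation: the function name `compose`, the control plane labels, the prefixes, and the ordered preference list are all assumptions, and the preference list only stands in for what an operator would express in CPSpec (whose syntax is not shown here). For each packet space, the sketch picks the most-preferred control plane whose verification passed.

```python
def compose(packet_spaces, preference, verified):
    """Assign each packet space to the most-preferred verified control plane.

    packet_spaces: iterable of packet-space labels (hypothetical prefixes here)
    preference:    control plane names, most preferred first (assumed input)
    verified:      dict mapping control plane name -> set of packet spaces
                   for which that control plane passed verification
    """
    assignment = {}
    for space in packet_spaces:
        # First control plane in preference order that passed; None if no CP passed.
        assignment[space] = next(
            (cp for cp in preference if space in verified.get(cp, set())),
            None,
        )
    return assignment


# Hypothetical verification results for two control planes.
verified = {
    "sdn": {"10.0.1.0/24"},
    "ospf": {"10.0.1.0/24", "10.0.2.0/24"},
}
result = compose(
    ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"],
    ["sdn", "ospf"],
    verified,
)
print(result)
```

A packet space with no verified control plane maps to `None` here; how such traffic is actually treated is a policy decision this sketch leaves open.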

Distributed Verification Process

Carbide's distributed verification process transforms a generic network requirement, such as waypoint traversal or subnet isolation, into a DV-Network verification problem on a DAG:

Figure 2: DV-Network for general topology and waypoint requirement.

  • Transformation of Generic Networks: Using product graphs, Carbide transforms diverse verification problems into DV-Network verification problems, accommodating a wide range of network paths and constraints.
  • Robust to Failures: The DV-protocol employed by Carbide is resilient to network partitioning and operates independently of the control plane, enabling real-time verification of network correctness.
  • Extensions: The framework also extends to packet modifications, multicast, anycast, link-state checks, and conditional or coverage requirements.
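
The product-graph idea for a waypoint requirement can be sketched as follows. This is a centralized toy illustration under stated assumptions, not the DV-protocol itself: the function names, the `next_hops` forwarding table, and the device labels are hypothetical, and the requirement is interpreted here as "some forwarding path reaches the destination through the waypoint, and none reaches it while bypassing the waypoint". Each product state pairs a device with a flag recording whether the waypoint has been traversed, which reduces the waypoint check to plain reachability.

```python
from collections import deque


def product_edges(next_hops, waypoint):
    """Build product-graph edges over states (device, waypoint_seen)."""
    edges = {}
    for device, successors in next_hops.items():
        for seen in (False, True):
            if device == waypoint and not seen:
                continue  # the waypoint itself only exists in the "seen" state
            edges[(device, seen)] = [
                (nxt, seen or nxt == waypoint) for nxt in successors
            ]
    return edges


def reachable(edges, start):
    """Breadth-first search returning every product state reachable from start."""
    visited, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for nxt in edges.get(state, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return visited


def satisfies_waypoint(next_hops, src, dst, waypoint):
    """True if dst is reached via the waypoint and never while bypassing it."""
    states = reachable(product_edges(next_hops, waypoint), (src, src == waypoint))
    return (dst, True) in states and (dst, False) not in states


# Hypothetical forwarding tables: device -> list of next hops.
bypass = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}  # A-C-D skips B
strict = {"A": ["B"], "B": ["D"], "C": ["D"], "D": []}       # only A-B-D

ok_bypass = satisfies_waypoint(bypass, "A", "D", "B")
ok_strict = satisfies_waypoint(strict, "A", "D", "B")
print(ok_bypass, ok_strict)
```

In a distributed setting, each device would hold only its own part of this graph and exchange results with neighbors rather than running a global search; the sketch shows only the reduction to reachability.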

Performance and Evaluation

Carbide's implementation was evaluated on backbone and data center networks:

Figure 3: Average memory consumption per device and total messages after a FIB update. (a) Backbone (b) Data center.

  • Efficiency: In experiments with SDN and OSPF control planes, Carbide reduced downtime by 43% relative to the most reliable individual underlying control plane, while enforcing correctness requirements on all traffic.
  • Scalability: By decomposing computation across devices and pruning unnecessary messages during verification, Carbide scaled to a production data center network with low per-device memory and messaging overhead, outperforming centralized verification frameworks.

Figure 4: Packet receiving rate for the fast recovery experiments. The failure in (a) affects both the SDN and OSPF CPs, while that in (b) affects only the current SDN.

Implications and Future Directions

Carbide’s integration of multiple control planes through distributed verification has implications for both practical network management and future work on network verification. It points toward more adaptive and resilient network infrastructure that can react to configuration changes and failures without sacrificing speed or reliability. Future AI-driven network management systems could build on Carbide's architecture for more automated decision-making.

Conclusion

Carbide composes multiple control planes under real-time distributed verification to provide high reliability and consistent network operation. By addressing both network verification and the enforcement of correct configurations, it is a candidate building block for scalable, resilient network systems.
