Plane Load Balancer (PLB) Architecture

Updated 22 May 2026

Plane Load Balancer (PLB) is a mechanism that distributes network flows using stateless per-packet steering for optimal load distribution.
It leverages programmable hardware and software-defined infrastructures to achieve scalable performance, low latency, and high throughput.
PLB designs integrate real-time telemetry and control-plane feedback to dynamically balance load, ensuring fairness and efficient resource utilization.

A Plane Load Balancer (PLB) is a data-plane and/or control-plane mechanism for distributing network flows or tasks across compute nodes or servers to optimize load distribution, throughput, flow completion time, and latency. PLB systems use programmable hardware or software-defined infrastructure (e.g., P4-programmable ASIC/FPGA switches, SDN controllers, Kubernetes clusters) to implement scalable, low-latency, in-network load-balancing logic. The core of PLB methodologies is stateless per-packet load steering, with real-time (or near-real-time) updates driven by application, transport, or infrastructure telemetry. The term is used with varying scope across domains: in-network load balancers for data centers, control-plane resource distribution in virtualized/5G networks, and FPGA-accelerated edge-to-core transport architectures.

1. Architectural Patterns and Planes

PLB designs instantiate load management across physical or logical network planes and multiple architectural elements:

Data Plane (D-plane): Physical forwarding elements (hardware switches, FPGAs) responsible for line-rate traffic steering based on pre-installed tables, pipeline logic, or hardware registers. Examples include programmable data planes on PISA (Protocol-Independent Switch Architecture) targets (P4) or FPGAs (Rizzi et al., 2021, Grigoryan et al., 9 May 2025, Sheldon et al., 2023).
Control Plane (C-plane): SDN controllers, Kubernetes control API servers, or host CPUs perform higher-layer orchestration and state dissemination, such as endpoint list updates, telemetry ingestion, or overlay construction (Basu et al., 2020, Grigoryan et al., 9 May 2025).
Middle/Hypervisor Plane (H-plane): In some virtualized or multi-tenant scenarios, a middle layer multiplexes requests or manages flows that cannot be handled directly at the control plane due to latency, load, or policy constraints (Basu et al., 2020).

PLB architectures employ a separation of concerns: low-latency per-packet logic and per-connection affinity live in the data plane, while higher-level placement, instance scaling, and telemetry analytics live in the control or hypervisor planes. Examples include Charon’s split between a P4 pipeline and Verilog-based read-modify-write (RMW) table logic for per-server state (Rizzi et al., 2021), and EJ-FAT’s split between an FPGA pipeline and host-side installation of epoch calendars (Sheldon et al., 2023).

2. Load-Balancing Algorithms and Per-Flow Consistency

PLBs implement several algorithmic primitives:

Consistent Hashing (ECMP-style): Hashing 5-tuple flow signatures to backend indices gives per-flow affinity while spreading flows evenly across backends. P4Kube computes $h = H(\text{srcIP}, \text{dstIP}, \text{srcPort}, \text{dstPort}, \text{proto}) \bmod N$, mapping each flow to one of $N$ active backends via a CRC16 primitive (Grigoryan et al., 9 May 2025). Because the index depends on $N$, affinity holds only while the backend set is unchanged.
Power-of-2-Choices (Po2C): For each arriving flow, two candidate servers are chosen via independent hash indices, and the server with the lower predicted load is selected. Charon predicts load as $g' = \max(0, g - v \cdot (Now - t))$ for observed queue length $g$, velocity $v$, and report time $t$ (Rizzi et al., 2021).
Weighted Calendars: For non-flow-based UDP load (e.g., massive HPC or science instrument events), FPGA “calendar” slots are allocated proportionally to node weights derived from control-plane telemetry (Sheldon et al., 2023).
Stateless Per-Connection Consistency (PCC): Charon achieves PCC via covert channel encoding of the chosen server ID in high-order TCP timestamp bits; no per-flow state is maintained in the data plane (Rizzi et al., 2021). P4Kube relies on pure per-flow consistent hashing; connection stickiness is implicit (Grigoryan et al., 9 May 2025).
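The hashing and Po2C primitives above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from P4Kube or Charon: `zlib.crc32` stands in for the CRC16 primitive, the `salt` argument used to derive two independent hash indices is an assumption, and the names `flow_hash`, `ServerLoad`, `candidates`, and `pick_backend_po2c` are hypothetical.

```python
import zlib
from dataclasses import dataclass


def flow_hash(src_ip, dst_ip, src_port, dst_port, proto, salt=0):
    """Hash a 5-tuple to an integer (CRC32 as a stand-in for the switch's CRC16)."""
    key = f"{salt}|{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key)


def pick_backend_mod_n(flow, n_backends):
    """ECMP-style selection: h = H(5-tuple) mod N."""
    return flow_hash(*flow) % n_backends


@dataclass
class ServerLoad:
    g: float  # last observed queue length
    v: float  # reported velocity (queue units drained per second)
    t: float  # time of the report, in seconds


def predicted_load(s, now):
    """g' = max(0, g - v * (Now - t))."""
    return max(0.0, s.g - s.v * (now - s.t))


def candidates(flow, n_servers):
    """Two candidate indices from two differently salted hashes (may coincide)."""
    return (flow_hash(*flow, salt=1) % n_servers,
            flow_hash(*flow, salt=2) % n_servers)


def pick_backend_po2c(flow, servers, now):
    """Power-of-2-choices: keep the candidate with the lower predicted load."""
    a, b = candidates(flow, len(servers))
    if predicted_load(servers[a], now) <= predicted_load(servers[b], now):
        return a
    return b


if __name__ == "__main__":
    flow = ("10.0.0.1", "10.0.0.2", 12345, 80, 6)
    servers = [ServerLoad(5, 1, 0), ServerLoad(1, 0, 0),
               ServerLoad(9, 4, 0), ServerLoad(3, 0, 0)]
    print("mod-N choice:", pick_backend_mod_n(flow, len(servers)))
    print("Po2C choice: ", pick_backend_po2c(flow, servers, now=1.0))
```

The same 5-tuple always yields the same pair of candidates, so the Po2C decision only needs per-server state (`g`, `v`, `t`), not per-flow state.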

3. Telemetry, Control, and Adaptivity

Modern PLBs exploit tight control-data plane coupling for timely, precise state updates:

Passive Feedback via Protocols: Charon collects load state from server SYN-ACKs embedding queue lengths/velocity in GRE key options—no active polling is required (Rizzi et al., 2021).
Kubernetes Sidecars: P4Kube’s control-plane plugin receives Endpoints/Service events, repackages them into UDP control packets, and updates in-switch registers or via P4Runtime (Grigoryan et al., 9 May 2025).
Continuous Telemetry and Epochal Updates: EJ-FAT’s host polls per-node CPU, queue, and link metrics, computes weights $w_i \propto 1/(c_i + \epsilon)$, and reprograms the FPGA calendars according to the weight deltas (Sheldon et al., 2023).
Arrival-Time Filtering: In SDN/5G, reverse path-flow mechanisms (RPFM) and earliest-deadline-first style decisions ensure latency-bounded flow steering, offloading the H-plane adaptively (Basu et al., 2020).

The data path remains stateless or per-server–indexed, while the control-plane logic is responsible for per-backend liveness, reconfiguration, or load scaling.
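To make the weight-to-calendar step concrete, the sketch below turns per-node costs $c_i$ into normalized weights $w_i \propto 1/(c_i + \epsilon)$ and fills a 512-slot calendar in proportion to them. It is a simplified illustration under stated assumptions, not EJ-FAT's implementation: the largest-remainder rounding, the contiguous slot layout, and the function names are all hypothetical choices.

```python
def node_weights(costs, eps=1e-6):
    """Normalized weights w_i proportional to 1 / (c_i + eps)."""
    raw = [1.0 / (c + eps) for c in costs]
    total = sum(raw)
    return [r / total for r in raw]


def build_calendar(weights, slots=512):
    """Assign `slots` calendar entries to nodes in proportion to their weights.

    Uses largest-remainder rounding so the slot counts sum exactly to `slots`.
    """
    quotas = [w * slots for w in weights]
    counts = [int(q) for q in quotas]
    leftover = slots - sum(counts)
    # Hand the remaining slots to the nodes with the largest fractional parts.
    by_remainder = sorted(range(len(weights)),
                          key=lambda i: quotas[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    calendar = []
    for node, n in enumerate(counts):
        calendar.extend([node] * n)
    return calendar


def node_for_event(calendar, event_number):
    """O(1) data-path lookup; with 512 slots this uses the 9 LSBs of the event number."""
    return calendar[event_number % len(calendar)]


if __name__ == "__main__":
    cal = build_calendar(node_weights([1.0, 1.0, 2.0]))
    print(len(cal), [cal.count(n) for n in range(3)])
    print(node_for_event(cal, 1000))
```

The lookup itself keeps no per-event state: the control plane rewrites the calendar at each epoch, and the data path only indexes into it.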

4. Resource Utilization, Constraints, and Scalability

PLB designs are dictated by resource and protocol constraints:

Memory Footprint: Charon’s per-server state is compressed into score tables ($N$ entries, 64 B each), alias tables ($N$ entries, 4 B each), and IP tables ($N$ entries, 8 B each) (Rizzi et al., 2021). P4Kube stores backend lists as register arrays with a static compile-time upper bound (Grigoryan et al., 9 May 2025). EJ-FAT’s FPGA calendars allocate 512 slots per epoch, using BRAM for O(1) lookup and atomic event grouping (Sheldon et al., 2023).
Scaling Limits: Charon is bounded by server_id field size (typically 16–256 servers); P4Kube backends capped by compile-time constants (MAX_REPL, typically 10 in prototype; production ECMP tables scale to thousands) (Rizzi et al., 2021, Grigoryan et al., 9 May 2025). EJ-FAT’s slot count (9 LSBs of EventNumber) yields 512-way mapping resolution per epoch (Sheldon et al., 2023).
Pipeline Latency: Data-plane processing latency ranges from 8–12 cycles on FPGAs (EJ-FAT) and $\approx 200$ ns per packet with external RMW modules (Charon) to $<1\ \mu\text{s}$ at ASIC scale (P4Kube) (Rizzi et al., 2021, Grigoryan et al., 9 May 2025, Sheldon et al., 2023).
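As a quick worked example of the memory figures above, the following sketch adds up Charon's three per-server tables for a given server count and derives the EJ-FAT slot count from the 9 index bits. The per-entry sizes and the bit count come from the text; the function names are hypothetical, and the result ignores any alignment or padding a real target might add.

```python
SCORE_ENTRY_BYTES = 64  # score table entry size, per the text
ALIAS_ENTRY_BYTES = 4   # alias table entry size, per the text
IP_ENTRY_BYTES = 8      # IP table entry size, per the text


def charon_state_bytes(n_servers):
    """Total per-server table state: N * (64 + 4 + 8) bytes."""
    per_server = SCORE_ENTRY_BYTES + ALIAS_ENTRY_BYTES + IP_ENTRY_BYTES
    return n_servers * per_server


def slots_from_index_bits(bits):
    """Number of calendar slots addressable with `bits` low-order bits."""
    return 1 << bits


if __name__ == "__main__":
    for n in (16, 256):
        print(n, "servers:", charon_state_bytes(n), "bytes")
    print("9 index bits:", slots_from_index_bits(9), "slots")
```

At the upper end of the quoted range (256 servers), the total is 19,456 bytes, or 19 KiB.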

5. Optimization Objectives and Evaluation Metrics

PLBs target multi-objective optimization under stringent network conditions:

Load Fairness: Measured by Jain’s index; Charon reports higher fairness indices than the ECMP baseline across loads (Rizzi et al., 2021).
Latency and Throughput: P4Kube demonstrates up to 50% improvement in average request time over NodePort or external LBs in Kubernetes (Grigoryan et al., 9 May 2025). EJ-FAT achieves fixed, low pipeline latency and line-rate operation at 64-byte MUP (Sheldon et al., 2023).
End-to-End Latency (5G/SDN): MILP formulations for controller–hypervisor placement optimize for worst-case, average, and max-of-avg latency across demand sets, with up to 25 km⋅ms reduction observed and a 30–60% reduction in H-plane load (Basu et al., 2020).
Per-Flow Consistency: Flow completion time (FCT) improves as well; Charon reduces 99th-percentile FCT compared to ECMP under 92.5% load (Rizzi et al., 2021).
Update Responsiveness: Relevant reconfiguration times include the hardware calendar update for EJ-FAT, switch state installation for P4 data planes, and control-plane event propagation, such as the default latency of Kubernetes Endpoints updates (Grigoryan et al., 9 May 2025, Sheldon et al., 2023).
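For reference, Jain's fairness index over per-server loads $x_1, \dots, x_n$ is $J = (\sum_i x_i)^2 / (n \sum_i x_i^2)$, ranging from $1/n$ (all load on one server) to $1$ (perfectly even). A minimal sketch of the computation follows; the function name and the sample loads are illustrative and are not measurements from any of the cited systems.

```python
def jain_index(loads):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    n = len(loads)
    sum_sq = sum(x * x for x in loads)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one non-zero load")
    total = sum(loads)
    return (total * total) / (n * sum_sq)


if __name__ == "__main__":
    print(jain_index([1, 1, 1, 1]))  # 1.0: perfectly even
    print(jain_index([4, 0, 0, 0]))  # 0.25: everything on one of four servers
```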

6. Design Insights, Limitations, and Extensibility

PLB approaches exhibit distinct operational and practical lessons:

Statelessness vs. Expressiveness: Stateless per-flow or per-event mapping eliminates scalability bottlenecks in TCAM or HBM; however, it limits fine-grained health or stickiness policies, and per-endpoint control granularity is tied to hash resolution or register limits (Rizzi et al., 2021, Sheldon et al., 2023).
Protocol Dependency: Certain schemes (e.g., Charon’s PCC) require hosts to honor TCP timestamp options (69% acceptance), or similar per-flow embedding support in QUIC/IPv6 (Rizzi et al., 2021). P4Kube’s support for TCP/UDP traffic requires static layout configuration (Grigoryan et al., 9 May 2025).
Extensibility: These systems can be adapted for multi-plane optimization (controller/hypervisor placement, hierarchical per-rack/global LBs), dynamic epochal weighting, multi-tenant slicing (VRFs/namespaces), and health-aware or load-aware telemetry feedback (Basu et al., 2020, Rizzi et al., 2021, Grigoryan et al., 9 May 2025).
Limitations: Resource-bound server counts, lack of deep per-flow logic (due to P4’s no-dynamic-loop semantics and small register count), and protocol/tooling heterogeneity across domains constrain adoption (Rizzi et al., 2021, Grigoryan et al., 9 May 2025, Sheldon et al., 2023).
Generalizability: The reverse-path offloading idea in SDN/5G control-plane PLBs (terminate request "upstream" without violating end-to-end latency) extends naturally to C-RAN, NFV orchestrators, and other three-plane architectures (Basu et al., 2020).
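The timestamp-based PCC scheme referred to under Protocol Dependency can be illustrated with simple bit packing. The sketch assumes a 32-bit TCP timestamp value whose high-order bits carry the server ID and whose remaining bits carry the clock; the 8-bit ID width and the function names are assumptions for illustration, not Charon's actual layout. It relies on the standard TCP timestamp behaviour in which a peer echoes the received TSval back in TSecr, which is why hosts must honor the option.

```python
TS_BITS = 32                      # TCP timestamp values are 32 bits wide
ID_BITS = 8                       # assumed width of the server-ID field
CLOCK_BITS = TS_BITS - ID_BITS    # bits left for the actual clock value
CLOCK_MASK = (1 << CLOCK_BITS) - 1
ID_MASK = (1 << ID_BITS) - 1


def encode_tsval(server_id, clock):
    """Pack the server ID into the high-order bits of a timestamp value."""
    if not 0 <= server_id <= ID_MASK:
        raise ValueError("server_id does not fit in the ID field")
    return (server_id << CLOCK_BITS) | (clock & CLOCK_MASK)


def decode_server_id(tsecr):
    """Recover the server ID from an echoed timestamp, with no per-flow table."""
    return (tsecr >> CLOCK_BITS) & ID_MASK


if __name__ == "__main__":
    tsval = encode_tsval(server_id=5, clock=123456)
    # The client echoes tsval back as TSecr; the load balancer decodes it.
    print(decode_server_id(tsval))  # 5
```

The ID field width bounds the number of addressable servers, which is the scaling limit noted for Charon in Section 4.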

7. Empirical Results and Comparative Evaluation

Performance validations are reported across several testbeds:

PLB System	Hardware Platform	Key Performance	Load Capacity/Limit
Charon	P4-NetFPGA (ASIC/FPGA)	[…], <200 ns/packet	[…] (prototype, modifiable)
P4Kube	BMv2/PISA	[…] (L7 LB comparison)	[…] (prototype)
EJ-FAT	Xilinx U280 FPGA	[…], 14.9 Mpps, 8–12 cycles	[…] calendars per epoch
SDN/vSDN (Basu et al., 2020)	Simulation/real deployment	25 km⋅ms latency reduction, 30–60% H-plane load drop	[…] nodes

Charon and P4Kube both outperform ECMP (equal-cost multi-path) and static NodePort/LB approaches in fairness and tail latency (Rizzi et al., 2021, Grigoryan et al., 9 May 2025). EJ-FAT achieves atomic, lossless load rebalancing under sustained Gb/s-scale streaming rates (Sheldon et al., 2023). In 5G/vSDN, joint MILP placement of controllers and hypervisors, coupled with the RPFM algorithm, achieves multi-objective latency and load benefits (Basu et al., 2020).

Empirical findings indicate that PLBs deliver line-rate, scalable, and programmable load balancing suited to evolving infrastructure demands, subject to resource and protocol constraints.

References (4)

Charon: Load-Aware Load-Balancing in P4 (2021)

P4Kube: In-Network Load Balancer for Kubernetes (2025)

EJ-FAT Joint ESnet JLab FPGA Accelerated Transport Load Balancer (2023)

Adaptive Control Plane Load Balancing in vSDN Enabled 5G Network (2020)
