Kubernetes Network Drivers (KNDs)

Updated 3 July 2025
  • Kubernetes Network Drivers (KNDs) are a modular, declarative system that replaces legacy CNI models with precise network resource orchestration.
  • They integrate advanced components like DRA, NRI, and OCI enhancements to enable topology-aware placement and optimized driver coordination.
  • Implementations such as DraNet demonstrate improved performance metrics and deterministic resource allocation for high-throughput AI/ML and Telco applications.

Kubernetes Network Drivers (KNDs) define a modular and declarative architecture for high-performance network resource management within Kubernetes, addressing the shortcomings of legacy Container Network Interface (CNI) plugin models. The KND model integrates advanced hardware awareness, composable driver logic, and standardized declarative resource allocation, directly supporting high-throughput, low-latency workloads such as AI/ML and Telco applications. Through the introduction of mechanisms such as Dynamic Resource Allocation (DRA), Node Resource Interface (NRI), and enhancements to the OCI runtime specification, KNDs establish a foundation for a flexible, efficient, and scalable Kubernetes networking ecosystem.

1. Architectural Design: KNDs vs. Legacy Kubernetes Networking

Traditional Kubernetes networking centers on the CNI specification, wherein plugins (such as Calico and Cilium) are invoked imperatively during pod creation, without coordination with scheduler-level topology decisions or advanced resource-allocation policies. Plugins tend to be either "thick" (monolithic, supporting most features internally) or composed through meta-plugins (e.g., Multus), leading to operational complexity, limited qualitative resource expressiveness, and brittle, race-prone bootstrapping.

The Kubernetes Network Driver model introduces a composable and declarative framework by elevating network resource management into Kubernetes’ core. In KND, drivers independently discover, publish, and allocate network interfaces, with user requests expressed through Kubernetes-native APIs. This enables precise description and orchestration of network resources according to hardware, topology, or workload requirements. KNDs benefit from tight scheduler integration, facilitating qualitative selection (e.g., placement based on PCI/domain topology) and optimized resource alignment (such as affinitizing an RDMA NIC to a GPU on the same PCI root).

The primary distinctions are:

  • Declarative resource claims replace imperative plugin executions.
  • Hardware/topology awareness allows affinity, locality, and constraint-driven placement.
  • Composable, independent drivers minimize complexity and operational coupling.
  • Native integration with Kubernetes APIs (DRA/NRI) facilitates richer resource orchestration (a minimal DeviceClass sketch follows this list).
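
To make this concrete, drivers typically expose their devices through the DeviceClass API, which claims then reference. A minimal sketch, assuming a DraNet-style driver registered as "dra.net" (the driver and class names are illustrative):

apiVersion: resource.k8s.io/v1beta1
kind: DeviceClass
metadata:
  name: dranet
spec:
  selectors:
  - cel:
      # match every device published by the (illustrative) dra.net driver
      expression: device.driver == "dra.net"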

2. Component Stack: DRA, NRI, and OCI Enhancements

The KND model is realized through the following mechanisms:

  • Dynamic Resource Allocation (DRA):
    • DRA supersedes the device plugin model, supporting the publication and allocation of network resources with rich, structured metadata. KNDs leverage the ResourceClaim API, where users can specify needs with selectors using expressive logic (including Common Expression Language, CEL). Topology-aware scheduling is enabled by integrating hardware attributes (such as PCIe root, NUMA node, and interface type) at the resource advertisement phase.
    • Latency- or affinity-sensitive workloads can claim not just "any RDMA NIC" but a device with a specified NUMA or PCI alignment (a pod-level consumption sketch follows this list).
  • Node Resource Interface (NRI):
    • NRI introduces event-driven extension points in the container runtime. Independent drivers register to receive lifecycle events (e.g., RunPodSandbox, CreateContainer), enabling privilege-isolated, stage-specific actions such as assigning network interfaces or device nodes into pods/containers.
  • OCI (Open Container Initiative) Runtime Specification Enhancements:
    • The OCI spec has been revised to allow the declarative specification of network interfaces for pod namespaces, reducing the need for custom privilege escalation within network drivers. The standard runtime now performs privileged operations, while KNDs focus on higher-level orchestration.
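
As a pod-level sketch of how a workload consumes these pieces (the pod, claim, and image names are hypothetical), the standard resourceClaims stanza ties a container to a DRA claim:

# Sketch: a pod consuming a pre-created ResourceClaim (names hypothetical)
apiVersion: v1
kind: Pod
metadata:
  name: rdma-worker
spec:
  resourceClaims:
  - name: fast-nic
    resourceClaimName: aligned-rdma-nic   # ResourceClaim created separately
  containers:
  - name: app
    image: registry.example/worker:latest
    resources:
      claims:
      - name: fast-nic   # attach the claimed NIC to this container

At scheduling time the claim is resolved against published devices, and the runtime (via NRI and the OCI enhancements above) places the allocated interface into the pod's network namespace.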

This composable stack enables concurrent, parallel, and isolated operation of diverse drivers (e.g., for accelerators, storage, networking), further supporting the separation of concerns and robustness.

3. DraNet: Reference Implementation and Topology-Aware AI/ML Networking

DraNet is an open-source reference KND implementation providing dynamic, declarative management of network interfaces, including RDMA, as first-class Kubernetes resources. DraNet discovers node interfaces and publishes per-interface metadata as ResourceSlices, enabling sophisticated selectors in ResourceClaims.

For high-performance distributed AI/ML workloads, DraNet eliminates the randomness of the "hardware lottery" (arising from decoupled device plugin assignment) by orchestrating GPU and RDMA NIC co-location at the PCIe and NUMA hierarchy level. A pod can, for example, request "an RDMA NIC on the same PCI root as GPU 0," ensuring optimal data paths and minimal intra-node latency.
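
A hedged sketch of such a request using DRA's cross-request constraints (the GPU device class is a placeholder, and the pcieRoot attribute assumes both drivers publish a common fully qualified alignment attribute):

# Sketch: co-allocate a GPU and an RDMA NIC on the same PCIe root
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: gpu-nic-aligned
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.example.com   # placeholder GPU device class
    - name: nic
      deviceClassName: dranet
    constraints:
    - requests: ["gpu", "nic"]
      # both allocated devices must report the same value for this attribute
      matchAttribute: resource.kubernetes.io/pcieRoot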

DraNet operates entirely within the standard Kubernetes workflow, requiring no device plugin/CNI chaining, annotation hacks, or out-of-band synchronization scripts.

4. Performance Characteristics and Empirical Findings

Quantitative benchmarks in the paper demonstrate the practical benefits of the KND model and DraNet:

  • Pod Startup Latency: DraNet achieves median (P50) pod startup times of 1.8 s, with the 99th percentile at 2.3 s, surpassing legacy VM-based or multi-plugin approaches; this is crucial for dynamic, autoscaled, or ephemeral workloads.
  • Distributed All-Gather/All-Reduce (AI/ML Workloads):
    • With optimal GPU-RDMA alignment, DraNet delivers 46.59 GB/s of All-Gather throughput at an 8 GB message size, a ~60% improvement over unaligned (legacy device plugin) allocations, with significantly reduced variance.
  • Variance: Deterministic, reproducible placement under KND/DraNet eliminates multi-second performance swings, addressing operational unpredictability in large-scale training.

All-Gather throughput in GB/s (mean ± standard deviation):

Message Size    Aligned (mean ± σ)    Unaligned (mean ± σ)
64 KB           1.29 (± 0.02)         1.16 (± 0.06)
1 MB            11.42 (± 0.19)        8.98 (± 0.95)
8 GB            46.59 (± 0.03)        29.20 (± 5.62)

This suggests that topology-aware resource allocation yields both absolute performance gains and predictable, stable application-level behavior.

5. Applications in Telco, Edge, and Future Networking

The KND architecture addresses a critical requirement for future Telco applications, including Network Functions Virtualization (NFV) and 5G/edge deployments. Carrier-grade scenarios demand deterministic placement and isolation of vCPUs, NICs, and memory within the same NUMA domain to minimize jitter and maximize SLA compliance.

KNDs enable declarative, policy-driven resource slicing (e.g., BGP-routed VNF slices, SRv6 tunnels, DPDK-enabled interfaces) directly in Kubernetes. This supports rapid, scalable, and topology-sensitive onboarding of virtualized network functions, meeting operational needs for modern Telco, RAN, and edge environments.
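
For example, a NUMA-pinned fast-path interface could be expressed as a ResourceClaimTemplate and stamped out per VNF pod. A sketch under the assumption that the driver publishes a numaNode attribute (all names illustrative):

# Sketch: template for a NUMA-pinned fast-path NIC (attribute names illustrative)
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: vnf-fastpath-nic
spec:
  spec:
    devices:
      requests:
      - name: nic
        deviceClassName: dranet
        selectors:
        - cel:
            expression: device.attributes["dra.net"].numaNode == 0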

Beyond Telco and AI/ML, the KND model is equipped to support an expanding "galaxy" of drivers, each specializing in different hardware, protocols, or operational domains. This fundamental extensibility sets a foundation for an ecosystem of composable, reusable network drivers in Kubernetes.

6. Technical Details and Illustrative Specifications

  • Resource Discovery and Claiming: DraNet publishes network interfaces as ResourceSlices, each annotated with attributes.
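
On the discovery side, a published ResourceSlice might look like the sketch below (field layout per the DRA v1beta1 API; the "dra.net" driver name and attribute values are illustrative):

apiVersion: resource.k8s.io/v1beta1
kind: ResourceSlice
spec:
  driver: dra.net
  nodeName: node-1
  pool:
    name: node-1
    generation: 1
    resourceSliceCount: 1
  devices:
  - name: rdma0
    basic:
      attributes:
        rdma:
          bool: true
        pciRoot:
          string: "0000:86:00.0"

A matching ResourceClaim can then select a published device by attribute: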

# Example ResourceClaim requesting an aligned RDMA NIC
# (the "dra.net" attribute domain is illustrative)
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
spec:
  devices:
    requests:
    - name: rdma-nic
      deviceClassName: dranet
      selectors:
      - cel:
          expression: device.attributes["dra.net"].pciRoot == "0000:86:00.0"

  • Declarative Network Attachment (OCI Spec):
    • DraNet populates the network interface list in the OCI runtime specification with entries such as:

      {
        "name": "rdma0",
        "type": "ethernet",
        "pciAddress": "0000:86:00.0"
      }
  • Allocation Logic:

$$\text{Select node } n \text{ where } \exists\, g \in \mathit{GPUs}_n,\; r \in \mathit{NICs}_n : \operatorname{PCI}(g) = \operatorname{PCI}(r)$$

  • Pod Startup Time Percentiles:

Percentile    Startup Latency (s)
P50           1.8
P90           2.1
P99           2.3

7. Broader Impact and Future Directions

The KND approach is intentionally designed to stimulate a modular and declarative ecosystem, allowing specialization per data plane, hardware accelerator, or protocol (e.g., SR-IOV, DPDK, MPLS, VPN, secure enclave). This is facilitated by a standardized API surface (as advanced in KEP-4817), ensuring interoperability and a shared understanding across driver implementations.

Operational simplicity is a central outcome: operators can reason about, audit, and upgrade the networking substrate using standard Kubernetes objects rather than ad hoc scripting and imperative hooks.

A plausible implication is that, as the KND model matures and is widely adopted, both researchers and practitioners may shift towards a more compositional methodology in deploying, tuning, and evolving network stacks within Kubernetes at scale. The result is a platform capable of accommodating future hardware and software advances with minimal friction, thus supporting both current and next-generation distributed and high-performance applications.