Regular Octopus CXL Topologies

Updated 12 December 2025
  • Regular Octopus topologies are rigorously defined combinatorial structures: biregular, diameter-two bipartite graphs equivalent to 2-(n, d, 1) balanced incomplete block designs (BIBDs).
  • They enable each host pair to share memory through a unique multi-headed CXL device, ensuring uniform single-hop latency and balanced bandwidth.
  • The design balances pod size, device port count, and cost, offering a scalable and cost-effective solution for composable memory systems.

A regular Octopus topology defines a class of scalable, low-cost Compute Express Link (CXL) memory pooling networks distinguished by rigorous combinatorial structure and explicit performance-versus-cost trade-offs. Formally, a regular Octopus topology is a biregular, diameter-two, $\lambda = 1$ bipartite graph $G = (H \cup P, E)$, directly corresponding to a 2-$(n, d, 1)$ balanced incomplete block design (BIBD) with tightly constrained parameters. This configuration enables each pair of hosts to share memory through a unique multi-headed CXL device (MHD), without requiring full all-to-all connectivity or expensive, high-port switches, thereby achieving memory pooling efficiency comparable to conventional architectures at substantially reduced cost and complexity (Berger et al., 15 Jan 2025).

1. Graph-Theoretic Model

The CXL pod is formalized as a bipartite graph

$$G = (H \cup P,\, E), \qquad |H| = n, \quad |P| = m, \quad E \subseteq H \times P,$$

where $H$ denotes the set of $n$ hosts (servers), $P$ the set of $m$ pools (MHDs), and $E$ the host–pool connectivity edges (physical CXL links). In this framework, each node in $H$ interfaces only with pool nodes in $P$ and vice versa; there are no intra-set links. This representation encodes the strict regularity and sharing constraints characteristic of Octopus topologies.
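
For concreteness, a minimal Python sketch (an illustration added here, not notation from the paper) represents the edge set as a host-to-pools map. The instance shown is the smallest non-trivial regular Octopus, generated from the cyclic (7, 3, 1) difference set $\{0, 1, 3\} \bmod 7$, with $n = m = 7$ and $k = d = 3$:

```python
# A CXL pod as a bipartite graph: map each host h in H to the set of
# MHD pools in P that it is physically linked to (the edge set E).
# Instance: the Fano-plane Octopus (n = m = 7, k = d = 3), generated
# from the cyclic (7, 3, 1) difference set {0, 1, 3} mod 7.
edges = {h: {(h + s) % 7 for s in (0, 1, 3)} for h in range(7)}

hosts = set(edges)                    # H, |H| = n = 7
pools = set().union(*edges.values())  # P, |P| = m = 7
```

Any two hosts in this instance share exactly one pool; for example, hosts 0 and 1 meet only at pool 1.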

2. Biregularity and Key Constraints

A regular Octopus enforces degree regularity on both sides:

$$\forall\, h \in H:\ \deg_G(h) = k, \qquad \forall\, p \in P:\ \deg_G(p) = d,$$

meaning every host connects to exactly $k$ MHDs and every MHD is attached to exactly $d$ hosts. Additionally, the critical $\lambda = 1$ property requires that

$$\forall\, h_i \neq h_j \in H:\quad \bigl|\{\, p \in P : (h_i, p) \in E \,\wedge\, (h_j, p) \in E \,\}\bigr| = 1,$$

ensuring every distinct pair of hosts shares exactly one common pool. This uniquely positions each host pair at distance two in $G$, with a single mutual rendezvous device, ruling out both congestion and ambiguity in mediation paths.
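
Both constraints are mechanically checkable. The following validator (an illustrative sketch; the function name and signature are choices made here) verifies biregularity and the $\lambda = 1$ property for a host-to-pools map like the `edges` example above:

```python
from itertools import combinations

def is_regular_octopus(edges: dict, k: int, d: int) -> bool:
    """Check biregularity (host degree k, pool degree d) and the
    lambda = 1 property (every host pair shares exactly one pool)."""
    pools = set().union(*edges.values())
    if any(len(ps) != k for ps in edges.values()):      # deg(h) = k
        return False
    if any(sum(p in ps for ps in edges.values()) != d   # deg(p) = d
           for p in pools):
        return False
    return all(len(edges[a] & edges[b]) == 1            # lambda = 1
               for a, b in combinations(edges, 2))
```

On the Fano-plane instance, `is_regular_octopus(edges, 3, 3)` returns `True`; deleting any single edge breaks both biregularity and the $\lambda = 1$ count.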

3. Construction and Balanced Incomplete Block Designs

The construction of regular Octopus topologies leverages the theory of balanced incomplete block designs (BIBDs). Specifically, an Octopus configuration with parameters $(n, m, k, d, \lambda = 1)$ corresponds to a 2-$(v = n,\ k = d,\ \lambda = 1)$ BIBD, treating hosts as “treatments” and pools as “blocks” of size $d$. The admissible parameter sets are governed by the characteristic BIBD identities

$$n k = m d, \qquad n = 1 + k(d - 1), \qquad m = \frac{n k}{d} = \frac{k\,(1 + k(d - 1))}{d},$$

together with the divisibility conditions:

  • $1 < d < n$
  • $n - 1 \equiv 0 \pmod{d - 1}$
  • $n k \equiv 0 \pmod{d}$

Classical existence results and explicit combinatorial constructions (e.g., via projective planes or difference sets) provide infinite families and practical construction recipes under these arithmetic constraints.
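
These identities make parameter exploration mechanical. The sketch below (an illustration added here; `admissible_params` is not a function from the paper) fixes the MHD port count $d$, sweeps the host degree $k$, and emits candidates passing the necessary conditions; existence of an actual design must still be settled by the constructions cited above:

```python
def admissible_params(d: int, k_max: int = 16):
    """Yield (k, n, m) tuples satisfying the necessary conditions
    n = 1 + k(d-1), m = nk/d, 1 < d < n, (n-1) % (d-1) == 0, and
    nk % d == 0. Necessary, not sufficient: a 2-(n, d, 1) design
    must still exist for the candidate to be realizable."""
    for k in range(1, k_max + 1):
        n = 1 + k * (d - 1)
        if not (1 < d < n):
            continue
        if (n - 1) % (d - 1) or (n * k) % d:
            continue
        yield k, n, (n * k) // d

# Example: 4-port MHDs (d = 4) admit k=4 -> (n, m) = (13, 13), the
# projective plane of order 3, and k=5 -> (n, m) = (16, 20).
for k, n, m in admissible_params(4, k_max=8):
    print(f"k={k}: n={n} hosts, m={m} MHDs")
```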

| Parameter | Role in Topology | BIBD Interpretation |
|-----------|------------------|---------------------|
| $n$ | # of hosts | # of “treatments” |
| $m$ | # of MHDs/pools | # of “blocks” |
| $k$ | Host degree | # of blocks per treatment |
| $d$ | Pool degree | Block size |

4. Performance, Connectivity, and Pooling Semantics

The $\lambda = 1$ condition implies that the associated bipartite graph has diameter two: any two hosts connect via a unique shared pool. Consequently, the maximum communication path length (excluding intra-host/inter-pool aspects) is two hops. This yields several direct properties:

  • Pooling semantics: Each host pair can directly collaborate or share memory via their sole common MHD; every host reaches the shared device in a single hop, so host pairs communicate at uniform distance-two latency for critical shuffle or 1:1 messaging patterns.
  • Bandwidth balance: Each host’s $k$ links allow even memory interleaving across its assigned MHDs, and the regular port distribution ensures no network hotspots.
  • Memory allocation: The topology’s structure enables straightforward stripe-based interleaving and compute-to-memory mapping with load balance, as sketched below.
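
A minimal interleaving sketch (an illustration added here; the stripe size and round-robin policy are assumptions, not the paper’s allocation algorithm) shows how a host can stripe addresses evenly across its $k$ attached MHDs:

```python
def mhd_for_address(addr: int, host_pools: list,
                    stripe_bytes: int = 4096) -> int:
    """Map a physical address to one of the host's MHDs by striping:
    consecutive stripe_bytes-sized chunks rotate round-robin across
    the host's pools, spreading load and bandwidth evenly."""
    return host_pools[(addr // stripe_bytes) % len(host_pools)]

# Host 0 of the Fano-plane instance attaches to pools {0, 1, 3}.
for addr in range(0, 4 * 4096, 4096):
    print(hex(addr), "-> MHD", mhd_for_address(addr, [0, 1, 3]))
```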

5. Cost-Benefit Trade-Offs and Resource Scaling

The principal trade-off in regular Octopus architectures is between pod size ($n$), device port count ($d$), and per-host cost. Cost per host is proportional to $(k/d) \cdot \text{(cost of one MHD)}$, and the topology allows amortization over many inexpensive, small-port MHDs. Increasing $d$ (larger-port MHDs) raises pod capacity according to

$$n = 1 + k(d - 1),$$

but at the price of rising device and intrinsic latency costs; increasing $k$ improves host connectivity but imposes greater network interface cost per host. Optimal selection of $(k, d)$ places the design on the cost–pod-size–latency Pareto frontier. This gives datacenter operators the flexibility to tailor deployments closely to application-level and economic constraints.
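
A short sweep (illustrative only: unit MHD cost, with switch, cabling, and latency terms omitted) makes the trade-off concrete by tabulating pod size against the per-host cost share $k/d$:

```python
# Tabulate pod size n = 1 + k(d-1) against per-host MHD cost share k/d.
# Assumes MHD price is the only cost term (a deliberate simplification).
for d in (4, 8, 16):
    for k in (2, 4, 8):
        n = 1 + k * (d - 1)
        print(f"k={k:2d}, d={d:2d}: n = {n:4d} hosts, "
              f"cost/host = {k / d:.2f} MHD-equivalents")
```

Larger $d$ grows the pod and dilutes per-host cost at the expense of bigger, slower devices; larger $k$ buys connectivity at higher per-host interface cost, matching the Pareto framing above.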

6. Examples and Existence Guarantees

Canonical infinite families of regular Octopus networks include:

  • Finite projective planes of order $q$, yielding $n = q^2 + q + 1$, $d = q + 1$, $k = q + 1$, $m = q^2 + q + 1$.
  • Cyclic difference sets, enabling cyclic BIBDs for many $(n, d)$ pairs.
  • Wilson’s theorems, which guarantee the existence of solutions for all sufficiently large $n$ matching the divisibility criteria.

These constructions allow selection of practical parameters for real-world CXL pods and guide both hardware procurement and topology planning.
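
As a construction sketch for the first family (an illustration added here, valid for prime $q$ only; prime powers require full GF($q$) arithmetic rather than integers mod $q$), the projective plane PG(2, $q$) yields the host-to-pools map directly from homogeneous coordinates:

```python
def projective_plane_octopus(q: int) -> dict:
    """Regular Octopus from PG(2, q) for prime q:
    n = m = q^2 + q + 1, k = d = q + 1. Hosts are projective points,
    pools are lines; host i links to pool j iff their coordinate
    dot product is 0 (mod q)."""
    # Canonical representatives of projective points over Z/qZ.
    pts = ([(1, y, z) for y in range(q) for z in range(q)]
           + [(0, 1, z) for z in range(q)]
           + [(0, 0, 1)])
    return {i: {j for j, line in enumerate(pts)
                if sum(a * b for a, b in zip(p, line)) % q == 0}
            for i, p in enumerate(pts)}
```

`projective_plane_octopus(2)` yields a 7-host, 7-MHD pod equivalent to the Fano-plane instance used earlier, and `projective_plane_octopus(3)` the $(n, m) = (13, 13)$, $k = d = 4$ pod; both pass `is_regular_octopus` with $k = d = q + 1$.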

7. Practical Implementation and Empirical Results

Simulation with realistic production traces demonstrates that Octopus topologies achieve memory savings on par with more expensive, fully-pooled designs. Hardware evaluation confirms that Octopus configurations reduce RPC latency by $3\times$ relative to RDMA (Berger et al., 15 Jan 2025). The formal structure and allocation algorithms developed for these graphs underpin robust, production-ready CXL pooling fabrics, enabling cost-effective scaling without performance compromise.

In summary, regular Octopus topologies offer a mathematically rigorous framework for designing uniform-latency, low-cost, diameter-two CXL pooling fabrics. Their equivalence to special BIBDs ensures concrete existence/falsifiability criteria for design parameters, guiding efficient hardware realization and scalable deployment for composable memory systems.
