Distributed Persistence Domain (DPD)

Updated 4 July 2026
  • Distributed Persistence Domain (DPD) is a multi-context abstraction that, in persistent-memory systems, uses CXL switch augmentation to enforce write serialization, read-latest guarantees, and crash consistency.
  • In topological data analysis, DPDₖ(X) is defined as the collection of persistence diagrams from all k-element subsets of a finite metric space, with stability measured via quasi-isometry and Bottleneck distance.
  • A distributed persistent homology framework leverages local computations, cellular cosheaves, and collapsing spectral sequences to reconstruct global topology from filtered spaces.

Distributed Persistence Domain (DPD) is an overloaded research term used in distinct technical senses. In computer systems, it denotes a CXL-fabric-wide persistence abstraction for disaggregated persistent memory pooling, introduced to move persistence support into CXL switches while preserving write-serialization, read-latest behavior, and crash consistency (Hadi et al., 5 Jun 2026). In topological data analysis (TDA), the closely related notation $\mathrm{DPD}_k(X)$ denotes the multiset of persistence diagrams of $k$-element subsets of a finite metric space, with inverse theorems relating this representation to quasi-isometry (Solomon et al., 2021). A further distributed-persistence framework computes global persistent homology from local computations via cellular cosheaves and a Mayer–Vietoris spectral sequence (Yoon et al., 2020). The shared acronym therefore refers not to a single unified construct, but to separate abstractions developed for persistent-memory systems and for distributed persistent homology.

1. Terminological scope

The acronym DPD appears in at least three closely adjacent but technically different contexts.

| Usage | Object | Core ingredients |
| --- | --- | --- |
| Persistent memory pooling | Distributed Persistence Domain | CXL switches, memory endpoints, persistence invariants |
| Inverse-theorem TDA | $\mathrm{DPD}_k(X)$ | $k$-subsets, persistence diagrams, $d_{QI}$, $d_{HB}$ |
| Distributed persistent homology | DPD framework | filtered spaces, cellular cosheaves, spectral sequences |

In the systems usage, the central problem is that “persist operations must traverse the entire CXL fabric, including switches, links, and protocol layers, before reaching remote persistent memory,” and the proposed remedy is to extend CXL switches with persistence support (Hadi et al., 5 Jun 2026). In the TDA usage of Solomon–Wagner–Bendich, the central claim is that “the correct invariant is not the persistence diagram of $X$, but rather the collection of persistence diagrams of many small subsets” (Solomon et al., 2021). In the distributed persistent homology framework of Yoon–Ghrist, the objective is distributed computation of global persistent homology from local filtered computations using “cellular cosheaves and spectral sequences” (Yoon et al., 2020).

A common source of confusion is that these constructions share vocabulary—distributed persistence, local persistence, and assembly across parts—while operating on different mathematical and systems objects. This suggests that disambiguation by field is essential: CXL persistence-domain research and TDA distributed-persistence research are not describing the same mechanism.

2. Distributed Persistence Domain in persistent memory pooling

In "Distributed Persistence Domain for Persistent Memory Pooling" (Hadi et al., 5 Jun 2026), DPD is a formal abstraction for persistence in disaggregated memory systems connected through CXL. The model assumes “a set $H=\{h_1,\ldots,h_n\}$ of compute hosts,” each issuing cache-line flush and fence operations toward a pool of persistent memories. The DPD participants are $P=S\cup M$, where $S=\{s_1,\ldots,s_k\}$ is the set of persistent CXL switches and $M$ is the set of memory-controller endpoints. Routing is deterministic: each packet from a host in $H$ to a memory endpoint in $M$ follows a unique path through the fabric.

Writes are modeled as elements of a write set, each carrying an address, an issuing host, and a program-order sequence number. Program order at a host orders two writes issued by that host according to their sequence numbers. Durability order in the DPD is defined for writes to the same address: for any address, a total order over the writes to that address is required such that if one write precedes another in program order, then it also precedes it in the durability order, and the durable copy of the earlier write is installed at least as early in the DPD participants as that of the later write.

The consistency invariants are explicit. (W1) Write serialization requires that for writes to the same address, host program order is respected in the DPD durability order. (W2) Read-latest requires that any read observe the value of the unique write greatest under the durability order among those that precede the read in real time. (W3) Crash-consistency requires that, after any crash, the surviving participants agree on a single, prefix-sealed sequence of writes per address consistent with the durability order.
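The invariants can be made concrete with a small checker. The following is a minimal sketch, not the paper's formalism: the `Write` record, `respects_w1`, and `latest_value` are hypothetical names, the durability order is assumed to be given as one list per address, and the read-latest helper assumes every listed write precedes the read in real time.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Write:
    host: str   # issuing host (hypothetical field name)
    seq: int    # program-order sequence number at that host
    addr: int   # target address
    value: int


def respects_w1(durability: Dict[int, List[Write]]) -> bool:
    """W1 sketch: per address, writes from one host must appear in increasing seq order."""
    for addr, order in durability.items():
        last_seq: Dict[str, int] = {}
        for w in order:
            if w.addr != addr:
                return False  # a write filed under the wrong address
            if w.host in last_seq and w.seq <= last_seq[w.host]:
                return False  # a younger write was ordered before an older one
            last_seq[w.host] = w.seq
    return True


def latest_value(durability: Dict[int, List[Write]], addr: int) -> Optional[int]:
    """W2 sketch: a read returns the value of the last write in the durability order.

    Assumption: all writes in the list precede the read in real time.
    """
    order = durability.get(addr, [])
    return order[-1].value if order else None
```

Writes from different hosts may interleave freely in this model; only the relative order of writes issued by the same host is constrained.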

This formulation replaces the older monolithic persistence-domain assumption—“where a processor’s cache-flush and fence instructions guarantee that data are durably ordered at one co-located memory controller”—with a fabric-wide abstraction spanning switches and memory endpoints. A plausible implication is that persistence correctness can no longer be treated as a property of a single controller boundary; it must be defined over the routed fabric itself.

3. Correctness hazards and derived requirements

The motivation for DPD is that “naïvely pushing persistence into CXL switches introduces correctness hazards” (Hadi et al., 5 Jun 2026). Two hazards are identified.

The stale-read hazard arises when one host issues a write $w$ to address $a$ through a switch $s_1$, which persists it in a local buffer and returns an ACK before draining it to memory. A later read of $a$ from another host routed via a different switch $s_2$ may return an older value from the memory endpoint or from $s_2$’s buffer, violating the read-latest invariant (W2).

The write-reordering hazard arises when $w_1$ is persisted only at $s_1$, then a later write $w_2$ drains to memory first, and finally $w_1$ drains afterward. The memory-side order becomes $w_2$ then $w_1$, violating the write-serialization invariant (W1).

From these hazards, four “minimal requirements” are derived. (R1) Global write-serialization requires that a younger write not overtake an older write on any path. (R2) Path-wide read-forwarding requires that a read be routed or redirected to the participant holding the most recent write for that address, or that it force an on-the-fly drain before the reply. (R3) Durable coverage requires that at least one copy of each write survive in the nonvolatile state of some participant at fence-retire time. (R4) Reconfiguration-safe draining requires a “drain-path” barrier when participants change, for example on process migration or switch failure.

These requirements define the difference between a distributed persistence domain and “naive ‘per-switch’ persistence.” A common misconception is that nonvolatile switch buffers alone suffice to accelerate remote persistence. The hazard analysis shows that without coordination they break both read-latest and write-serialization.
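Both hazards can be reproduced in a few lines. The sketch below is a toy model under stated assumptions, not the paper's simulator: `NaiveSwitch` is a hypothetical class with an uncoordinated local buffer and early ACK, memory is a plain dictionary, and the younger write in the reordering case is assumed to travel through a second switch.

```python
class NaiveSwitch:
    """Hypothetical per-switch persist buffer with early ACK and no coordination."""

    def __init__(self, memory):
        self.memory = memory   # shared dict standing in for the memory endpoint
        self.buffer = {}       # addr -> value persisted locally, not yet drained

    def write(self, addr, value):
        self.buffer[addr] = value   # persisted in the switch only
        return "ACK"                # early ACK, before memory is updated

    def drain(self, addr):
        if addr in self.buffer:
            self.memory[addr] = self.buffer.pop(addr)

    def read(self, addr):
        # This switch sees only its own buffer and the memory endpoint.
        return self.buffer.get(addr, self.memory.get(addr))


def stale_read_demo():
    memory = {0x40: "old"}
    s1, s2 = NaiveSwitch(memory), NaiveSwitch(memory)
    s1.write(0x40, "new")    # ACKed while still buffered at s1
    return s2.read(0x40)     # read routed via s2 misses the buffered write


def reorder_demo():
    memory = {0x40: "old"}
    s1, s2 = NaiveSwitch(memory), NaiveSwitch(memory)
    s1.write(0x40, "w1")     # older write, persisted only at s1
    s2.write(0x40, "w2")     # younger write (assumed to use another switch)
    s2.drain(0x40)           # w2 reaches memory first
    s1.drain(0x40)           # w1 drains later and overwrites w2
    return memory[0x40]
```

In this model `stale_read_demo()` returns the old value (a W2 violation) and `reorder_demo()` leaves the older write `w1` in memory (a W1 violation), which is the behavior that R1 and R2 are meant to rule out.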

4. Persistent CXL Switch architecture, recovery, and evaluation

The realization of the systems DPD is the Persistent CXL Switch (PCS) (Hadi et al., 5 Jun 2026). To meet R1–R4 “with minimal change to hosts or OS,” each switch is augmented with three blocks: a Persist Buffer Controller (PBC), a Persist Buffer (PB), and a PBC Selector (PBCS). The PB contains “32 fully-assoc entries” and stores “Data + Addr + Status tables.” The PBC contains “Request & Response FIFOs,” an “Update/Read PB Entry unit,” and a “Drain PB Entry unit.” The PBCS maintains a “Status Table mirror,” a “Request Table track in-flight,” and “PB-hit/pending lookup logic.”

The selection logic enforces R1–R2 by redirecting writes and reads to the PBC whenever the address is already present in the PB status table or request table. On WRITE→PBC, the switch inserts or overwrites the PB entry, sets status=DATA, and “immediately ACK to host (early ACK).” It then drains asynchronously, marking status through DRAIN_ISSUED, DRAIN, and FREE on downstream ACK. On READ→PBC, if the address is in the PB with status DATA or DRAIN_ISSUED, the switch returns the data immediately; otherwise it forwards to the normal path, “possibly after forcing a drain of older PBEs.” Write-coalescing occurs when a new write to the same address arrives while a DATA-state PB entry already exists; the entry is simply overwritten “without growing buffer occupancy.”
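The write, read, and drain behavior described above can be sketched as a small state machine. This is an illustrative model only: `PersistBuffer` and its method names are hypothetical, the FREE state is represented by removing the entry, the intermediate DRAIN state is omitted, and both the eviction choice (oldest-inserted entry) and the handling of a write that hits a DRAIN_ISSUED entry are assumptions that the source does not specify.

```python
from enum import Enum


class Status(Enum):
    DATA = "DATA"
    DRAIN_ISSUED = "DRAIN_ISSUED"


class PersistBuffer:
    """Toy model of one switch's persist buffer (PB)."""

    def __init__(self, memory, capacity=32):
        self.memory = memory     # dict standing in for the memory endpoint
        self.capacity = capacity
        self.entries = {}        # addr -> [value, Status]
        self.coalesced = 0

    def write(self, addr, value):
        entry = self.entries.get(addr)
        if entry is not None and entry[1] is Status.DATA:
            entry[0] = value     # write-coalescing: occupancy does not grow
            self.coalesced += 1
            return "ACK"
        if entry is not None:
            self.complete_drain(addr)          # assumption: finish the in-flight drain first
        if len(self.entries) >= self.capacity:
            victim = next(iter(self.entries))  # assumption: evict the oldest-inserted entry
            self.issue_drain(victim)
            self.complete_drain(victim)
        self.entries[addr] = [value, Status.DATA]
        return "ACK"             # early ACK, before memory is updated

    def issue_drain(self, addr):
        self.entries[addr][1] = Status.DRAIN_ISSUED

    def complete_drain(self, addr):
        value, _ = self.entries.pop(addr)      # downstream ACK: entry becomes FREE
        self.memory[addr] = value

    def read(self, addr):
        entry = self.entries.get(addr)
        if entry is not None:                  # DATA or DRAIN_ISSUED: serve from the PB
            return entry[0]
        return self.memory.get(addr)           # otherwise take the normal path

    def drain_path(self):
        """Push every buffered entry downstream, as a DrainPath-style barrier would."""
        for addr in list(self.entries):
            self.complete_drain(addr)
```

A read that hits the buffer never observes an older memory value, which is the single-switch part of R2; coordinating several such buffers across switches is outside this sketch.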

Recovery is split into processor-side and fabric-side cases. For a processor-side crash, existing host-side crash recovery such as “persistent programming library, PMDK” is assumed. If the process restarts on a new host, the OS issues a special DrainPath barrier to all memory endpoints or all switches, so that all DATA-state PB entries are pushed downstream before recovery resumes. For a switch or fabric-wide crash, a single switch reboot retains PB contents because each PCS buffers DATA-state entries in “nonvolatile memory or battery-backed SRAM.” After a multi-switch outage, however, multiple switches may each hold differing latest versions of some addresses; the fabric manager therefore issues a global DrainPath so that “memory controllers end up with the true [durability-order] prefix.”

The evaluation uses “gem5+POND CXL model” with an “8-core, 4 GHz” system, “private 32 KB L1,” “shared 256 KB L2,” “One CXL hop, 150 ns PM read, 500 ns PM write, 128 GB/s links,” and “PB size = 32 entries, 0.3 ns local lookup.” Workloads include SPLASH-4 with persistent checkpointing and YCSB on Memcached with workloads A, B, C, D, and F. The schemes are NoPB, DPD_Eager, DPD_Lazy, and Adaptive_CP. Reported results include average SPLASH-4 speedups of 32% for DPD_Eager, 33% for DPD_Lazy, and 36% for Adaptive_CP; average YCSB speedups of 31%, 34%, and 36% respectively; and “FFT sees up to 130% speedup under Adaptive_CP.” The commit-stage stall breakdown attributes “60–80% of stalls in NoPB” to clwb/mfence waits of “200–800 ns,” described as “nearly eliminated by PCS early ACKs.” Additional results report “Read-hit rates at PCS up to 20% (VOLREND), write-coalescing up to 80% (VOLREND),” and under four CXL hops, PCS schemes retain “30–40% net speedup.”

These results support the paper’s conclusion that moving persistence into the fabric can “significantly reduce persist latency, enable read forwarding, and coalesce writes, while preserving correctness and crash consistency.”

5. $\mathrm{DPD}_k(X)$ as a topological invariant of finite metric spaces

In "From Geometry to Topology: Inverse Theorems for Distributed Persistence" (Solomon et al., 2021), the distributed persistence domain is defined for a finite metric space $X$ and an integer $k$. Writing

$$\binom{X}{k} = \{\, S \subseteq X : |S| = k \,\}$$

and letting $\mathrm{PD}(S)$ be the chosen persistence-diagram invariant of a subset $S$, the definition is

$$\mathrm{DPD}_k(X) = \{\, \mathrm{PD}(S) : S \in \tbinom{X}{k} \,\}.$$

A sampled subcollection of $\binom{X}{k}$ gives an empirical DPD.
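As a concrete instance of the definition, the sketch below enumerates all $k$-element subsets of a small point cloud and computes one diagram per subset. The choice of invariant is an assumption made for illustration: degree-0 Vietoris–Rips persistence, whose finite bars are born at 0 and die at the minimum-spanning-tree edge lengths, with an edge entering the filtration at a scale equal to its length. The function names `h0_diagram` and `dpd` are hypothetical.

```python
from itertools import combinations
from math import dist


def h0_diagram(points):
    """Finite degree-0 Rips bars of a point set: (0, MST edge length), via Kruskal."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                # this edge merges two components: one bar dies
            parent[ri] = rj
            deaths.append(length)
    return tuple((0.0, d) for d in deaths)


def dpd(points, k):
    """List (multiset) of diagrams over all k-element subsets of the points."""
    return [h0_diagram(subset) for subset in combinations(points, k)]
```

With this invariant, `dpd(points, 2)` contains one bar per pair whose death time is that pair's distance, so the two-point diagrams already encode the pairwise distances of the cloud.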

Two metrics are central. The quasi-isometry distance $d_{QI}$ is defined via quasi-isometries between finite metric spaces, that is, maps that distort pairwise distances by at most bounded multiplicative and additive constants. The Hausdorff–Bottleneck distance $d_{HB}$ is defined on families of persistence diagrams using the Bottleneck distance. The principal theorem states that the map

$$X \mapsto \mathrm{DPD}_k(X)$$

is a global quasi-isometry from finite metric spaces with $d_{QI}$ into DPDs with $d_{HB}$, with explicit upper and lower bounds.

The upper bound is attributed to stability, while the lower bound uses a “rounding lemma” together with “inclusion–exclusion on the Euler–characteristic curves of all subsets of sizes” up to $k$ to descend to the two-point diagram and recover approximate pairwise distances. The paper reports four explicit constants for these bounds.
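A distance of this kind can be sketched for very small inputs. The code below assumes that the Hausdorff–Bottleneck distance is the Hausdorff distance between two non-empty families of diagrams, measured with the Bottleneck distance; that reading follows the name and is not confirmed by the source. The Bottleneck distance is computed by brute force over all matchings, which is only practical for diagrams with a handful of points. The names `bottleneck` and `hausdorff_bottleneck` are hypothetical.

```python
from itertools import permutations


def bottleneck(d1, d2):
    """Brute-force Bottleneck distance between two small diagrams of (birth, death) pairs."""
    # Pad each side with diagonal slots (None) so any point may be matched to the diagonal.
    a = list(d1) + [None] * len(d2)
    b = list(d2) + [None] * len(d1)

    def cost(p, q):
        if p is None and q is None:
            return 0.0
        if p is None:
            return (q[1] - q[0]) / 2   # L-infinity distance from q to the diagonal
        if q is None:
            return (p[1] - p[0]) / 2
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    best = float("inf")
    for perm in permutations(range(len(b))):
        worst = max((cost(a[i], b[j]) for i, j in enumerate(perm)), default=0.0)
        best = min(best, worst)
    return best


def hausdorff_bottleneck(family1, family2):
    """Hausdorff distance between two non-empty families of diagrams."""

    def one_sided(f, g):
        return max(min(bottleneck(x, y) for y in g) for x in f)

    return max(one_sided(family1, family2), one_sided(family2, family1))
```

The factorial cost of the matching search is acceptable here only because diagrams of $k$-element subsets are tiny for small $k$; a production implementation would use a proper matching algorithm.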

Sampling is treated explicitly because “in practice one never enumerates all [$k$-element] subsets.” For the main theorem, the sampled subsets need only cover all pairs of points, and the paper bounds the number of random subsets that suffices for this. Computationally, “Perfect parallelism” is emphasized: each of the sampled subsets can be sent to an individual thread or machine. For fixed small $k$, the total work summed over the sampled subsets is described as “vastly smaller” than computing persistence on the full cloud.
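The pair-coverage condition is easy to explore empirically. The following sketch draws uniformly random $k$-element subsets of $\{0,\ldots,n-1\}$ until every pair of indices lies in at least one sampled subset. It is a simulation under stated assumptions (uniform sampling, $2 \le k \le n$, a fixed seed) and does not reproduce the paper's bound; `subsets_until_pairs_covered` is a hypothetical name.

```python
import random
from itertools import combinations


def subsets_until_pairs_covered(n, k, seed=0):
    """Sample random k-subsets of range(n) until all index pairs are covered."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    uncovered = set(combinations(range(n), 2))
    samples = []
    while uncovered:
        subset = tuple(sorted(rng.sample(range(n), k)))
        samples.append(subset)
        # Sorting keeps pairs in (i, j) form with i < j, matching `uncovered`.
        uncovered.difference_update(combinations(subset, 2))
    return samples
```

Each returned subset is independent of the others, so the diagrams of the sampled subsets can be computed on separate threads or machines, in line with the parallelism noted above.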

The experiments illustrate the interpretive claims. In the Noisy Circle example, the ordinary full-cloud diagram of the noisy circle is closer in Bottleneck distance to the pure disk, whereas the average distributed diagram “remains significantly closer to the circle.” In Gradient-Descent Alignment, after the reported numbers of iterations for the circle and torus cases, the target cloud recovered the correct geometric shape “up to quasi-isometry.” A common misconception corrected by this work is that the full persistence diagram of $X$ is the only relevant invariant; the paper instead argues that the distributed collection over many small subsets is the right object for inversion and robustness.

6. Distributed persistent homology by parts

"Persistence by Parts: Multiscale Feature Detection via Distributed Persistent Homology" develops a distributed-persistence framework built from filtered spaces, generalized Mayer–Vietoris sequences, cellular cosheaves, and a spectral sequence (Yoon et al., 2020). The starting point is a filtered space $\{X_t\}$ indexed by a real parameter $t$, together with a continuous scalar field $f$ on the space. Applying homology to the inclusions $X_s \hookrightarrow X_t$ for $s \le t$ yields the persistence module $H_*(X_t)$.

When $X_t = A_t \cup B_t$ at every parameter $t$, there is a Mayer–Vietoris short exact sequence of filtered chain complexes relating the chains on $A_t \cap B_t$, on $A_t$ and $B_t$, and on $X_t$, which yields a long exact sequence of persistence modules. For a finite open cover $\mathcal{U}$ whose nerve $N$ is 1-dimensional, one constructs a cellular cosheaf $\mathcal{F}_q$ by assigning the local homology $H_q(U_\sigma)$ to each cell $\sigma$ of the nerve, where $U_\sigma$ is the intersection of the cover elements indexed by $\sigma$. The associated double complex produces a spectral sequence with

$$E^2_{p,q} = H_p(N; \mathcal{F}_q)$$

converging to $H_{p+q}(X)$.

Because the nerve is assumed to be 1-dimensional, all higher differentials $d^r$ for $r \ge 2$ vanish, so the spectral sequence collapses at $E^2$. The resulting decomposition is

$$H_n(X) \cong H_0(N; \mathcal{F}_n) \oplus H_1(N; \mathcal{F}_{n-1}).$$

The algorithmic workflow is explicit: choose a cover and its nerve; compute local persistent homology on each patch and overlap in parallel; build the cellular cosheaves $\mathcal{F}_q$; compute cosheaf homology at each parameter step; recover the global homology $H_n$ from the collapsed spectral sequence; and finally reduce the assembled transition maps to obtain the global persistence diagram. Complexity is summarized as follows: each local persistent-homology computation on a patch has its own worst-case cost; cosheaf homology on the nerve adds a cost at each parameter step that depends on the size of the nerve; and because the local computations are parallel, wall-clock time is governed by the most expensive local computation plus the smaller cosheaf overhead.
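The local-to-global step can be illustrated in the simplest case, degree 0. The sketch below is not the paper's algorithm: it works on a weighted graph, assumes a cover by vertex patches in which every edge lies inside at least one patch, computes connected components inside each patch, and then glues local components that share a vertex. The resulting count is the rank of global $H_0$ at each threshold; higher degrees and the reduction of transition maps are not covered. All function names are hypothetical.

```python
def components(vertices, edges):
    """Connected components of the subgraph induced on `vertices` (union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        if u in parent and v in parent:    # ignore edges leaving the patch
            parent[find(u)] = find(v)
    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


def global_h0_rank(patches, edges):
    """Count global components by gluing local components that share a vertex."""
    local = [frozenset(c) for patch in patches for c in components(patch, edges)]
    # Treat each local component as a node; link two nodes when they intersect.
    links = [
        (i, j)
        for i in range(len(local))
        for j in range(i + 1, len(local))
        if local[i] & local[j]
    ]
    return len(components(range(len(local)), links))


def h0_ranks_by_parts(patches, weighted_edges, thresholds):
    """Rank of global H0 at each threshold, assembled from per-patch computations."""
    return [
        global_h0_rank(patches, [(u, v) for u, v, w in weighted_edges if w <= t])
        for t in thresholds
    ]
```

The per-patch calls to `components` are independent of one another, so they could run in parallel, leaving only the comparatively small gluing step to be done globally.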

The application emphasized is multiscale feature detection. Taking the scalar field as an estimate of sampling density on a point cloud, one can choose a cover by density thresholds such as “sparse” versus “dense” regions. The method then “produces barcodes annotated by which density region they come from, rescuing small but tightly-sampled loops that standard PH would classify as noise.” This suggests a different sense of distribution from $\mathrm{DPD}_k(X)$: not many $k$-subsets of a metric space, but many local filtered pieces assembled by sheaf-theoretic and spectral-sequence machinery.

7. Conceptual relations and distinctions

The three usages share a family resemblance: each replaces a single global object with a distributed structure whose correctness or interpretability depends on controlled recombination. In the CXL setting, the global object is the traditional centralized persistence domain, and DPD distributes durable state across switches and memory endpoints. In $\mathrm{DPD}_k(X)$, the global object is the full persistence description of a point cloud, replaced by the multiset of diagrams of many small subsets. In distributed persistent homology by parts, the global object is the persistent homology of a filtered space, reconstructed from local computations on a cover.

The technical meanings, however, remain distinct. The systems DPD is defined by host program order, routing paths, durability order, read-latest semantics, and crash recovery; its central concerns are ACK timing, stale reads, write serialization, and DrainPath barriers (Hadi et al., 5 Jun 2026). The inverse-theorem DPD is defined by subset sampling, persistence diagrams, Bottleneck distance, and quasi-isometry bounds (Solomon et al., 2021). The by-parts framework is defined by filtered covers, cosheaf homology, and a collapsing spectral sequence (Yoon et al., 2020).

Accordingly, “Distributed Persistence Domain” should be read contextually. In persistent-memory systems, it names a correctness abstraction for pooled remote persistence. In TDA, it names either a multiset-valued invariant on subsets or a distributed assembly strategy for persistent homology. The shared acronym reflects methodological convergence around decomposition and reconstruction, but not a single cross-domain theory.
