Sufficiency of Markov Process models for reproducing performance effects of fragmentation

Determine whether Markov Process-based profiles of contiguous homogeneous region interleavings, as used by the Anduril kernel module to artificially fragment physical memory on Linux, capture sufficient information to accurately reproduce end-to-end performance effects of memory fragmentation observed in production systems.

Background

The paper proposes Anduril, a Linux kernel module that uses Markov Process (MP) profiles built from /proc/kpageflags snapshots to directly reproduce spatial interleavings of memory usage classes (e.g., file cache, anonymous, huge pages) and sizes. This approach aims to create a reproducible and quantifiable fragmented state without replaying allocation histories.
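To make the idea of an MP profile concrete, the following is a minimal sketch and not Anduril's actual implementation. It assumes a snapshot has already been reduced to one usage-class label per physical page (in practice this would be derived from /proc/kpageflags), treats each contiguous homogeneous region as a (class, length) state, and estimates a first-order transition profile between successive regions. The function names, the class labels, and the first-order assumption are all illustrative.

```python
import random
from collections import Counter, defaultdict

def to_runs(page_classes):
    """Collapse per-page class labels into (class, length) runs.

    Each run stands for one contiguous homogeneous region.
    """
    runs = []
    for cls in page_classes:
        if runs and runs[-1][0] == cls:
            runs[-1][1] += 1          # extend the current region
        else:
            runs.append([cls, 1])     # start a new region
    return [tuple(r) for r in runs]

def build_profile(runs):
    """Count how often each region state is followed by each other state.

    Assumption: a first-order chain whose states are (class, length) pairs.
    """
    counts = defaultdict(Counter)
    for cur, nxt in zip(runs, runs[1:]):
        counts[cur][nxt] += 1
    return counts

def sample_layout(profile, start, n_runs, rng):
    """Generate a synthetic sequence of regions by walking the profile."""
    state = start
    layout = [state]
    for _ in range(n_runs - 1):
        successors = profile.get(state)
        if not successors:
            break                     # no observed successor for this state
        states = list(successors)
        weights = [successors[s] for s in states]
        state = rng.choices(states, weights=weights, k=1)[0]
        layout.append(state)
    return layout

# Hypothetical toy snapshot: one label per physical page.
snapshot = ["anon", "anon", "file", "anon", "anon", "file", "file", "anon"]
runs = to_runs(snapshot)
profile = build_profile(runs)
synthetic = sample_layout(profile, runs[0], n_runs=6, rng=random.Random(0))
```

A layout sampled this way preserves only the local statistics of which region tends to follow which. Anything that depends on longer-range structure or on allocation history is not encoded in such a profile, which is the kind of gap the open question is concerned with.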

While Anduril often reproduces the aggregate spatial fragmentation patterns, the authors report that it does not reliably reproduce end-to-end workload performance (runtime, throughput, tail latency, huge-page usage). They therefore state that it is unclear whether MP-based representations capture all the information needed to reproduce the performance effects of fragmentation. The adequacy of MP modeling for end-to-end behavior remains an unresolved methodological question.

References

MPs may be a useful means of doing so, but it is unclear if they capture all important information to reproduce the end-to-end performance effects of fragmentation.

Characterizing Physical Memory Fragmentation (2401.03523 - Mansi et al., 2024) in Conclusion