Implementation-Oblivious Transparent Checkpoint-Restart for MPI (2309.14996v1)
Published 26 Sep 2023 in cs.DC
Abstract: This work presents experience with traditional use cases of checkpointing on a novel platform. A single codebase (MANA) transparently checkpoints production workloads for major available MPI implementations: "develop once, run everywhere". The new platform enables application developers to compile their application against any of the available standards-compliant MPI implementations, and test each MPI implementation according to performance or other features.