Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
125 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

DisTRaC: Accelerating High Performance Compute Processing for Temporary Data Storage (2212.03054v1)

Published 6 Dec 2022 in cs.DC and cs.OS

Abstract: High Performance Compute (HPC) clusters often produce intermediate files as part of code execution and message passing is not always possible to supply data to these cluster jobs. In these cases, I/O goes back to central distributed storage to allow cross node data sharing. These systems are often high performance and characterised by their high cost per TB and sensitivity to workload type such as being tuned to small or large file I/O. However, compute nodes often have large amounts of RAM, so when dealing with intermediate files where longevity or reliability of the system is not as important, local RAM disks can be used to obtain performance benefits. In this paper we show how this problem was tackled by creating a RAM block that could interact with the object storage system Ceph, as well as creating a deployment tool to deploy Ceph on HPC infrastructure effectively. This work resulted in a system that was more performant than the central high performance distributed storage system used at Diamond reducing I/O overhead and processing time for Savu, a tomography data processing application, by 81.04% and 8.32% respectively.

Summary

We haven't generated a summary for this paper yet.