SplitFS: Reducing Software Overhead in File Systems for Persistent Memory (1909.10123v1)

Published 23 Sep 2019 in cs.OS and cs.PF

Abstract: We present SplitFS, a file system for persistent memory (PM) that reduces software overhead significantly compared to state-of-the-art PM file systems. SplitFS presents a novel split of responsibilities between a user-space library file system and an existing kernel PM file system. The user-space library file system handles data operations by intercepting POSIX calls, memory-mapping the underlying file, and serving reads and overwrites using processor loads and stores. Metadata operations are handled by the kernel PM file system (ext4 DAX). SplitFS introduces a new primitive termed relink to efficiently support file appends and atomic data operations. SplitFS provides three consistency modes, which different applications can choose from, without interfering with each other. SplitFS reduces software overhead by up to 4x compared to the NOVA PM file system, and 17x compared to ext4 DAX. On a number of micro-benchmarks and applications such as the LevelDB key-value store running the YCSB benchmark, SplitFS increases application performance by up to 2x compared to ext4 DAX and NOVA while providing similar consistency guarantees.

Citations (176)

Summary

  • The paper proposes SplitFS, a file system architecture that reduces software overhead for persistent memory by splitting responsibilities: data operations are handled in user space, and metadata operations are handled in the kernel.
  • SplitFS utilizes a user-space library for direct data access and optimization techniques like the 'relink' primitive, while leveraging an existing kernel PM file system for metadata.
  • Evaluations show that SplitFS reduces software overhead by up to 17x compared to ext4 DAX and by up to 4x compared to NOVA, and improves throughput by up to 2.7x on persistent memory.

SplitFS: A New Architecture for Reduced Software Overhead in File Systems for Persistent Memory

The paper "SplitFS: Reducing Software Overhead in File Systems for Persistent Memory" proposes a file system for persistent memory (PM) that aims to lower software overhead and improve the performance of applications that use PM. Its central idea is a split of responsibilities between a user-space library file system and an existing kernel PM file system, ext4 DAX. The user-space library intercepts POSIX calls, memory-maps the underlying file, and applies a set of optimizations to the data path.

Architecture and Design

SplitFS has two components: a user-space library that handles data operations and a kernel PM file system that handles metadata operations. Reads and overwrites are served with processor loads and stores on a memory-mapped file, which avoids the costly kernel traps that a traditional file system incurs on these calls. By intercepting POSIX calls and using non-temporal stores, SplitFS reduces the latency of these operations.
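The data path described above can be illustrated with a small sketch. It is not SplitFS code: the class `MappedFile` and the function `demo` are hypothetical names, an ordinary temporary file stands in for a file on PM, and `mmap.flush()` stands in for the cache-line flushes a real PM library would issue. It only shows the general idea of serving reads and in-place overwrites from a memory mapping rather than through per-call `read`/`write` system calls.

```python
import mmap
import os
import tempfile


class MappedFile:
    """Toy user-space data path: reads and overwrites go through a memory map."""

    def __init__(self, path):
        # Opening and mapping the file still go through the kernel,
        # which plays the role of the metadata path in this sketch.
        self._fd = os.open(path, os.O_RDWR)
        self._size = os.fstat(self._fd).st_size
        self._map = mmap.mmap(self._fd, self._size)

    def read(self, offset, length):
        # Served from the mapping: no read() system call per access.
        end = min(offset + length, self._size)
        return self._map[offset:end]

    def overwrite(self, offset, data):
        # In-place overwrite only; growing the file needs a separate append path.
        if offset + len(data) > self._size:
            raise ValueError("overwrite past end of file is not supported here")
        self._map[offset:offset + len(data)] = data

    def sync(self):
        # Stand-in for making the stores durable.
        self._map.flush()

    def close(self):
        self._map.close()
        os.close(self._fd)


def demo():
    """Overwrite the first bytes of a scratch file through the mapping."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello persistent memory")
        path = tmp.name
    try:
        f = MappedFile(path)
        f.overwrite(0, b"HELLO")
        f.sync()
        head = f.read(0, 5)
        f.close()
        with open(path, "rb") as check:
            on_disk = check.read()
        return head, on_disk
    finally:
        os.unlink(path)
```

Calling `demo()` returns the bytes read back through the mapping and the file contents as seen through the ordinary file API. The sketch rejects writes past the end of the file because appends change file size, which is a metadata change and needs a different mechanism.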

An essential feature introduced in SplitFS is the "relink" primitive, utilized for efficiently managing file appends and ensuring atomicity in data operations. Relink logically moves chunks of data within PM without physical copying, leveraging ext4's journaling for atomicity guarantees. This mechanism reduces the overhead typically associated with ensuring atomic operations like data appends.
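A toy model may help make "logically moves chunks of data without physical copying" concrete. The sketch below is an assumption-laden illustration, not the SplitFS implementation: `ToyFS`, `ToyFile`, and the `relink` signature are hypothetical, files are modeled as lists of block ids, and swapping two Python lists stands in for the atomic metadata update that the real system obtains from ext4's journal.

```python
class ToyFile:
    """A file modeled as an ordered list of physical block ids."""

    def __init__(self):
        self.blocks = []


class ToyFS:
    """In-memory model: block contents live in `storage`, files only hold ids."""

    def __init__(self):
        self.storage = {}  # physical block id -> bytes
        self.next_id = 0

    def write_block(self, file, data):
        # Allocate a new physical block and attach it to the end of the file.
        block_id = self.next_id
        self.next_id += 1
        self.storage[block_id] = data
        file.blocks.append(block_id)
        return block_id

    def relink(self, src, src_start, dst, count):
        """Move `count` blocks from `src` to the end of `dst` by remapping ids."""
        moved = src.blocks[src_start:src_start + count]
        if len(moved) != count:
            raise ValueError("source range is out of bounds")
        # Build both new block maps first, then install them together,
        # to mimic an all-or-nothing metadata update. No bytes are copied.
        new_src = src.blocks[:src_start] + src.blocks[src_start + count:]
        new_dst = dst.blocks + moved
        src.blocks, dst.blocks = new_src, new_dst

    def read(self, file):
        return b"".join(self.storage[b] for b in file.blocks)


def demo():
    fs = ToyFS()
    target, staging = ToyFile(), ToyFile()
    fs.write_block(target, b"AAAA")
    # Appends are first written to a staging file...
    staged_ids = [fs.write_block(staging, b"BBBB"),
                  fs.write_block(staging, b"CCCC")]
    # ...and later attached to the target without touching the data.
    fs.relink(staging, 0, target, 2)
    return fs.read(target), staging.blocks, target.blocks[1:] == staged_ids
```

In this model an append costs one data write (into the staging file) plus a metadata-only remap, rather than a write followed by a copy into the target file.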

SplitFS offers three consistency modes, POSIX, sync, and strict, which cater to different application requirements. POSIX mode provides metadata consistency and synchronizes overwrites immediately. Sync mode makes operations synchronous without atomicity guarantees, similar to PM file systems such as PMFS. Strict mode extends these guarantees to include atomicity for operations, which is useful for applications that demand strong consistency.
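The three modes can be read as a small lookup table. The sketch below encodes only what the paragraph above states, under the assumption that the guarantees are cumulative from POSIX to sync to strict; the field names and the helper `weakest_mode` are hypothetical and are not part of any SplitFS interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guarantees:
    sync_overwrites: bool   # overwrites are synchronized immediately
    sync_all_ops: bool      # every operation is synchronous
    atomic_ops: bool        # operations are atomic


# One reading of the summary above; field values are an interpretation.
MODES = {
    "posix": Guarantees(sync_overwrites=True, sync_all_ops=False, atomic_ops=False),
    "sync": Guarantees(sync_overwrites=True, sync_all_ops=True, atomic_ops=False),
    "strict": Guarantees(sync_overwrites=True, sync_all_ops=True, atomic_ops=True),
}


def weakest_mode(need_sync_all, need_atomic):
    """Return the least strict mode that satisfies the stated requirements."""
    for name in ("posix", "sync", "strict"):
        g = MODES[name]
        if (g.sync_all_ops or not need_sync_all) and (g.atomic_ops or not need_atomic):
            return name
    raise ValueError("no mode satisfies the requirements")
```

For example, an application that needs synchronous operations but can tolerate non-atomic data updates would pick `weakest_mode(True, False)`, which selects the sync mode in this table.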

Performance and Evaluation

The paper presents detailed evaluations using micro-benchmarks and real-world applications such as LevelDB with YCSB and SQLite with TPC-C, which demonstrate substantial performance improvements over existing PM file systems like NOVA and ext4 DAX. SplitFS achieves up to a 4x reduction in software overhead compared to NOVA and up to 17x compared to ext4 DAX, with throughput improvements of up to 2.7x. These benefits are most pronounced in workloads dominated by data operations.

SplitFS also handles metadata-heavy workloads, such as those generated by git, tar, and rsync, though with a slight performance penalty compared to purely in-kernel file systems like NOVA. The rationale for the split architecture is to accelerate data operations even at the cost of slower metadata operations, and this trade-off proves effective in maximizing overall application performance on PM.

Implications and Future Work

SplitFS introduces a compelling point in the spectrum of PM file-system designs. It leverages mature, well-tested ext4 DAX code while significantly optimizing data operations through user-space management. This strategy not only enhances performance but also reduces complexity by avoiding reinvention of proven system components.

The results indicate that architectures with user-space data handling can outperform traditional in-kernel designs under specific conditions, namely workloads dominated by data operations. Practically, SplitFS could inspire further development in PM systems seeking to optimize software and hardware interactions.

Future research might explore additional optimizations and extensions to the split architecture, possibly integrating machine learning techniques to dynamically adjust consistency modes based on application behavior. Moreover, exploring compatibility and improvements across newer PM technologies and configurations would ensure continued relevance and effectiveness of SplitFS.

In conclusion, SplitFS shows that splitting data and metadata handling between a user-space library and an existing kernel file system can deliver strong performance on PM while retaining the reliability of mature, well-tested kernel code.