Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
124 tokens/sec
GPT-4o
8 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

F2: Designing a Key-Value Store for Large Skewed Workloads (2305.01516v2)

Published 2 May 2023 in cs.DB

Abstract: Many real-world workloads present a challenging set of requirements: point operations requiring high throughput, working sets much larger than main memory, and natural skew in key access patterns for both reads and writes. We find that modern key-value designs are either optimized for memory-efficiency, sacrificing high-performance (LSM-tree designs), or achieve high-performance, saturating modern NVMe SSD bandwidth, at the cost of substantial memory resources or high disk wear (CPU-optimized designs). Unfortunately these designs are not able to handle meet the challenging demands of such larger-than-memory, skewed workloads. To this end, we present F2, a new key-value store that bridges this gap by combining the strengths of both approaches. F2 adopts a tiered, record-oriented architecture inspired by LSM-trees to effectively separate hot from cold records, while incorporating concurrent latch-free mechanisms from CPU-optimized engines to maximize performance on modern NVMe SSDs. To realize this design, we tackle key challenges and introduce several innovations, including new latch-free algorithms for multi-threaded log compaction and user operations (e.g., RMWs), as well as new components: a two-level hash index to reduce indexing overhead for cold records and a read-cache for serving read-hot data. Detailed experimental results show that F2 matches or outperforms existing solutions, achieving on average better throughput on memory-constrained environments compared to state-of-the-art systems like RocksDB (11.75x), SplinterDB (4.52x), KVell (10.56x), LeanStore (2.04x), and FASTER (2.38x). F2 also maintains its high performance across varying workload skewness levels and memory budgets, while achieving low disk write amplification.

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com