Nomad: Non-Exclusive Memory Tiering via Transactional Page Migration (2401.13154v2)
Abstract: With the advent of byte-addressable memory devices, such as CXL memory, persistent memory, and storage-class memory, tiered memory systems have become a reality. Page migration is the de facto method within operating systems for managing tiered memory. It aims to bring hot data into fast memory whenever possible to optimize the performance of data accesses, while using slow memory to accommodate data spilled from fast memory. While existing research has demonstrated the effectiveness of various optimizations on page migration, it falls short of addressing a fundamental question: Is exclusive memory tiering, in which a page is present in either fast memory or slow memory, but not both simultaneously, the optimal strategy for tiered memory management? We demonstrate that page migration-based exclusive memory tiering suffers significant performance degradation when fast memory is under pressure. In this paper, we propose non-exclusive memory tiering, a page management strategy that retains a copy of pages recently promoted from slow memory to fast memory to mitigate memory thrashing. To enable non-exclusive memory tiering, we develop Nomad, a new page management mechanism for Linux that features transactional page migration and page shadowing. Nomad helps move page migration off the critical path of program execution and makes migration completely asynchronous. Evaluations with carefully crafted micro-benchmarks and real-world applications show that Nomad achieves up to a 6x performance improvement over the state-of-the-art transparent page placement (TPP) approach in Linux when under memory pressure. We also compare Nomad with a recently proposed hardware-assisted, access sampling-based page migration approach and demonstrate Nomad's strengths and potential weaknesses in various scenarios.
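The two ideas named in the abstract, transactional page migration and page shadowing, can be illustrated with a toy user-space model. This is a minimal sketch under stated assumptions, not Nomad's kernel implementation: the `Page` and `TieredMemory` classes, the `version` counter standing in for a dirty bit, and the `concurrent_write` parameter that simulates a racing writer are all hypothetical names introduced here. The abort-on-concurrent-write rule is likewise an assumption about what "transactional" means; the abstract itself does not specify it.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    data: bytes
    version: int = 0  # bumped on every write; stands in for a dirty bit


@dataclass
class TieredMemory:
    fast: dict = field(default_factory=dict)     # page id -> Page in fast memory
    slow: dict = field(default_factory=dict)     # page id -> Page in slow memory
    shadows: dict = field(default_factory=dict)  # page id -> version at promotion
    copies: int = 0                              # number of page copies performed

    def write(self, pid, data):
        # Writes go to the fast copy if one exists, otherwise to slow memory.
        page = self.fast.get(pid) or self.slow[pid]
        page.data = data
        page.version += 1

    def promote(self, pid, concurrent_write=None):
        """Copy a page from slow to fast memory without blocking writers.

        Returns True if the migration committed, False if it aborted.
        """
        src = self.slow[pid]
        seen = src.version
        snapshot = Page(src.data, src.version)  # copy while the page stays usable
        self.copies += 1
        if concurrent_write is not None:        # simulate a write racing the copy
            self.write(pid, concurrent_write)
        if src.version != seen:                 # page changed mid-copy: abort
            return False
        self.fast[pid] = snapshot               # commit: the fast copy is now live
        self.shadows[pid] = seen                # the slow copy is kept as a shadow
        return True

    def demote(self, pid):
        """Evict a page from fast memory, reusing the shadow copy if still valid."""
        page = self.fast.pop(pid)
        if self.shadows.pop(pid, None) != page.version:
            # The page was written after promotion, so the shadow is stale.
            self.slow[pid] = Page(page.data, page.version)
            self.copies += 1
        # Otherwise the shadow in slow memory is current and no copy is needed.
```

In this model, promoting and then demoting an unmodified page costs one copy rather than two, which is the intuition behind retaining a copy to mitigate thrashing. A promotion that races with a write simply aborts and leaves the page in slow memory, so the writer is never stalled by the migration.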
Authors: Lingfeng Xiang, Zhen Lin, Weishu Deng, Hui Lu, Jia Rao, Yifan Yuan, Ren Wang