
Read Disturb Errors in MLC NAND Flash Memory (1805.03283v1)

Published 8 May 2018 in cs.AR

Abstract: This paper summarizes our work on experimentally characterizing, mitigating, and recovering read disturb errors in multi-level cell (MLC) NAND flash memory, which was published in DSN 2015, and examines the work's significance and future potential. NAND flash memory reliability continues to degrade as the memory is scaled down and more bits are programmed per cell. A key contributor to this reduced reliability is read disturb, where a read to one row of cells impacts the threshold voltages of unread flash cells in different rows of the same block. For the first time in open literature, this work experimentally characterizes read disturb errors on state-of-the-art 2Y-nm (i.e., 20-24 nm) MLC NAND flash memory chips. Our findings (1) correlate the magnitude of threshold voltage shifts with read operation counts, (2) demonstrate how program/erase cycle count and retention age affect the read-disturb-induced error rate, and (3) identify that lowering pass-through voltage levels reduces the impact of read disturb and extends flash lifetime. Particularly, we find that the probability of read disturb errors increases with both higher wear-out and higher pass-through voltage levels. We leverage these findings to develop two new techniques. The first technique mitigates read disturb errors by dynamically tuning the pass-through voltage on a per-block basis. Using real workload traces, our evaluations show that this technique increases flash memory endurance by an average of 21%. The second technique recovers from previously-uncorrectable flash errors by identifying and probabilistically correcting cells susceptible to read disturb errors. Our evaluations show that this recovery technique reduces the raw bit error rate by 36%.

Citations (183)

Summary

  • The paper presents the first experimental characterization in open literature of read disturb errors in 2Y-nm (20-24 nm) MLC NAND flash chips.
  • Dynamically tuning the pass-through voltage on a per-block basis mitigates read disturb and extends flash endurance by an average of 21% on real workload traces.
  • The proposed read disturb recovery method reduces the raw bit error rate by 36% by identifying and probabilistically correcting cells susceptible to read disturb.

Analyzing Read Disturb Errors in MLC NAND Flash Memory

The paper presents an in-depth study of read disturb errors in state-of-the-art multi-level cell (MLC) NAND flash memory. NAND flash memory is widely used because of its increasing capacity and decreasing cost per bit, but its reliability degrades as the memory is scaled down and more bits are programmed per cell. A key contributor to this reduced reliability is read disturb, where reading one row of cells inadvertently shifts the threshold voltages of unread cells in other rows of the same block, which can corrupt data.

In the first characterization of its kind in open literature, performed on 2Y-nm MLC NAND flash memory chips, this work provides new insight into the read disturb problem. The results correlate the magnitude of threshold voltage shifts with the number of read operations, and show how program/erase cycle count and retention age affect the read-disturb-induced error rate. The paper also highlights the role of the pass-through voltage: the probability of read disturb errors increases with both higher wear-out and higher pass-through voltage, so lowering the pass-through voltage reduces read disturb errors and extends flash lifetime.

Based on these experimental findings, the paper introduces two techniques: dynamic pass-through voltage tuning and read disturb recovery. Pass-through voltage tuning adjusts the pass-through voltage on a per-block basis to mitigate read disturb errors; evaluated on real workload traces, it increases flash memory endurance by an average of 21%. Read disturb recovery targets errors that were previously uncorrectable, that is, errors beyond what normal error correction can handle. It identifies cells susceptible to read disturb and probabilistically corrects them, reducing the raw bit error rate by 36%.
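The per-block tuning idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the controller knows how many errors the block currently has, how many errors ECC can correct, and has some estimate of the extra errors that each lower pass-through voltage would introduce. The function name, the `extra_errors_at` callback, the `reserved_errors` safety margin, and the percent-of-nominal voltage scale are all hypothetical.

```python
from typing import Callable, Sequence


def tune_pass_through_voltage(
    candidates: Sequence[int],
    extra_errors_at: Callable[[int], int],
    current_errors: int,
    ecc_capability: int,
    reserved_errors: int = 0,
) -> int:
    """Return the lowest candidate voltage whose extra errors fit the ECC margin.

    candidates      -- allowed pass-through voltages (here: percent of nominal)
    extra_errors_at -- hypothetical estimate of errors added by using a voltage
    current_errors  -- errors already present in the block
    ecc_capability  -- errors the ECC can correct
    reserved_errors -- margin kept free for future errors (an assumption)
    """
    margin = ecc_capability - reserved_errors - current_errors
    best = max(candidates)  # fall back to the highest (nominal) voltage
    if margin <= 0:
        return best
    # Walk downward from the highest voltage and stop at the first
    # candidate whose extra errors no longer fit in the unused margin.
    for voltage in sorted(candidates, reverse=True):
        if extra_errors_at(voltage) <= margin:
            best = voltage
        else:
            break
    return best


# Toy error model: each 1% reduction in voltage adds 4 errors (made-up numbers).
levels = [100, 99, 98, 97, 96, 95]
toy_model = lambda pct: (100 - pct) * 4

lightly_worn = tune_pass_through_voltage(levels, toy_model, 20, 40, 8)
heavily_worn = tune_pass_through_voltage(levels, toy_model, 35, 40, 8)
print(lightly_worn, heavily_worn)
```

In this toy setup a block with plenty of unused ECC margin can run at a lower pass-through voltage, while a block with no margin left stays at the nominal level, which is the trade-off the per-block tuning exploits.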

This research has implications for both theory and practice in flash memory technology. On the practical side, the characterization of read disturb errors points to concrete ways of improving flash memory reliability and endurance. On the theoretical side, it clarifies the conditions under which read disturb errors occur, which opens new research directions in error resilience. Because technology scaling is likely to make these errors worse, addressing read disturb will remain important.

The paper's findings come at an important time for NAND flash memory, as flash densities increase and 3D NAND is expected to move to smaller process technologies. The pass-through voltage tuning mechanism shows how existing architectural elements in NAND flash can be used to offset the adverse effects of scaling. Similarly, the proposed recovery mechanism is a useful model for error prediction and correction that may apply to other memory technologies.
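To make the recovery idea concrete, here is a simplified sketch of classifying cells by their susceptibility to read disturb. It assumes that the controller can measure each cell's threshold voltage before and after a batch of additional reads, and that cells which shift strongly are more likely to belong to the lower of two adjacent voltage states. Both assumptions, the hard `shift_threshold` cutoff standing in for the probabilistic step, and all names and numbers below are illustrative and not taken from the summary above.

```python
from typing import List, Sequence


def reclassify_boundary_cells(
    v_before: Sequence[int],
    v_after: Sequence[int],
    boundary: int,
    window: int,
    shift_threshold: int,
    low_state: str = "lower",
    high_state: str = "higher",
) -> List[str]:
    """Assign each cell to one of two adjacent states.

    Cells far from the boundary are classified by a normal read.
    Cells within `window` of the boundary are ambiguous, so they are
    classified by how much they shifted after the extra reads:
    a large shift marks a disturb-prone cell (assumed lower state).
    """
    states = []
    for before, after in zip(v_before, v_after):
        if abs(before - boundary) > window:
            # Unambiguous cell: plain comparison against the read reference.
            states.append(low_state if before < boundary else high_state)
        elif after - before >= shift_threshold:
            # Ambiguous and disturb-prone: assume it drifted up from below.
            states.append(low_state)
        else:
            # Ambiguous but disturb-resistant: assume it belongs above.
            states.append(high_state)
    return states


# Threshold voltages in arbitrary integer units (made-up values).
before = [80, 98, 103, 97, 120]
after = [82, 99, 108, 101, 120]
recovered = reclassify_boundary_cells(before, after, boundary=100,
                                      window=5, shift_threshold=3)
print(recovered)  # ['lower', 'higher', 'lower', 'lower', 'higher']
```

A real implementation would weigh the measured shifts probabilistically and hand the result back to ECC rather than applying a single cutoff; the sketch only shows why susceptibility to disturb carries information about a cell's original state.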

In conclusion, the paper advances the discussion of NAND flash reliability by characterizing a previously underappreciated source of errors and by offering practical techniques that ease scaling concerns. Its methodology and findings should encourage further work on characterizing and mitigating flash memory errors, and may influence future designs and manufacturing processes. Because read disturb errors persist, approaches such as those described in the paper will be important for keeping NAND flash memory viable across storage applications.