Reliability Analysis of Fault Tolerant Memory Systems (2311.12849v2)
Abstract: This paper analyzes fault-tolerant memory systems, focusing on recovery techniques for transient errors that are modeled with Markov chains. The study applies scrubbing methods in conjunction with Single Error Correction and Double Error Detection (SEC-DED) codes. It considers three models: 1) exponentially distributed scrubbing, in which memory words are checked at exponentially distributed time intervals; 2) deterministic scrubbing, in which words are checked at regular, fixed intervals; and 3) mixed scrubbing, which combines the probabilistic and deterministic approaches. Reliability and Mean Time to Failure (MTTF) are estimated for each model. The findings show that mixed scrubbing outperforms the simpler scrubbing methods in both reliability and MTTF.
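To make the first model concrete, the following is a minimal sketch of a textbook-style three-state Markov chain for a single SEC-DED word under exponentially distributed scrubbing. It is an illustration under stated assumptions, not necessarily the exact model used in the paper: each of the `n_bits` bits is assumed to suffer transient errors independently at rate `bit_error_rate`, a scrub (at rate `scrub_rate`) corrects a single error, and a second error in the same word before a scrub counts as failure. All function and parameter names are hypothetical.

```python
import math


def mttf_exponential_scrubbing(n_bits, bit_error_rate, scrub_rate):
    """MTTF of one SEC-DED word that starts error-free (assumed model).

    States: 0 errors -> 1 error -> failure (2 errors),
    with a scrub returning the word from 1 error to 0 errors.
    Requires n_bits >= 2 and bit_error_rate > 0.
    """
    a = n_bits * bit_error_rate          # rate: 0 errors -> 1 error
    b = (n_bits - 1) * bit_error_rate    # rate: 1 error -> failure
    # Solving T0 = 1/a + T1 and T1 = 1/(mu + b) + mu/(mu + b) * T0 for T0:
    return (a + b + scrub_rate) / (a * b)


def reliability_exponential_scrubbing(t, n_bits, bit_error_rate, scrub_rate):
    """Probability that the word has not failed by time t (assumed model)."""
    a = n_bits * bit_error_rate
    b = (n_bits - 1) * bit_error_rate
    total = a + b + scrub_rate
    # Eigenvalues of the generator restricted to the two working states,
    # i.e. the roots of s^2 + (a + b + mu) s + a b = 0 (both negative).
    root = math.sqrt(total * total - 4.0 * a * b)
    s1 = (-total + root) / 2.0
    s2 = (-total - root) / 2.0
    return (s2 * math.exp(s1 * t) - s1 * math.exp(s2 * t)) / (s2 - s1)


if __name__ == "__main__":
    # Hypothetical parameters: a 72-bit word, time measured in hours.
    for mu in (0.0, 0.1, 1.0):
        print(mu, mttf_exponential_scrubbing(72, 1e-6, mu))
```

In this sketch the MTTF grows linearly with the scrub rate, which is the qualitative effect scrubbing is meant to achieve; the deterministic and mixed models would need a different treatment because fixed scrub intervals are not memoryless.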