Papers
Topics
Authors
Recent
Search
2000 character limit reached

On Failure Diagnosis of the Storage Stack

Published 6 May 2020 in cs.OS | (2005.02547v2)

Abstract: Diagnosing storage system failures is challenging even for professionals. One example is the "When Solid State Drives Are Not That Solid" incident occurred at Algolia data center, where Samsung SSDs were mistakenly blamed for failures caused by a Linux kernel bug. With the system complexity keeps increasing, such obscure failures will likely occur more often. As one step to address the challenge, we present our on-going efforts called X-Ray. Different from traditional methods that focus on either the software or the hardware, X-Ray leverages virtualization to collects events across layers, and correlates them to generate a correlation tree. Moreover, by applying simple rules, X-Ray can highlight critical nodes automatically. Preliminary results based on 5 failure cases shows that X-Ray can effectively narrow down the search space for failures.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.