Stabilization and Fault-Tolerance in Presence of Unchangeable Environment Actions (1508.00864v1)
Abstract: We focus on the problem of adding fault-tolerance to an existing concurrent protocol in the presence of unchangeable environment actions. Such actions arise in practice for several reasons. One instance is the case where only a subset of the components/processes can be revised and the others must be left as is. Another is cyber-physical systems, where revising physical components may be undesirable or impossible. These actions differ from faults in that they are simultaneously assistive and disruptive, whereas faults are only disruptive. For example, if these actions are part of a physical component, their execution is essential for the normal operation of the system. However, they can potentially disrupt actions taken by other components for dealing with faults. Also, one can typically assume that fault actions will stop for a long enough time for the program to make progress; no such assumption can be made about environment actions. We present algorithms for adding stabilizing fault-tolerance, failsafe fault-tolerance and masking fault-tolerance. Interestingly, we observe that the previous approaches for adding stabilizing fault-tolerance and masking fault-tolerance cannot be easily extended to this context. However, we find that the overall complexity of adding these levels of fault-tolerance remains in P (in the state space of the program). We also demonstrate that our algorithms are sound and complete.