Asymptotic efficiency of restart and checkpointing

Published 21 Feb 2018 in math.PR and cs.PF | (1802.07455v2)

Abstract: Many tasks are subject to failure before completion. Two of the most common failure recovery strategies are restart and checkpointing. Under restart, once a failure occurs, the task is restarted from the beginning. Under checkpointing, the task is resumed from the preceding checkpoint after the failure. We study the asymptotic efficiency of restart for an infinite sequence of tasks whose sizes form a stationary sequence. We define asymptotic efficiency as the limit of the ratio of the total time to completion in the absence of failures to the total time to completion when failures take place. Whether the asymptotic efficiency is positive depends on how the tail of the task size distribution compares with the tails of the random variables governing failures. Our framework allows for variations in the failure rates and dependencies between task sizes. We also study a similar notion of asymptotic efficiency for checkpointing when the task is infinite a.s. and the inter-checkpoint times are i.i.d. Moreover, in checkpointing, when the failures are exponentially distributed, we prove the existence of an infinite sequence of universal checkpoints, which are always used whenever the system starts from any checkpoint that precedes them.
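
The efficiency ratio defined in the abstract can be estimated by simulation. The sketch below is a minimal illustration, not the paper's method: it assumes i.i.d. exponential task sizes (a special case of a stationary sequence), exponential times to failure with a constant rate, and no downtime after a failure. The function names and parameters (`restart_completion_time`, `simulate_restart_efficiency`, `task_rate`, `failure_rate`) are hypothetical and chosen for illustration only.

```python
import random


def restart_completion_time(task_size, failure_rate, rng):
    """Time to finish one task under restart (assumed model).

    Each attempt draws an independent exponential time to failure.
    If the failure arrives before the task is done, the elapsed work
    is lost and the task starts over from the beginning.
    """
    elapsed = 0.0
    while True:
        time_to_failure = rng.expovariate(failure_rate)
        if time_to_failure >= task_size:
            # The attempt survives long enough to complete the task.
            return elapsed + task_size
        # The attempt failed: its running time is wasted.
        elapsed += time_to_failure


def simulate_restart_efficiency(num_tasks, task_rate, failure_rate, seed=0):
    """Estimate (total time without failures) / (total time with failures)."""
    rng = random.Random(seed)
    ideal_total = 0.0   # sum of task sizes, i.e. failure-free completion time
    actual_total = 0.0  # completion time including restarted attempts
    for _ in range(num_tasks):
        size = rng.expovariate(task_rate)  # assumed i.i.d. exponential sizes
        ideal_total += size
        actual_total += restart_completion_time(size, failure_rate, rng)
    return ideal_total / actual_total


if __name__ == "__main__":
    print(simulate_restart_efficiency(100_000, task_rate=4.0, failure_rate=1.0))
```

Under these assumptions, a direct calculation gives an expected completion time of 1/(task_rate - failure_rate) per task when task_rate > failure_rate, so the estimate should approach 1 - failure_rate/task_rate. When task_rate <= failure_rate the expected completion time is infinite and the ratio drifts toward zero as more tasks are simulated (with increasingly long run times), which illustrates the tail comparison the abstract describes.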
