
CHAOS: Accurate and Realtime Detection of Aging-Oriented Failure Using Entropy (1502.00781v1)

Published 3 Feb 2015 in cs.OH

Abstract: Even well-designed software systems suffer from chronic performance degradation, also called "software aging", due to internal (e.g. software bugs) and external (e.g. resource exhaustion) impairments. These chronic problems often fly under the radar of software monitoring systems until they cause severe impacts (e.g. system failure). Detecting them in time to prevent a system crash is therefore challenging. Although many approaches have been proposed to solve this issue, their accuracy and effectiveness are still far from satisfactory because the aging indicators they adopt are insufficient. In this paper, we present a novel entropy-based aging indicator, Multidimensional Multi-scale Entropy (MMSE). MMSE employs the complexity embedded in runtime performance metrics to indicate software aging and leverages multi-scale and multi-dimension integration to tolerate system fluctuations. Via theoretical proof and experimental evaluation, we demonstrate that MMSE satisfies Stability, Monotonicity and Integration, three properties that we conjecture an ideal aging indicator should have. Based upon MMSE, we develop three failure detection approaches encapsulated in a proof-of-concept named CHAOS. Experimental evaluations in a Video on Demand (VoD) system and in a real-world production system, AntVision, show that CHAOS can detect the failure-prone state with extraordinarily high accuracy and a near-zero Ahead-Time-To-Failure (ATTF). Compared to previous approaches, CHAOS improves the detection accuracy by about 5 times and reduces the ATTF by 3 orders of magnitude. In addition, CHAOS is lightweight enough to satisfy the realtime requirement.
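
To make the idea of measuring complexity at several time scales more concrete, here is a minimal sketch of conventional multi-scale sample entropy applied to a single metric series. It is not the paper's MMSE: the abstract does not specify how MMSE is computed or how multiple metrics are integrated, so the multi-dimension step is left out entirely. The function names, the template length `m`, the tolerance factor `r_factor`, and the chosen scales are all assumptions made for illustration.

```python
import math
import random


def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy with template length m and absolute tolerance r.

    Returns infinity when no template matches are found.
    """
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates.
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)


def multiscale_entropy(series, scales=(1, 2, 3), m=2, r_factor=0.2):
    """Sample entropy of the coarse-grained series at each scale.

    The tolerance is fixed from the standard deviation of the original
    series (an assumed, commonly used convention).
    """
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    r = r_factor * std
    return [sample_entropy(coarse_grain(series, s), m, r) for s in scales]


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical metric traces: a strictly periodic one and a noisy one.
    periodic = [0.0, 1.0] * 100
    noisy = [random.gauss(0.0, 1.0) for _ in range(200)]
    print("periodic:", multiscale_entropy(periodic))
    print("noisy:   ", multiscale_entropy(noisy))
```

In this sketch a strictly regular trace yields lower entropy than an irregular one. How entropy values map onto aging or a failure-prone state is the subject of the paper itself and is not modeled here.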
