OXN -- Automated Observability Assessments for Cloud-Native Applications

Published 12 Jul 2024 in cs.SE | (2407.09644v1)

Abstract: Observability is important for ensuring the reliability of microservice applications. These applications are often prone to failures, since they consist of many independent services deployed in heterogeneous environments. When employed "correctly", observability can help developers identify and troubleshoot faults quickly. However, instrumenting and configuring the observability of a microservice application is not trivial: it is tool-dependent and incurs costs. Practitioners need to understand observability-related trade-offs in order to weigh different observability design alternatives against each other. Still, these architectural design decisions are not supported by systematic methods and typically rely on "professional intuition" alone. To assess observability design trade-offs with concrete evidence, we advocate conducting experiments that compare various design alternatives. Achieving a systematic and repeatable experiment process necessitates automation. We present a proof-of-concept implementation of an experiment tool, the Observability eXperiment eNgine (OXN). OXN is able to inject arbitrary faults into an application, similar to Chaos Engineering, but also possesses the unique capability to modify the observability configuration, allowing for the straightforward assessment of design decisions that were previously left unexplored.
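
The experiment idea in the abstract (inject a fault, vary the observability configuration, compare the outcomes) can be illustrated with a small simulation. This is only a sketch under stated assumptions and does not use OXN's actual interface: `ObservabilityConfig`, `Fault`, `simulate_latency`, and `run_experiment` are hypothetical names, the latency model is synthetic, and the number of collected samples stands in as a rough proxy for observability cost.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObservabilityConfig:
    """Hypothetical design alternative: how often a latency metric is sampled."""
    name: str
    scrape_interval_s: int


@dataclass(frozen=True)
class Fault:
    """Hypothetical injected fault: extra latency during a time window."""
    start_s: int
    duration_s: int
    added_latency_ms: float


def simulate_latency(t: int, fault: Fault, baseline_ms: float = 100.0) -> float:
    """Synthetic service latency at second t (assumption: constant baseline)."""
    if fault.start_s <= t < fault.start_s + fault.duration_s:
        return baseline_ms + fault.added_latency_ms
    return baseline_ms


def run_experiment(config: ObservabilityConfig, fault: Fault,
                   total_s: int = 120, threshold_ms: float = 150.0) -> dict:
    """Run one experiment: sample the metric under `config` while `fault` is active."""
    samples = [(t, simulate_latency(t, fault))
               for t in range(0, total_s, config.scrape_interval_s)]
    # A sample above the threshold counts as the fault being visible.
    hits = [t for t, value in samples if value > threshold_ms]
    time_to_detect: Optional[int] = hits[0] - fault.start_s if hits else None
    return {
        "config": config.name,
        "samples": len(samples),          # rough proxy for observability cost
        "detected": bool(hits),
        "time_to_detect_s": time_to_detect,
    }


# Same fault for every run, so only the observability configuration varies.
fault = Fault(start_s=35, duration_s=20, added_latency_ms=200.0)
configs = [
    ObservabilityConfig("fine", scrape_interval_s=5),
    ObservabilityConfig("medium", scrape_interval_s=15),
    ObservabilityConfig("coarse", scrape_interval_s=60),
]

for cfg in configs:
    print(run_experiment(cfg, fault))
```

In this toy setup the coarse configuration collects the fewest samples but misses the fault entirely, while the fine configuration detects it immediately at a higher sampling cost. That is the kind of trade-off the abstract argues should be measured rather than guessed.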
