AgentFM: Role-Aware Failure Management for Distributed Databases with LLM-Driven Multi-Agents (2504.06614v1)
Abstract: Distributed databases are critical infrastructures for today's large-scale software systems, making effective failure management essential to ensure software availability. However, existing approaches often overlook the role distinctions within distributed databases and rely on small-scale models with limited generalization capabilities. In this paper, we conduct a preliminary empirical study to emphasize the unique significance of different roles. Building on this insight, we propose AgentFM, a role-aware failure management framework for distributed databases powered by LLM-driven multi-agents. AgentFM addresses failure management by considering system roles, data roles, and task roles, with a meta-agent orchestrating these components. Preliminary evaluations using Apache IoTDB demonstrate the effectiveness of AgentFM and open new directions for further research.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Collections
Sign up for free to add this paper to one or more collections.