Agentic Lybic: Adaptive Multi-Agent FSM
- Agentic Lybic is a multi-agent execution system structured as an FSM that enables dynamic orchestration and continuous quality assessment for desktop automation.
- It employs a four-level hierarchy with specialized roles (Controller, Manager, Workers, Evaluator) to manage complex, multi-step workflows effectively.
- Empirical evaluation on the OSWorld benchmark shows improved reliability and efficiency, with a state-of-the-art 57.07% success rate within 50 steps.
Agentic Lybic refers to a multi-agent execution system organized as a finite-state machine (FSM) that enables dynamic orchestration, tiered reasoning, and robust quality control for complex desktop automation tasks (Guo et al., 14 Sep 2025). The architecture is specifically engineered to address the limitations of prior agent-based automation frameworks in coordinating multi-step operations, managing error recovery, and adaptively optimizing execution strategies. Its deployment has established new performance benchmarks for generalized desktop automation, demonstrating the impact of principled multi-agent orchestration combined with continuous feedback-driven quality assessment.
1. Finite-State Machine (FSM) Architecture
Agentic Lybic executes all orchestration and control logic via an explicit FSM that routes between six well-defined controller situations: REPLAN, SUPPLEMENT, GET_ACTION, QUALITY_CHECK, FINAL_CHECK, and EXECUTE_ACTION. The global state encodes task, subtask, execution, and controller status; transitions are governed by trigger codes together with the actions taken and the observations received.
This FSM organizes the multi-agent assembly, dynamically switching between planning, action, verification, and replanning phases as new signals, failures, or feedback arise.
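The routing described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `GlobalState` fields, the `step` function, and most trigger names in `TRIGGER_TABLE` are hypothetical. Only the six situation names and the triggers CANNOT_EXECUTE, PERIODIC_CHECK, WORKER_STALE, and WORKER_SUCCESS come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Situation(Enum):
    """Controller situations named in the text."""
    REPLAN = auto()
    SUPPLEMENT = auto()
    GET_ACTION = auto()
    EXECUTE_ACTION = auto()
    QUALITY_CHECK = auto()
    FINAL_CHECK = auto()


@dataclass
class GlobalState:
    """Hypothetical container for the status fields the text lists."""
    task_status: str = "pending"
    subtask_status: str = "pending"
    execution_status: str = "idle"
    situation: Situation = Situation.REPLAN
    # Each entry records (trigger, action, observation) for one transition.
    history: list = field(default_factory=list)


# Hypothetical trigger table: (current situation, trigger) -> next situation.
# Trigger names other than CANNOT_EXECUTE, PERIODIC_CHECK, WORKER_STALE and
# WORKER_SUCCESS are illustrative assumptions.
TRIGGER_TABLE = {
    (Situation.REPLAN, "PLAN_READY"): Situation.GET_ACTION,
    (Situation.GET_ACTION, "ACTION_READY"): Situation.EXECUTE_ACTION,
    (Situation.GET_ACTION, "CANNOT_EXECUTE"): Situation.REPLAN,
    (Situation.GET_ACTION, "ALL_SUBTASKS_DONE"): Situation.FINAL_CHECK,
    (Situation.EXECUTE_ACTION, "ACTION_DONE"): Situation.GET_ACTION,
    (Situation.EXECUTE_ACTION, "PERIODIC_CHECK"): Situation.QUALITY_CHECK,
    (Situation.EXECUTE_ACTION, "WORKER_STALE"): Situation.QUALITY_CHECK,
    (Situation.EXECUTE_ACTION, "WORKER_SUCCESS"): Situation.QUALITY_CHECK,
    (Situation.QUALITY_CHECK, "CHECK_PASSED"): Situation.GET_ACTION,
    (Situation.QUALITY_CHECK, "CHECK_FAILED"): Situation.REPLAN,
    (Situation.QUALITY_CHECK, "NEED_INFO"): Situation.SUPPLEMENT,
    (Situation.SUPPLEMENT, "INFO_READY"): Situation.REPLAN,
    (Situation.FINAL_CHECK, "CHECK_FAILED"): Situation.REPLAN,
}


def step(state, trigger, action=None, observation=None):
    """Apply one transition: look up the trigger, log it, update the situation."""
    key = (state.situation, trigger)
    if key not in TRIGGER_TABLE:
        raise ValueError(f"no transition defined for {key}")
    state.history.append((trigger, action, observation))
    state.situation = TRIGGER_TABLE[key]
    return state
```

In this sketch the controller's behavior lives entirely in the table, so adding a recovery path means adding a row rather than changing control flow.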
2. Component Roles and Tiered Reasoning
The system is composed as a four-level hierarchy:
- Controller: Central routing entity that maintains the global state and drives all transitions, informed by a coded trigger table. It ensures that status updates (e.g., error, subtask readiness) prompt the appropriate FSM transitions.
- Manager: Responsible for decomposing user objectives into subtasks, constructing a Directed Acyclic Graph (DAG) of dependencies, and adjusting plans through lightweight, medium, or heavyweight replanning routines.
- Worker Subsystem:
  - Technician: Executes system commands (file ops, scripting, terminal) via code-driven interfaces.
  - Operator: Handles GUI tasks (clicking, typing, scrolling) using vision-LLMs for environment navigation.
  - Analyst: Mediates complex decision support, interpreting ambiguous states or questions and supporting resolution through reasoning.
- Evaluator: Implements continuous quality gating through multiple triggers (PERIODIC_CHECK, WORKER_STALE, WORKER_SUCCESS), compares visual/semantic states using similarity and progress functions, and signals one of four gate decisions.
Component | Key Functions | Triggered FSM States |
---|---|---|
Controller | Global state management, routing | All (based on transitions) |
Manager | Task decomposition, replanning | REPLAN, SUPPLEMENT |
Workers | Task execution (code, GUI, analysis) | GET_ACTION, EXECUTE_ACTION |
Evaluator | Multi-trigger quality check, progress monitoring | QUALITY_CHECK, FINAL_CHECK, error recovery |
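As an illustration of the Manager's dependency DAG, the sketch below uses Python's standard `graphlib` module to release subtasks in dependency order. The subtask names and the `ready_batches` helper are hypothetical; the source does not specify how the DAG is stored or traversed.

```python
from graphlib import TopologicalSorter

# Hypothetical subtasks: each key maps to the set of subtasks it depends on.
dependencies = {
    "open_spreadsheet": set(),
    "export_csv": {"open_spreadsheet"},
    "open_terminal": set(),
    "run_script": {"export_csv", "open_terminal"},
}


def ready_batches(deps):
    """Yield sorted batches of subtasks whose dependencies are all complete."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    while sorter.is_active():
        batch = sorted(sorter.get_ready())
        yield batch
        sorter.done(*batch)  # mark the batch complete to unlock successors


batches = list(ready_batches(dependencies))
```

Here `batches` groups subtasks that could be dispatched together. A replanning step would, under this assumption, amount to editing `dependencies` and rebuilding the sorter.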
3. FSM-Based Routing and Adaptivity
Unlike “delegate-and-forget” models, routing in Agentic Lybic is a continuous process supported by the FSM:
- Subtasks prompt GET_ACTION transitions, assigning Worker roles dynamically based on task modality.
- Errors (CANNOT_EXECUTE) and stagnation signals invoke QUALITY_CHECK or REPLAN, rerouting tasks to the Manager for objective reevaluation or supplementary input.
- Evaluator feedback on similarity/progress functions and uncertainty assessments drives system recovery or continuation.
This adaptive routing supports flexible selection of optimal strategies for each subtask, generalizing execution across a diverse set of desktop environments.
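Modality-based Worker assignment can be sketched as a simple dispatch table. The modality labels (`"code"`, `"gui"`, `"analysis"`), the stub worker functions, and the `get_action` helper are assumptions made for illustration; only the three Worker roles and the CANNOT_EXECUTE signal come from the text.

```python
def technician(subtask: str) -> str:
    """Stub for the Technician role (system commands, scripts)."""
    return f"[technician] run command for: {subtask}"


def operator(subtask: str) -> str:
    """Stub for the Operator role (GUI interaction)."""
    return f"[operator] perform GUI action for: {subtask}"


def analyst(subtask: str) -> str:
    """Stub for the Analyst role (reasoning over ambiguous states)."""
    return f"[analyst] reason about: {subtask}"


# Hypothetical mapping from a subtask's modality to the Worker role handling it.
WORKERS = {"code": technician, "gui": operator, "analysis": analyst}


def get_action(subtask: str, modality: str) -> str:
    """Route a subtask to a Worker, or signal CANNOT_EXECUTE if none applies."""
    worker = WORKERS.get(modality)
    if worker is None:
        return "CANNOT_EXECUTE"
    return worker(subtask)
```

In the described system a CANNOT_EXECUTE result would feed back into the FSM as a trigger, so this sketch returns it as a plain string instead of raising.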
4. Quality Control, Error Recovery, and Continuous Monitoring
Agentic Lybic implements robust and continuous quality assessment through the Evaluator:
- Periodic checks detect task stagnation.
- Worker state monitoring prevents repetition beyond the defined threshold.
- Error recovery leverages immediate replanning, reducing error propagation and maintaining execution reliability.
This framework allows proactive detection and management of execution anomalies, facilitating robust recovery and adaptive replanning in long-horizon, multi-application workflows.
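The monitoring behavior described above might look roughly like the following sketch. It assumes text observations and uses `difflib.SequenceMatcher` as a stand-in for the visual/semantic similarity function; the `Evaluator` class, its parameters, and the default thresholds are all hypothetical.

```python
from collections import deque
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Text similarity in [0, 1]; a stand-in for visual/semantic comparison."""
    return SequenceMatcher(None, a, b).ratio()


class Evaluator:
    """Emits PERIODIC_CHECK or WORKER_STALE triggers from a stream of steps."""

    def __init__(self, period=5, repeat_threshold=3, similarity_threshold=0.95):
        self.period = period
        self.similarity_threshold = similarity_threshold
        self.recent = deque(maxlen=repeat_threshold)  # last N actions
        self.steps = 0
        self.last_obs = None

    def observe(self, action: str, observation: str):
        """Return a trigger name, or None if no check is due at this step."""
        self.steps += 1
        self.recent.append(action)
        # Stalled: the same action filled the whole window ...
        repeated = (len(self.recent) == self.recent.maxlen
                    and len(set(self.recent)) == 1)
        # ... and the environment barely changed since the previous step.
        unchanged = (self.last_obs is not None
                     and similarity(self.last_obs, observation)
                     >= self.similarity_threshold)
        self.last_obs = observation
        if repeated and unchanged:
            return "WORKER_STALE"
        if self.steps % self.period == 0:
            return "PERIODIC_CHECK"
        return None
```

With the defaults, three identical actions against an unchanged observation yield WORKER_STALE on the third step, and every fifth step otherwise yields PERIODIC_CHECK.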
5. Empirical Evaluation and Benchmarks
The system was evaluated on the OSWorld benchmark, covering 361 tasks distributed across browser, editor, and cross-application workflows. Notably, Agentic Lybic achieved a state-of-the-art 57.07% success rate within 50 steps, surpassing prior baselines (e.g., CoAct-1 at 56.39%, Agent S2.5 at 54.21%). Efficiency gains were achieved by reducing execution steps and increasing reliability, especially in complex task domains such as Chrome and Impress.
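To put the reported margins in perspective, the short calculation below computes the gap in percentage points between Agentic Lybic and each baseline named above. It uses only the success rates quoted in the text; `Decimal` is used so the subtraction is exact.

```python
from decimal import Decimal

# Success rates (percent) as quoted in the text.
success_rates = {
    "Agentic Lybic": Decimal("57.07"),
    "CoAct-1": Decimal("56.39"),
    "Agent S2.5": Decimal("54.21"),
}

# Margin of Agentic Lybic over each baseline, in percentage points.
margins = {
    name: success_rates["Agentic Lybic"] - rate
    for name, rate in success_rates.items()
    if name != "Agentic Lybic"
}

for name, margin in margins.items():
    print(f"Agentic Lybic leads {name} by {margin} percentage points")
```

The lead over CoAct-1 is under one percentage point, while the lead over Agent S2.5 is close to three.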
6. Applications and Implications
Agentic Lybic’s principled design expands the reach of desktop automation to generalized, multi-step operational environments:
- Desktop and Enterprise Automation: Automates diverse workflows in office environments with heterogeneous applications.
- Complex Computing Environments: Coordinates direct GUI and low-level script execution, applicable to IT operations and technical support pipelines.
- Framework for Autonomous Agents: Demonstrates scalable layered reasoning adaptable to broader real-world autonomous agent research.
Its architecture demonstrates that continuous-feedback FSM orchestration, layered role specialization, and principled error recovery substantially enhance the reliability and generalization of multi-agent systems for desktop automation.
7. Significance and Future Outlook
Agentic Lybic establishes a reference architecture for multi-agent execution in dynamic desktop environments and highlights the value of stateful orchestration, explicit quality gating, and adaptive error recovery. Potential extensions include more granular role specialization, refinement of FSM granularity, automated discovery of optimal routing policies, and integration with cross-environment execution frameworks.
This systematic approach presents a pathway for generalized, reliable, and adaptive automatic execution pipelines in complex and uncertain computing environments, addressing long-standing deficits in agent coordination and continuous quality control.