SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills (2504.07079v1)

Published 9 Apr 2025 in cs.AI, cs.CL, and cs.CV

Abstract: To survive and thrive in complex environments, humans have evolved sophisticated self-improvement mechanisms through environment exploration, hierarchical abstraction of experiences into reusable skills, and collaborative construction of an ever-growing skill repertoire. Despite recent advancements, autonomous web agents still lack crucial self-improvement capabilities, struggling with procedural knowledge abstraction, refining skills, and skill composition. In this work, we introduce SkillWeaver, a skill-centric framework enabling agents to self-improve by autonomously synthesizing reusable skills as APIs. Given a new website, the agent autonomously discovers skills, executes them for practice, and distills practice experiences into robust APIs. Iterative exploration continually expands a library of lightweight, plug-and-play APIs, significantly enhancing the agent's capabilities. Experiments on WebArena and real-world websites demonstrate the efficacy of SkillWeaver, achieving relative success rate improvements of 31.8% and 39.8%, respectively. Additionally, APIs synthesized by strong agents substantially enhance weaker agents through transferable skills, yielding improvements of up to 54.3% on WebArena. These results demonstrate the effectiveness of honing diverse website interactions into APIs, which can be seamlessly shared among various web agents.

Summary

Overview of "SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills"

The paper presents SkillWeaver, a framework that enables autonomous web agents to self-improve. It targets the challenges that agents face in complex and diverse digital environments. Specifically, SkillWeaver lets web agents autonomously synthesize and refine reusable skills as APIs, which expands their capabilities. The process consists of discovering skills on new websites, executing them for practice, and distilling the practice experiences into robust APIs that are added to the agent’s library.

Key Methodological Contributions

SkillWeaver employs a three-stage process to facilitate autonomous skill improvement:

Skill Proposal: Autonomous agents systematically explore website environments to identify tasks that require procedural, navigational, or information-seeking skills. The selection is driven by the potential utility and complexity of the skills required.
Skill Synthesis: Practiced skills are transformed into reusable APIs in Python, encapsulating learned behaviors into structured formats. The synthesis involves integrating feedback from a reward model, which judges the success of skill executions.
Skill Honing: Synthesized APIs undergo rigorous testing and debugging to ensure reliability and effectiveness during inference, employing automatically generated test cases to validate robustness.
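The three stages above can be sketched as a small loop. This is a toy illustration under stated assumptions, not the paper's implementation: the names `Skill`, `propose`, `synthesize`, `hone`, and `explore` are hypothetical, the "skills" are plain string functions rather than browser automation code, the draft implementations stand in for code an agent would generate, and the `judge` callable stands in for the reward model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class Skill:
    name: str
    api: Callable[[str], str]        # the synthesized "API" (toy: str -> str)
    tests: List[Tuple[str, str]]     # (input, expected output) test cases


def propose(candidates: List[str], library: Dict[str, Skill],
            attempted: Set[str]) -> Optional[str]:
    """Stage 1 (proposal): pick a task not yet in the library or tried."""
    for name in candidates:
        if name not in library and name not in attempted:
            return name
    return None


def synthesize(name: str, draft: Callable[[str], str], practice_input: str,
               judge: Callable[[str], bool],
               tests: List[Tuple[str, str]]) -> Optional[Skill]:
    """Stage 2 (synthesis): practice once; keep the draft only if the
    judge (a stand-in for a reward model) accepts the practice run."""
    output = draft(practice_input)
    return Skill(name, draft, tests) if judge(output) else None


def hone(skill: Skill) -> bool:
    """Stage 3 (honing): run the test cases; reject the API on any failure."""
    return all(skill.api(arg) == expected for arg, expected in skill.tests)


def explore(candidates: List[str], drafts: Dict[str, Callable[[str], str]],
            tests: Dict[str, List[Tuple[str, str]]],
            judge: Callable[[str], bool], iterations: int) -> Dict[str, Skill]:
    """Iterate propose -> synthesize -> hone, growing the skill library."""
    library: Dict[str, Skill] = {}
    attempted: Set[str] = set()
    for _ in range(iterations):
        name = propose(candidates, library, attempted)
        if name is None:
            break
        attempted.add(name)
        skill = synthesize(name, drafts[name], "  Red Shoes ", judge, tests[name])
        if skill is not None and hone(skill):
            library[name] = skill
    return library


# Hypothetical drafts: the second one has a bug (spaces are not encoded).
DRAFTS = {
    "normalize_query": lambda q: q.strip().lower(),
    "build_search_url": lambda q: "/search?q=" + q,
}
TESTS = {
    "normalize_query": [("  Red Shoes ", "red shoes")],
    "build_search_url": [("red shoes", "/search?q=red+shoes")],
}

library = explore(list(DRAFTS), DRAFTS, TESTS,
                  judge=lambda out: bool(out), iterations=5)
print(sorted(library))
```

In this sketch the buggy draft passes the lenient judge during practice but fails its test case during honing, so only the working skill reaches the library. That is the role the summary attributes to the honing stage: catching unreliable APIs before they are used at inference time.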

Experimental Results

Experiments were conducted on the WebArena benchmark, which mirrors real-world website interactions, and on live websites using the Online-Mind2Web benchmark. SkillWeaver achieved relative success rate improvements of 31.8% on WebArena and 39.8% on real-world websites. In addition, weaker agents equipped with APIs synthesized by stronger agents improved by up to 54.3% on WebArena.
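The abstract reports these gains as relative improvements, so it can help to see how a relative gain maps to absolute success rates. The sketch below uses the standard definition (improved minus baseline, divided by baseline). The 20.0% baseline is a hypothetical number chosen for illustration and does not come from the paper; the function names are likewise illustrative.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain in percent: (improved - baseline) / baseline * 100."""
    if baseline <= 0:
        raise ValueError("baseline success rate must be positive")
    return (improved - baseline) / baseline * 100.0


def apply_relative_gain(baseline: float, gain_pct: float) -> float:
    """Success rate obtained by applying a relative gain to a baseline."""
    return baseline * (1.0 + gain_pct / 100.0)


# Hypothetical baseline of 20.0% (NOT a figure from the paper),
# combined with the reported 39.8% relative improvement.
baseline = 20.0
improved = apply_relative_gain(baseline, 39.8)
print(round(improved, 2))
print(round(relative_improvement(baseline, improved), 1))
```

A 39.8% relative improvement is therefore not a gain of 39.8 percentage points: on the hypothetical 20.0% baseline it corresponds to an absolute increase of about 8 points.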

Theoretical and Practical Implications

SkillWeaver’s approach underscores the value of autonomous exploration and contextual understanding within digital environments. By abstracting procedural knowledge into shareable APIs, the framework improves individual agent performance and also enables cross-agent knowledge transfer, which points to the potential for collaborative improvement among AI systems.

From a practical standpoint, the paper illustrates how web agents can improve their decision-making and become better equipped to handle previously unseen environments. As the training and underlying capabilities of base agents advance, more complex skills could be synthesized.

Future Developments

Looking forward, the integration of more sophisticated forms of environment understanding and dynamic adaptation is expected to further increase web agents' capabilities. As agents grow stronger and are better at programmatically handling complex tasks and long-term planning, SkillWeaver may evolve to support increasingly sophisticated interactions with digital environments.

By advancing web agent capabilities, self-improvement frameworks like SkillWeaver make it increasingly feasible to automate complex workflows and enhance user productivity in digital environments.