COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Published 29 May 2026 in cs.AI, cs.CL, and cs.LG | (2605.31264v1)

Abstract: LLM agents are increasingly expected not only to complete isolated tasks, but also to carry bounded representations of human expertise, judgment, and interaction style. Building such person-grounded agents remains difficult because actionable knowledge associated with a person or role is usually embedded in heterogeneous traces rather than written as clean instructions. Existing memory and persona systems capture fragments of this evidence, while skill frameworks provide portable packaging formats; however, there is no end-to-end workflow for distilling these traces into inspectable, correctable, and agent-usable skills. We present an automated trace-to-skill distillation system for generating person-grounded AI skills via expert knowledge distillation. Given materials from a target person or role, COLLEAGUE.SKILL produces a versioned skill package with two coordinated tracks: a capability track for practices, mental models, and decision heuristics, and a bounded behavior track for communication style, interaction rules, and correction history. The package can be inspected, invoked, updated through natural-language feedback, rolled back, installed across agent hosts, and optionally prepared for controlled distribution. We describe the artifact contract, generation workflow, correction lifecycle, deployment surface, and domain presets implemented in the open-source system. At the time of writing, the public repository has approximately 18.5k GitHub stars; the gallery lists 215 skills from 165 contributors and more than 100k cumulative stars across listed skill cards. The system illustrates how person-grounded skills can be represented as portable, correctable packages rather than opaque prompts or hidden memories.


Authors (5)

Summary

The paper presents an automated system that distills person-grounded skills from heterogeneous traces of human activity and packages them as auditable artifacts.
It utilizes a dual representation architecture to separate capabilities from persona, ensuring skills remain correctable and governed by explicit metadata.
The system demonstrates practical deployment with modular presets for work, public figures, and relationships while supporting full lifecycle management and version control.

COLLEAGUE.SKILL: Automated Distillation of Person-Grounded AI Skills

Motivation and Problem Formulation

COLLEAGUE.SKILL introduces a principled methodology for extracting person-grounded "skills" from heterogeneous traces of human activity, such as chat logs, code reviews, and emails, and packaging them into portable, inspectable, and correctable artifacts usable by LLM-based agents (2605.31264). The core motivation moves beyond persona simulation and memory-augmented agents by explicitly addressing the need for bounded, auditable transfer of human expertise, decision heuristics, and interaction style across work, public-figure, and relationship domains under explicit evidence and governance constraints.

The formulation treats skill generation as an artifact construction problem. Given a target person or role, defined evidence scope, and supporting materials, the system outputs a versioned skill package adhering to five criteria: portability (skills can be loaded by compatible agents), inspectability (users review contents, judgments, limits), composability (capability and persona separated and individually callable), correctability (natural-language feedback updates the skill), and governability (explicit metadata, provenance, and distribution controls).

System Architecture and Artifact Model

The paper details a modular, extensible pipeline for automated skill distillation with a dual representation architecture (Figure 1). Input traces are pre-processed, parsed, and normalized. The system then extracts durable capabilities (practices, heuristics, decision models) and bounded behavioral style (interaction rules, correction history) into structured Markdown files. These files are packaged with machine-readable metadata, version control, and installers compatible with agent hosts such as Claude Code, Codex, and Hermes.

Figure 1: COLLEAGUE.SKILL architecture for automated person-grounded skill generation. The shared distillation core renders portable agent-skill artifacts; domain presets add source requirements, evidence checks, consent assumptions, and lifecycle or gallery metadata.
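The flow described above (normalize traces, extract two tracks, render Markdown) can be sketched as follows. This is a minimal illustration and not the project's code: the names Trace, Distilled, normalize, extract, and distill are hypothetical, and the extraction step is replaced by a trivial source-based rule so the sketch runs offline, whereas the real system presumably uses a model for that step.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """One piece of source material (hypothetical shape)."""
    source: str  # e.g. "chat", "code_review", "email"
    text: str


@dataclass
class Distilled:
    """The two coordinated tracks: capabilities and bounded behavior."""
    capabilities: list[str] = field(default_factory=list)
    behaviors: list[str] = field(default_factory=list)


def normalize(traces: list[Trace]) -> list[Trace]:
    """Collapse runs of whitespace and drop traces that end up empty."""
    cleaned = []
    for trace in traces:
        text = " ".join(trace.text.split())
        if text:
            cleaned.append(Trace(trace.source, text))
    return cleaned


def extract(traces: list[Trace]) -> Distilled:
    """Placeholder for extraction (assumption: routes purely by source type)."""
    result = Distilled()
    for trace in traces:
        if trace.source == "code_review":
            result.capabilities.append(trace.text)
        else:
            result.behaviors.append(trace.text)
    return result


def render_markdown(title: str, items: list[str]) -> str:
    """Render one track as a small Markdown document."""
    lines = [f"# {title}", ""] + [f"- {item}" for item in items]
    return "\n".join(lines) + "\n"


def distill(traces: list[Trace]) -> dict[str, str]:
    """Run the sketch end to end and return file name -> file content."""
    distilled = extract(normalize(traces))
    return {
        "work.md": render_markdown("Capabilities", distilled.capabilities),
        "persona.md": render_markdown("Behavior", distilled.behaviors),
    }
```

Keeping the two tracks in separate return values mirrors the dual representation: either file can be reviewed or edited without touching the other.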

Artifacts include:

SKILL.md: Combined skill with explicit structure (frontmatter, capabilities, persona constraints)
work.md, persona.md: Editable source documents for capability and style, each independently invocable
manifest.json, meta.json: Metadata for installation, host compatibility, lifecycle state, provenance, correction log

By enforcing a schema and standardizing entrypoints, the system enables skills to be individually auditable, shareable, revisable, and revocable.
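To make the artifact list concrete, the sketch below writes the five named files into a package directory. Only the file names come from the paper summary; every field inside the frontmatter, manifest.json, and meta.json (name, version, entrypoints, hosts, lifecycle_state, provenance, corrections) and the host identifier strings are assumptions chosen for illustration, not the project's actual schema.

```python
import json
from pathlib import Path


def write_package(root: Path, name: str, work_md: str, persona_md: str,
                  version: str = "0.1.0") -> Path:
    """Write a skill package directory (illustrative layout, assumed fields)."""
    pkg = root / name
    pkg.mkdir(parents=True, exist_ok=True)

    # Editable per-track sources.
    (pkg / "work.md").write_text(work_md, encoding="utf-8")
    (pkg / "persona.md").write_text(persona_md, encoding="utf-8")

    # Combined skill: frontmatter, then capabilities, then persona constraints.
    skill = (
        "---\n"
        f"name: {name}\n"
        f"version: {version}\n"
        "---\n\n"
        "## Capabilities\n\n" + work_md.strip() + "\n\n"
        "## Persona constraints\n\n" + persona_md.strip() + "\n"
    )
    (pkg / "SKILL.md").write_text(skill, encoding="utf-8")

    # Installation-facing metadata (field names are assumptions).
    manifest = {
        "name": name,
        "version": version,
        "entrypoints": {"combined": "SKILL.md", "capability": "work.md",
                        "persona": "persona.md"},
        "hosts": ["claude-code", "codex", "hermes"],
    }
    # Lifecycle-facing metadata (field names are assumptions).
    meta = {"lifecycle_state": "draft", "provenance": [], "corrections": []}

    (pkg / "manifest.json").write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    (pkg / "meta.json").write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return pkg
```

Separating installation metadata from lifecycle metadata is one plausible reading of why two JSON files exist; the summary does not say how the listed responsibilities are split between them.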

Application Domains and Presets

The system provides three primary domain presets (Figure 2):

Colleague: Default; distills work practices, decision heuristics, review standards from enterprise traces.
Celebrity/Public-Figure: Extracts public evidence (speeches, interviews, writings), encodes source boundaries, and marks inferences explicitly.
Relationship: Models private interaction patterns, emphasizes local control, consent, and strict privacy.

Each preset reuses the same distillation architecture with domain-specific prompt tuning, evidence requirements, and consent assumptions, demonstrating extensibility.

Figure 2: Application presets layered on the COLLEAGUE.SKILL person-grounded skill pipeline. The shared artifact workflow branches into colleague, celebrity, and relationship presets with different evidence scopes, governance requirements, and invocation aliases.
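One way to picture presets layered on a shared core is a small table of per-domain settings that the pipeline looks up by name. The sketch below is hypothetical: the Preset fields (evidence_scope, mark_inferences, local_only) and their values are assumptions derived loosely from the descriptions above, not the system's actual configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Preset:
    """Domain-specific settings applied on top of the shared pipeline."""
    name: str
    evidence_scope: str     # what kind of material the preset expects
    mark_inferences: bool   # label claims that go beyond the sources
    local_only: bool        # keep the resulting package off shared surfaces


# Assumed values for illustration only.
PRESETS = {
    "colleague": Preset("colleague", "work traces",
                        mark_inferences=False, local_only=False),
    "celebrity": Preset("celebrity", "public sources",
                        mark_inferences=True, local_only=False),
    "relationship": Preset("relationship", "private conversations",
                           mark_inferences=False, local_only=True),
}


def resolve_preset(name: Optional[str] = None) -> Preset:
    """Return the named preset, falling back to the colleague default."""
    key = name or "colleague"
    if key not in PRESETS:
        raise ValueError(f"unknown preset: {key!r}")
    return PRESETS[key]
```

Because each preset is plain data, adding a domain would mean adding a table entry rather than a new pipeline, which is the extensibility argument made above.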

Correction, Versioning, and Lifecycle Management

COLLEAGUE.SKILL provides a full artifact lifecycle, supporting versioning, rollback, and iterative refinement through natural-language feedback (Figure 3). Each correction generates a targeted Markdown patch or an explicit correction record, updates the relevant artifact section, increments the version, and preserves a rollback point so that changes can be audited and reverted.

Figure 3: Lifecycle loop for generated skills. Corrections and patches create new versions while preserving rollback points.

The system differentiates between capability corrections (e.g., revised review heuristics) and behavior corrections (e.g., interaction style or refusal patterns), producing maintainable change logs and making artifacts tractable for collaborative improvement and external audit.
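The correction loop can be sketched as a small in-memory history: each correction snapshots the current state, appends to the matching track, bumps the version, and records a log entry. This is a simplified sketch under stated assumptions: Correction and SkillHistory are hypothetical names, versions are plain integers, and a correction is appended as a line rather than applied as a targeted patch.

```python
from dataclasses import dataclass, field


@dataclass
class Correction:
    track: str  # "capability" or "behavior"
    note: str   # the natural-language feedback, already condensed


@dataclass
class SkillHistory:
    work_md: str
    persona_md: str
    version: int = 1
    log: list = field(default_factory=list)
    _snapshots: dict = field(default_factory=dict, repr=False)

    def apply(self, correction: Correction) -> int:
        """Apply one correction and return the new version number."""
        if correction.track not in ("capability", "behavior"):
            raise ValueError(f"unknown track: {correction.track!r}")
        # Preserve a rollback point for the version being replaced.
        self._snapshots[self.version] = (self.work_md, self.persona_md)
        line = f"- Correction: {correction.note}\n"
        if correction.track == "capability":
            self.work_md += line
        else:
            self.persona_md += line
        self.version += 1
        self.log.append((self.version, correction.track, correction.note))
        return self.version

    def rollback(self, version: int) -> None:
        """Restore the content saved for an earlier version."""
        if version not in self._snapshots:
            raise KeyError(f"no rollback point for version {version}")
        self.work_md, self.persona_md = self._snapshots[version]
        # Discard rollback points that are newer than the restored version.
        for newer in [v for v in self._snapshots if v > version]:
            del self._snapshots[newer]
        self.version = version
        self.log.append((version, "rollback", f"restored version {version}"))
```

The log is append-only even across rollbacks, so a reviewer can see that a behavior correction was made and later reverted.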

Deployment, Gallery Distribution, and Community Metrics

Public deployment is facilitated through an open-source repository, a skill gallery, and explicit metadata surfaces. As of 2026-05-28, the public repository had approximately 18.5k GitHub stars, and the gallery listed 215 skills from 165 contributors with more than 100k cumulative stars across the listed skill cards, which suggests substantial ecosystem engagement (Figure 4).

Figure 4: Observed public deployment counters on 2026-05-28. Counts summarize repository activity, gallery scale, and cumulative public signals; they indicate deployment and distribution surface rather than task performance, behavioral fidelity, or adoption-quality metrics.

Deployment emphasizes artifact portability between agent hosts, modular skill installation, and controlled public sharing. Gallery publication is opt-in, subject to explicit evidence, rights, and governance metadata.
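The opt-in rule for gallery publication can be illustrated as a pre-publication check over a package's metadata. The key names used here (publish_opt_in, evidence, rights, governance) are assumptions for illustration; the summary states only that publication depends on opt-in plus evidence, rights, and governance metadata.

```python
# Metadata sections assumed to be required before publication.
REQUIRED_FIELDS = ("evidence", "rights", "governance")


def publication_blockers(meta: dict) -> list[str]:
    """Return the reasons a package cannot be published (empty list if none)."""
    blockers = []
    # Publication is opt-in: anything other than an explicit True blocks it.
    if meta.get("publish_opt_in") is not True:
        blockers.append("owner has not opted in")
    for key in REQUIRED_FIELDS:
        if not meta.get(key):
            blockers.append(f"missing {key} metadata")
    return blockers
```

Returning the full list of blockers, rather than a single boolean, gives the package owner something actionable to fix before resubmitting.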

COLLEAGUE.SKILL advances agent-usable skills beyond prior art in in-context memory [packersMemGPT2023], tool-augmented agents [schickToolformer2023], and skill libraries [wangVoyager2023, skillx2026, maSkillGen2026, yangAutoSkill2026]. It does not attempt behavioral cloning or open-ended persona simulation [shaoCharacterLLM2023, wangRoleLLM2024, parkGenerativeAgents2023], remaining focused on artifact-level standards for reviewable, correctable, and auditable knowledge transfer. This explicit separation of factual skill, procedural heuristics, and bounded interaction behavior addresses persistent shortcomings in prompt-only or memory-augmented personalization frameworks.

Limitations and Responsible Use

The authors limit their claim to the artifact: COLLEAGUE.SKILL does not claim to produce high-fidelity person simulations or to guarantee better downstream task performance; these questions are left for future human-in-the-loop studies. Risks around editor bias, emotional overattachment, consent, and privacy are acknowledged, especially in sensitive domains such as relationships and public figures. The transparency, governability, and versioning built into the artifact are positioned as mitigations, but they do not remove the need for rigorous source governance and for keeping use optional for users.

Implications and Future Directions

The artifact-centric approach to distilling person-grounded skills represents a pivot from black-box prompts or monolithic agent memory to explicit, composable, and auditable capability units. This artifact-oriented framing has both practical and theoretical significance: it enables transparent skill transfer across agent platforms, facilitates remediation and deletion (critical for regulatory compliance and enterprise trust), and provides concrete handles for benchmarking extraction quality, provenance, and governance.

Future research can extend in several important directions:

Empirical benchmarking of skill artifact utility versus behavioral fidelity to source individuals in realistic tasks
Expansion of artifact schemas to support collaborative skill co-evolution, granular access controls, and automated safety auditing
Systematic evaluation of risk/utility trade-offs in capability-only vs. persona-only vs. combined artifacts across domains
Integration with next-generation multi-agent collaboration and distributed skill marketplaces

Conclusion

COLLEAGUE.SKILL operationalizes the concept of person-grounded skill distillation as an explicit, correctable, and portable artifact construction workflow (2605.31264). Its key contributions are the dual-track artifact model, full lifecycle management, and a standardized deployment surface that supports both private installation and repository-based sharing. By scoping skill extraction as artifact engineering rather than simulation, COLLEAGUE.SKILL enables controlled, auditable, and iterative knowledge transfer across agentic systems. This framework sets a precedent for future digital doubles and agent extension paradigms that foreground transparency, provenance, correction, and governance over black-box imitation.
