Augmenting Autotelic Agents with LLMs
The paper "Augmenting Autotelic Agents with LLMs" introduces a framework for enhancing the goal representation and goal generation capabilities of autotelic agents by integrating pretrained language models (LMs). Autotelic agents are characterized by their ability to autonomously define and pursue their own goals within a given environment, which makes it possible to explore unbounded skill spaces without relying on predefined goal constructs. Leveraging the common-sense reasoning and cultural knowledge embedded in large LMs, the proposed agents aim to generate and pursue a diverse array of human-relevant goals in a task-agnostic, text-based simulation, specifically CookingWorld.
The research implemented a Language Model Augmented Autotelic Agent (LMA3) architecture comprising three core components: an LM Goal Generator, an LM Relabeler, and an LM Reward Function. LMA3 shows that agents can learn a broad spectrum of skills without relying on hard-coded goal representations or a hand-designed curriculum. The overall system is designed to address a limitation of many artificial agents, which operate within predefined goal spaces that are either overly restricted or, when left unbounded, impractical to explore. By drawing on the LM's capabilities, LMA3 agents autonomously form new goal abstractions and create novel, complex task chains.
Key elements of the LMA3 framework involve generating goals and potential subgoals from past experiences and current capabilities, a capacity analogous to human cultural transmission. The LM Goal Generator exploits historical trajectories to suggest achievable high-level goal structures and potential sub-component challenges, serving as a form of creative problem composition. The LM Relabeler focuses on restructuring past experiences into new sets of goal descriptors, supporting the generation of diverse learning experiences from a single trajectory. The LM Reward Function subsequently evaluates these trajectories for coherence and completion of the expressed goals.
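The interplay of the three components described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Trajectory` class, the function names, the bracketed prompt tags, and the prompt wording are all hypothetical, and `stub_lm` is a canned stand-in for a real language model so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: an "LM" is any callable that maps a prompt string to a completion string.
LM = Callable[[str], str]


@dataclass
class Trajectory:
    """A text-based episode: the actions taken and the observations received."""
    actions: List[str]
    observations: List[str]

    def as_text(self) -> str:
        return "\n".join(f"> {a}\n{o}" for a, o in zip(self.actions, self.observations))


def generate_goal(lm: LM, mastered_goals: List[str], trajectory: Trajectory) -> str:
    """Goal Generator: propose a new goal from past experience and current skills."""
    prompt = (
        "[GOAL]\n"
        f"Mastered goals: {', '.join(mastered_goals) or 'none'}\n"
        f"Past trajectory:\n{trajectory.as_text()}\n"
        "Suggest one new goal."
    )
    return lm(prompt).strip()


def relabel(lm: LM, trajectory: Trajectory) -> List[str]:
    """Relabeler: describe a past trajectory as a list of goals it achieved."""
    prompt = f"[RELABEL]\nTrajectory:\n{trajectory.as_text()}\nList the goals achieved."
    return [line.strip("- ").strip() for line in lm(prompt).splitlines() if line.strip()]


def reward(lm: LM, goal: str, trajectory: Trajectory) -> bool:
    """Reward Function: ask the LM whether the trajectory completes the goal."""
    prompt = (
        "[REWARD]\n"
        f"Goal: {goal}\n"
        f"Trajectory:\n{trajectory.as_text()}\n"
        "Answer yes or no."
    )
    return lm(prompt).strip().lower().startswith("yes")


def stub_lm(prompt: str) -> str:
    """Canned responses standing in for a real LM (hypothetical, for illustration only)."""
    if prompt.startswith("[GOAL]"):
        return "slice the carrot and cook it"
    if prompt.startswith("[RELABEL]"):
        return "- open the fridge\n- take the carrot"
    if prompt.startswith("[REWARD]"):
        goal = prompt.splitlines()[1][len("Goal: "):]
        return "yes" if goal in {"open the fridge", "take the carrot"} else "no"
    return ""


def run_episode(lm: LM, trajectory: Trajectory, mastered_goals: List[str]) -> Dict[str, bool]:
    """Score the generated goal and every relabeled goal against one trajectory."""
    candidates = [generate_goal(lm, mastered_goals, trajectory)] + relabel(lm, trajectory)
    return {goal: reward(lm, goal, trajectory) for goal in candidates}


example = Trajectory(
    actions=["open fridge", "take carrot"],
    observations=["The fridge is open.", "You take the carrot."],
)
outcomes = run_episode(stub_lm, example, mastered_goals=[])
for goal, achieved in outcomes.items():
    print(f"{goal}: {'achieved' if achieved else 'not achieved'}")
```

In this sketch a single trajectory yields several labeled learning signals: the generated goal is judged not achieved, while the two relabeled descriptions are judged achieved. This mirrors the idea of extracting diverse learning experiences from one episode. A real agent would also need a goal-conditioned policy that acts in the environment, which is omitted here.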
The empirical evaluation demonstrated that LMA3 could achieve a high success rate across a varied and human-relevant goal space. The experiments show significant improvements in goal diversity, abstraction, and novelty when traditional autotelic agents are augmented with LMs, compared to baseline methods that lack linguistic augmentation or rely heavily on predefined goal spaces. Moreover, LMA3 was shown to discover and master thousands of distinct goal representations in the CookingWorld environment, a feat unmatched by previous methods.
By focusing on language-driven goal formulation, the paper posits that LMs offer an effective means of simulating the human-like cultural knowledge necessary for open-ended learning. Although LMA3 advances towards more autonomous and generative AI, the paper acknowledges that further developments in reinforcement learning and environment complexity are necessary to build truly open-ended agents. Additionally, the economic and computational feasibility of large-scale LM integration remains a concern for deploying such frameworks.
Overall, the introduction of LMA3 represents a promising step towards more adaptable and self-directed agents capable of evolving in concert with complex environments, emphasizing the transformative potential of integrating LMs in the field of autonomous agent development and artificial general intelligence. Future work could explore multimodal environments and further refine the integration strategy for LLMs to enhance practicality and effectiveness across diverse operational contexts.