Augmenting Autotelic Agents with Large Language Models (2305.12487v1)

Published 21 May 2023 in cs.AI, cs.CL, and cs.LG

Abstract: Humans learn to master open-ended repertoires of skills by imagining and practicing their own goals. This autotelic learning process, literally the pursuit of self-generated (auto) goals (telos), becomes more and more open-ended as the goals become more diverse, abstract and creative. The resulting exploration of the space of possible skills is supported by an inter-individual exploration: goal representations are culturally evolved and transmitted across individuals, in particular using language. Current artificial agents mostly rely on predefined goal representations corresponding to goal spaces that are either bounded (e.g. list of instructions), or unbounded (e.g. the space of possible visual inputs) but are rarely endowed with the ability to reshape their goal representations, to form new abstractions or to imagine creative goals. In this paper, we introduce a LLM augmented autotelic agent (LMA3) that leverages a pretrained LLM (LM) to support the representation, generation and learning of diverse, abstract, human-relevant goals. The LM is used as an imperfect model of human cultural transmission; an attempt to capture aspects of humans' common-sense, intuitive physics and overall interests. Specifically, it supports three key components of the autotelic architecture: 1)~a relabeler that describes the goals achieved in the agent's trajectories, 2)~a goal generator that suggests new high-level goals along with their decomposition into subgoals the agent already masters, and 3)~reward functions for each of these goals. Without relying on any hand-coded goal representations, reward functions or curriculum, we show that LMA3 agents learn to master a large diversity of skills in a task-agnostic text-based environment.

Augmenting Autotelic Agents with LLMs

The paper "Augmenting Autotelic Agents with LLMs" introduces a novel framework for enhancing the goal representation and generation capabilities of autotelic agents through the integration of pretrained LLMs (LMs). Autotelic agents, characterized by their ability to autonomously define and pursue their own goals within a given environment, present a unique opportunity to explore unbounded skill spaces without relying on predefined goal constructs. Leveraging the common-sense reasoning and cultural knowledge embedded in large LMs, the proposed agents aim to generate and execute a diverse array of human-relevant goals in a task-agnostic, text-based simulation, specifically CookingWorld.

The research implements an LLM-Augmented Autotelic Agent (LMA3) architecture comprising three core components: an LM goal generator, an LM relabeler, and an LM reward function. Without relying on hand-coded goal representations, reward functions, or a curriculum, the authors show that LMA3 agents learn a broad spectrum of skills. The system addresses a limitation of many current artificial agents, which operate within predefined goal spaces that are either overly restrictive or, when unbounded, impractical to explore. By leveraging the LM's capabilities, LMA3 agents autonomously form new goal abstractions and compose novel, complex task chains.
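The interplay of the three components can be sketched as a single autotelic episode loop. This is a minimal illustrative sketch, not the paper's implementation: `StubLM` stands in for the pretrained LM, the class and method names are hypothetical, and the environment rollout and policy update are reduced to comments.

```python
class StubLM:
    """Toy stand-in for the pretrained LM behind LMA3's three components."""
    def generate_goal(self, memory):
        # Goal generator: propose a new goal plus a subgoal decomposition.
        goal = f"goal-{len(memory)}"
        return goal, [f"{goal}/sub-{i}" for i in range(2)]
    def relabel(self, traj):
        # Relabeler: describe what was actually achieved; here we pretend
        # one extra goal was recognised in hindsight.
        return traj.attempted + ["incidental-goal"]
    def reward(self, traj, goal):
        # Reward function: score the trajectory against a goal description.
        return 1.0 if goal in traj.attempted else 0.5

class Trajectory:
    def __init__(self):
        self.attempted = []  # goal descriptions pursued during the rollout

class LMA3Agent:
    def __init__(self, lm):
        self.lm = lm
        self.goal_memory = []  # goals discovered or generated so far

    def episode(self):
        # 1) Goal generator proposes a goal and subgoals the agent masters.
        goal, subgoals = self.lm.generate_goal(self.goal_memory)
        traj = Trajectory()
        for sub in subgoals + [goal]:
            traj.attempted.append(sub)   # real agent: policy rollout in env
        # 2) Relabeler turns one rollout into many learning signals.
        achieved = self.lm.relabel(traj)
        # 3) Reward function scores each goal; rewards drive policy updates.
        for g in achieved:
            r = self.lm.reward(traj, g)  # real agent: update policy with r
            self.goal_memory.append(g)
        return achieved

agent = LMA3Agent(StubLM())
first = agent.episode()
```

The key design point the sketch mirrors is that goal generation, relabeling, and reward all flow through the same LM, so no component needs hand-coded goal representations.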

Key elements of the LMA3 framework involve generating goals and potential subgoals from past experiences and current capabilities, a capacity analogous to human cultural transmission. The LM Goal Generator exploits historical trajectories to suggest achievable high-level goals along with their decomposition into subgoals, serving as a form of creative problem composition. The LM Relabeler restructures past experiences into new sets of goal descriptions, extracting diverse learning signals from a single trajectory. The LM Reward Function then evaluates these trajectories for completion of the expressed goals.
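A reward function of this kind can be realised by prompting the LM with the trajectory and a goal description and parsing a yes/no judgment. The template and function below are an illustrative sketch under that assumption; the prompt wording and names are not taken from the paper, and a fake completion function stands in for a real LM call.

```python
# Hypothetical prompt template for an LM-based binary reward function.
REWARD_PROMPT = """You observe a trajectory from a text game:
{trajectory}

Question: did the agent achieve the goal "{goal}"? Answer Yes or No."""

def lm_reward(lm_complete, trajectory, goal):
    """Return 1.0 if the LM judges the goal achieved, else 0.0."""
    answer = lm_complete(REWARD_PROMPT.format(trajectory=trajectory, goal=goal))
    return 1.0 if answer.strip().lower().startswith("yes") else 0.0

# Usage with a fake completion function standing in for a real LM:
fake_lm = lambda prompt: "Yes" if "grow the carrot" in prompt else "No"
r = lm_reward(fake_lm, "go garden > plant carrot > water carrot",
              "grow the carrot")
```

Because the same call works for any natural-language goal string, the agent never needs a hand-coded reward function per goal.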

The empirical evaluation demonstrated that LMA3 could achieve a high success rate across a varied and human-relevant goal space. The experimental design illustrates significant improvements in goal diversity, abstraction, and innovation capabilities when augmenting traditional autotelic agents with LMs, compared to baseline methods lacking in linguistic augmentation or relying heavily on predefined goal spaces. Moreover, LMA3 was shown to discover and master thousands of distinct goal representations in the CookingWorld environment, a feat unmatched by previous methods.

In focusing on language-driven goal formulation, the paper posits that LMs offer an effective means of simulating the human-like cultural knowledge necessary for open-ended learning. Although LMA3 represents an advance toward more autonomous and generative AI, the paper acknowledges that further developments in reinforcement learning and environment complexity are needed to build truly open-ended agents. Additionally, the economic and computational feasibility of large-scale LM integration remains a concern for deploying such frameworks.

Overall, the introduction of LMA3 represents a promising step towards more adaptable and self-directed agents capable of evolving in concert with complex environments, emphasizing the transformative potential of integrating LMs in the field of autonomous agent development and artificial general intelligence. Future work could explore multimodal environments and further refine the integration strategy for LLMs to enhance practicality and effectiveness across diverse operational contexts.

Authors (5)
  1. Cédric Colas (27 papers)
  2. Laetitia Teodorescu (8 papers)
  3. Pierre-Yves Oudeyer (95 papers)
  4. Xingdi Yuan (46 papers)
  5. Marc-Alexandre Côté (42 papers)
Citations (15)