
OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (2405.15568v2)

Published 24 May 2024 in cs.AI

Abstract: Open-ended and AI-generating algorithms aim to continuously generate and solve increasingly complex tasks indefinitely, offering a promising path toward more general intelligence. To accomplish this grand vision, learning must occur within a vast array of potential tasks. Existing approaches to automatically generating environments are constrained within manually predefined, often narrow distributions of environment, limiting their ability to create any learning environment. To address this limitation, we introduce a novel framework, OMNI-EPIC, that augments previous work in Open-endedness via Models of human Notions of Interestingness (OMNI) with Environments Programmed in Code (EPIC). OMNI-EPIC leverages foundation models to autonomously generate code specifying the next learnable (i.e., not too easy or difficult for the agent's current skill set) and interesting (e.g., worthwhile and novel) tasks. OMNI-EPIC generates both environments (e.g., an obstacle course) and reward functions (e.g., progress through the obstacle course quickly without touching red objects), enabling it, in principle, to create any simulatable learning task. We showcase the explosive creativity of OMNI-EPIC, which continuously innovates to suggest new, interesting learning challenges. We also highlight how OMNI-EPIC can adapt to reinforcement learning agents' learning progress, generating tasks that are of suitable difficulty. Overall, OMNI-EPIC can endlessly create learnable and interesting environments, further propelling the development of self-improving AI systems and AI-Generating Algorithms. Project website with videos: https://dub.sh/omniepic

The paper "OMNI-EPIC: Open-endedness via Models of Human Notions of Interestingness with Environments Programmed in Code" presents a novel framework designed for generating an endless stream of diverse and progressively challenging tasks for reinforcement learning (RL) agents. This approach leverages foundation models to automatically create a wide array of learning environments in code, addressing the limitations of previous methods constrained by narrow, predefined distributions of tasks.

Overview

OMNI-EPIC stands out by combining the strengths of Open-endedness via Models of human Notions of Interestingness (OMNI) with Environments Programmed in Code (EPIC). Unlike previous methods limited to narrow task spaces, OMNI-EPIC harnesses foundation models to dynamically generate code that specifies both environments and reward functions. The resulting tasks are tailored to be novel yet solvable, progressively challenging the learning agents.
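To make "environments and reward functions specified in code" concrete, here is an illustrative sketch of the kind of task code the framework might emit. This is not actual OMNI-EPIC output; the class and field names are hypothetical, and the physics simulator the paper targets (PyBullet) is abstracted away into a simple state dictionary for brevity.

```python
class ObstacleCourseTask:
    """Hypothetical generated task: reach the goal quickly
    without touching red obstacles (cf. the abstract's example)."""

    GOAL_X = 10.0          # course length, in metres (assumed)
    TIME_PENALTY = 0.01    # per-step cost encourages speed
    RED_PENALTY = 1.0      # cost for touching a red obstacle

    def reward(self, state):
        # state: {"x": float, "touched_red": bool}
        progress = state["x"] / self.GOAL_X
        r = progress - self.TIME_PENALTY
        if state["touched_red"]:
            r -= self.RED_PENALTY
        return r

    def is_success(self, state):
        # Success = full course traversed, no red obstacle touched.
        return state["x"] >= self.GOAL_X and not state["touched_red"]
```

Because tasks are arbitrary code rather than points in a parameterized space, any environment and objective expressible in the simulator can, in principle, be generated this way.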

Methodology

The key components of OMNI-EPIC include:

  • Task Archive (Section 3.1): Maintains a growing collection of tasks the agent has successfully learned and tasks it has failed.
  • Task Generator (Section 3.2): Utilizes LLMs to create new, interesting tasks based on similarities to previously encountered tasks.
  • Environment Generator (Section 3.3): Converts task descriptions into executable code that defines the learning environment.
  • Model of Interestingness (Section 3.4): Uses LLMs to evaluate whether generated tasks are novel and worthwhile.
  • Training Agents with RL (Section 3.5): Applies reinforcement learning to train agents within these generated environments.
  • Success Detector (Section 3.6): Automatically assesses task completion using LLMs or Vision-LLMs (VLMs).
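The components above can be connected into a single outer loop. The sketch below is a minimal, assumption-laden rendering of that loop: each component is passed in as a plain callable, whereas the real system implements them with LLMs, VLMs, and an RL trainer.

```python
def omni_epic_loop(task_generator, env_generator, interestingness_model,
                   train_agent, success_detector, iterations=10):
    """Minimal sketch of the OMNI-EPIC outer loop (Sections 3.1-3.6).
    All callables here are stand-ins for the paper's LLM/RL components."""
    archive = []  # task archive: (description, succeeded) pairs (3.1)
    for _ in range(iterations):
        # Propose a new task conditioned on the archive (3.2).
        description = task_generator(archive)
        # Skip tasks the model deems uninteresting (3.4).
        if not interestingness_model(description, archive):
            continue
        # Turn the description into executable environment code (3.3).
        env = env_generator(description)
        # Train an RL agent in the generated environment (3.5).
        agent = train_agent(env)
        # Judge completion and archive the outcome either way (3.6, 3.1).
        succeeded = success_detector(env, agent)
        archive.append((description, succeeded))
    return archive
```

Archiving failures as well as successes matters: the task generator conditions on both, steering it toward tasks at the frontier of the agent's current abilities.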

Results

OMNI-EPIC's efficacy is demonstrated in two settings: a long run without agent training, to probe the rate and diversity of task generation, and a short run with RL training, to evaluate how generated tasks adapt to actual learning progress.

  1. Long Run without Training:
    • Generated an expansive set of tasks, showing a diverse spectrum of challenges ranging from straightforward navigation to complex object interactions.
    • Figure 2 in the paper showcases the evolution of tasks over 200 iterations, highlighting the algorithm's ability to create novel and diverse challenges.
  2. Short Run with Training:
    • Demonstrated the generation of increasingly complex yet solvable tasks tailored to the agent's capabilities.
    • The RL agents trained on these tasks showed progressive learning, illustrating the effectiveness of OMNI-EPIC in creating a developmental curriculum.
    • Figure 3 provides visual evidence of the tasks and the respective trained agents.

Implications and Future Developments

OMNI-EPIC significantly extends the capabilities of AI-generating algorithms (AI-GAs) by demonstrating a scalable approach to open-ended environment generation. Notably, it is a step toward Darwin Completeness: the ability to generate any possible learning environment.

Practical Implications:

  • The ability to generate diverse learning environments can enhance the training of generalist AI agents capable of adapting to a wide range of tasks.

Theoretical Implications:

  • This approach highlights the importance of task diversity and the dynamic adaptation of learning environments for the development of general intelligence in AI systems.

Future Developments:

  • Future work could explore the use of increasingly sophisticated foundation models to further expand the capabilities of OMNI-EPIC.
  • Enhancing the success detector with more advanced VLMs could improve the accuracy and reliability of task completion assessments.
  • Developing methods for integrating this framework with real-world applications, such as robotics and virtual simulations, could also be an exciting avenue for research.

Conclusion

OMNI-EPIC represents a significant advancement in the creation of open-ended, automatically generated tasks for reinforcement learning. By integrating models of human notions of interestingness and environments programmed in code, this framework offers a robust approach to developing self-improving AI systems. The contributions of this work pave the way for future research towards achieving general intelligence and understanding the fundamental nature of creativity and learning in artificial agents.

Authors: Maxence Faldor, Jenny Zhang, Antoine Cully, Jeff Clune