MarioGPT: Open-Ended Text2Level Generation through Large Language Models (2302.05981v3)

Published 12 Feb 2023 in cs.AI, cs.CL, and cs.LG

Abstract: Procedural Content Generation (PCG) is a technique to generate complex and diverse environments in an automated way. However, while generating content with PCG methods is often straightforward, generating meaningful content that reflects specific intentions and constraints remains challenging. Furthermore, many PCG algorithms lack the ability to generate content in an open-ended manner. Recently, large language models (LLMs) have been shown to be incredibly effective in many diverse domains. These trained LLMs can be fine-tuned, re-using information and accelerating training for new tasks. Here, we introduce MarioGPT, a fine-tuned GPT2 model trained to generate tile-based game levels, in our case Super Mario Bros levels. MarioGPT can not only generate diverse levels, but can be text-prompted for controllable level generation, addressing one of the key challenges of current PCG techniques. As far as we know, MarioGPT is the first text-to-level model, and combined with novelty search it enables the generation of diverse levels with varying play-style dynamics (i.e. player paths) and the open-ended discovery of an increasingly diverse range of content. Code available at https://github.com/shyamsn97/mario-gpt.

Authors (6)
  1. Shyam Sudhakaran (14 papers)
  2. Miguel González-Duque (11 papers)
  3. Claire Glanois (12 papers)
  4. Matthias Freiberger (9 papers)
  5. Elias Najarro (11 papers)
  6. Sebastian Risi (77 papers)
Citations (41)

Summary

An Analysis of MarioGPT: Open-Ended Text2Level Generation through LLMs

The paper introduces MarioGPT, a fine-tuned GPT-2 model that generates tile-based game levels, specifically Super Mario Bros levels. MarioGPT targets a central challenge in Procedural Content Generation (PCG): producing levels that are both diverse and aligned with constraints the user states in a natural language prompt. The work illustrates how combining LLMs with PCG can improve the controllability and diversity of generated content.

Core Contributions

MarioGPT builds on the GPT-2 architecture to produce levels that can be guided by text prompts. The authors fine-tuned GPT-2 on Mario levels from the Video Game Level Corpus (VGLC), encoding each level as a sequence of tokens. The resulting model generates levels that respond to natural language descriptions such as "many pipes, no enemies, many blocks," a novel application of LLMs to controllable PCG.
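To make the idea of "levels as token sequences" concrete, the following is a minimal sketch of one way a tile grid could be flattened into a string and paired with a text prompt. The column-by-column, bottom-to-top ordering, the tile characters, and the helper names (`encode_level`, `decode_level`, `build_prompt`) are assumptions made for illustration and are not taken from the MarioGPT code base.

```python
from typing import List


def encode_level(rows: List[str]) -> str:
    """Flatten a tile grid into one string, column by column.

    `rows` lists the level top row first. Within each column the tiles are
    emitted bottom to top (an assumed ordering for this sketch).
    """
    height = len(rows)
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same width")
    return "".join(
        rows[y][x] for x in range(width) for y in range(height - 1, -1, -1)
    )


def decode_level(tokens: str, height: int) -> List[str]:
    """Invert encode_level: rebuild the rows (top row first) from the string."""
    if len(tokens) % height != 0:
        raise ValueError("token count must be a multiple of the level height")
    width = len(tokens) // height
    columns = [tokens[i * height:(i + 1) * height] for i in range(width)]
    # Each column is stored bottom to top, so row y (counted from the top)
    # sits at index height - 1 - y inside the column.
    return ["".join(col[height - 1 - y] for col in columns) for y in range(height)]


def build_prompt(pipes: str, enemies: str, blocks: str) -> str:
    """Assemble a prompt in the style quoted in the summary."""
    return f"{pipes} pipes, {enemies} enemies, {blocks} blocks"


# Illustrative 3x2 level: '-' is air, 'E' an enemy, 'X' ground (assumed symbols).
level = ["--", "-E", "XX"]
tokens = encode_level(level)
prompt = build_prompt("many", "no", "many")
```

Because encoding and decoding are exact inverses, a generated token string can be turned back into a grid for rendering or playtesting.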

The paper also pairs MarioGPT with novelty search, an evolutionary technique that rewards candidates for differing from those already found rather than for optimizing a fixed objective. The combination explores a wider range of levels while aiming to keep them playable and varied in the paths a player can take.
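The novelty search loop can be sketched generically as follows. This is a simplified illustration, not the authors' implementation: `mutate` and `behavior` are hypothetical callables standing in for a level-editing step (for example, one driven by a generative model) and a behavior descriptor (for example, a vector summarizing a player path), and the k-nearest-neighbor novelty score and fixed threshold are common choices assumed here.

```python
import random
from typing import Callable, List, Optional, Sequence, TypeVar

Level = TypeVar("Level")
Behavior = Sequence[float]


def distance(a: Behavior, b: Behavior) -> float:
    """Euclidean distance between two behavior descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def novelty(candidate: Behavior, archive: List[Behavior], k: int = 3) -> float:
    """Mean distance from the candidate to its k nearest archive members."""
    if not archive:
        return float("inf")
    nearest = sorted(distance(candidate, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)


def novelty_search(
    seed: Level,
    mutate: Callable[[Level, random.Random], Level],  # hypothetical mutation step
    behavior: Callable[[Level], Behavior],            # hypothetical descriptor
    steps: int,
    threshold: float,
    k: int = 3,
    rng: Optional[random.Random] = None,
) -> List[Level]:
    """Grow an archive by keeping only sufficiently novel mutated levels."""
    rng = rng or random.Random(0)
    archive = [seed]
    for _ in range(steps):
        parent = rng.choice(archive)
        child = mutate(parent, rng)
        score = novelty(behavior(child), [behavior(lvl) for lvl in archive], k)
        if score > threshold:
            archive.append(child)
    return archive
```

The archive only grows when a child behaves differently enough from what is already stored, which is what pushes the search toward new kinds of levels rather than toward a single best one.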

Evaluation and Results

The efficacy of MarioGPT is evaluated through various metrics:

  • Tile Prediction Accuracy: The model outperforms LSTM baselines at tile prediction, reaching 93% accuracy on non-air tiles.
  • Playability: 88.4% of the levels generated by MarioGPT were playable according to an A* agent.
  • Prompt Responsiveness: The model demonstrated a strong ability to adhere to provided text prompts, with an accuracy exceeding 68% across various content features.
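Two of the metrics above can be approximated in a few lines. The sketch below computes tile accuracy while ignoring air tiles and checks whether observed feature counts match the quantifiers requested in a prompt. The air symbol, the quantifier set beyond "no" and "many", and the bucket thresholds are all assumptions chosen for illustration; the paper's own thresholds and evaluation code may differ.

```python
from typing import List

AIR = "-"  # assumed symbol for an empty (air) tile


def non_air_tile_accuracy(predicted: str, target: str, air: str = AIR) -> float:
    """Fraction of non-air target tiles that the prediction matches."""
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    pairs = [(p, t) for p, t in zip(predicted, target) if t != air]
    if not pairs:
        return 0.0
    return sum(p == t for p, t in pairs) / len(pairs)


def bucket(count: int, little_max: int = 1, some_max: int = 4) -> str:
    """Map a feature count to a quantifier (hypothetical thresholds)."""
    if count == 0:
        return "no"
    if count <= little_max:
        return "little"
    if count <= some_max:
        return "some"
    return "many"


def prompt_accuracy(requested: List[str], observed_counts: List[int]) -> float:
    """Share of prompt features whose observed bucket matches the request."""
    if len(requested) != len(observed_counts):
        raise ValueError("one observed count is needed per requested feature")
    hits = sum(bucket(c) == r for r, c in zip(requested, observed_counts))
    return hits / len(requested)
```

Excluding air tiles matters because most of a Mario level is empty space, so a model that predicts air everywhere would otherwise score deceptively well.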

The paper also explores the model's potential for diverse level generation in continuous, open-ended settings via novelty search. Diversity in this setting is expressed through player paths, so the search favors levels that play differently from one another while remaining playable.

Implications and Future Directions

MarioGPT has implications for both PCG research and practical game development. By enabling direct text-to-level generation, the model reduces the reliance on latent space exploration that PCG via machine learning (PCGML) has traditionally required, which streamlines the level design process.

The paper suggests several avenues for future research. One potential direction is the enhancement of prompt accuracy and further exploration of guided generation techniques. This could involve expanding training datasets or integrating more sophisticated search methods and human feedback mechanisms. Additionally, augmenting models with reinforcement learning could yield more adaptive content generation based on user engagement.

In conclusion, MarioGPT showcases the tangible benefits of integrating LLMs into PCG, setting a precedent for subsequent exploration in automated game content creation. While there remain areas for improvement, notably in generalization and higher-order diversity, the foundational work laid in this paper provides a robust framework for advancing procedural content generation methodologies.
