
Demo Paper: A Game Agents Battle Driven by Free-Form Text Commands Using Code-Generation LLM

Published 20 May 2024 in cs.HC | (2405.11835v1)

Abstract: This paper presents a demonstration of our monster battle game, in which game agents fight according to their players' language commands. The commands are translated into a knowledge representation called behavior branches by a code-generation LLM. This approach makes the commanding system easier to design and enables the game agents to comprehend a wider variety of continuous commands than rule-based methods. The results of the commanding and translation process are stored in a database on an Amazon Web Services server for more comprehensive validation. This implementation provides a sufficient evaluation of this ongoing work and offers insights the industry could use to develop its own interactive game agents.

Authors (2)

Summary

  • The paper presents a system that uses a code-generation LLM to convert natural language commands into executable game behavior branches.
  • The methodology integrates Unity, Photon PUN2, AWS, and the llama-v2-34b-code model to achieve real-time command processing and smooth multiplayer synchronization.
  • The system logs every command and corresponding behavior branch in DynamoDB, enabling detailed performance evaluations and iterative improvements.

Commanding Game Agents Using Natural Language with Code-Generation LLMs

Introduction

The gaming world has seen incredible advancements over the years, but one area that's always posed challenges is enabling more natural interactions between players and game agents. Imagine games where you can simply type what you want your character to do, and they execute your commands flawlessly. This paper presents a novel approach that pushes the boundaries of interactivity by using a code-generation LLM to translate free-form text commands into game actions.

System Overview

Components of the System

The system is made up of several key components that work together to provide a seamless interactive experience:

  • Unity (2022.3.15f1): This tool handles the game environment and graphical interface for players.
  • Photon PUN2 (2.45): This handles real-time network synchronization between the players' game instances, ensuring a smooth multiplayer experience.
  • AWS Server: This backend powerhouse manages player authentication (using Cognito), logs all commands and actions (storing them in DynamoDB), and interfaces with the LLM API to generate the behavior branches from player commands.
  • Fireworks AI API: The 'llama-v2-34b-code' model is employed here for its rapid response time, crucial for maintaining an engaging gameplay experience.
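The flow through these components can be sketched roughly as follows. This is an illustrative assumption of how the backend might be wired together, not the paper's actual implementation: the function names, prompt text, and in-memory log list (standing in for a DynamoDB write) are all hypothetical.

```python
# Hypothetical sketch of the server-side command pipeline described above.
# Function names, prompt wording, and the log structure are assumptions.
import json
import time


def translate_command(command: str, llm_call) -> dict:
    """Ask a code-generation LLM to emit a behavior branch as JSON."""
    prompt = (
        "Translate the player's command into a behavior branch.\n"
        "Return JSON whose nodes have a 'type' field "
        "('action' | 'condition' | 'control').\n"
        f"Command: {command}"
    )
    return json.loads(llm_call(prompt))


def handle_command(player_id: str, session_id: str, command: str,
                   llm_call, log_store: list) -> dict:
    """Translate a command and log both the input and the result."""
    branch = translate_command(command, llm_call)
    log_store.append({  # stands in for a DynamoDB put_item call
        "session_id": session_id,
        "timestamp": time.time(),
        "player_id": player_id,
        "command": command,
        "behavior_branch": branch,
    })
    return branch
```

In a real deployment the `llm_call` argument would wrap the Fireworks AI API and the log write would target DynamoDB, but the shape of the pipeline (authenticate, translate, log, return) matches the description above.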

Game Environment

In the game, players control agents in a 3D space. These game agents can perform actions like:

  • Thunderbolt: A ranged attack where the agent shoots an energy sphere at the opponent.
  • Iron Tail: A melee attack involving a powerful tail swing.
  • Tackle: A rushing movement hitting the opponent directly.

Players input commands through a straightforward text interface, and the game pauses momentarily to process these inputs. This ensures that each command is translated and executed accurately.

Command-Action Translation

The translation of text commands into game actions is the heart of this system. The process involves converting player inputs into what's called "behavior branches," which are tree structures chaining conditions and actions. These branches have:

  • Action Nodes: Specify the action to be executed by the game agent.
  • Condition Nodes: Direct the flow based on whether the specified conditions are met.
  • Control Nodes: Manage the execution flow of actions.

This method leverages the structural approach found in programming, allowing for more dynamic and varied behaviors compared to traditional hard-coded algorithms.
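A behavior branch of this kind can be sketched as a small tree type with the three node kinds listed above. The class names, fields, and sequential control node here are illustrative assumptions; the paper's actual node set may differ.

```python
# Minimal sketch of a behavior branch: a tree of action, condition, and
# control nodes, walked top-down. Names and fields are assumptions.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    pass


@dataclass
class ActionNode(Node):
    name: str                       # e.g. "Thunderbolt"


@dataclass
class ConditionNode(Node):
    test: Callable[[dict], bool]    # evaluated against the game state
    if_true: Node
    if_false: Node


@dataclass
class SequenceNode(Node):           # a simple control node: children in order
    children: List[Node] = field(default_factory=list)


def run(node: Node, state: dict, out: List[str]) -> None:
    """Walk the branch, collecting the actions the agent should execute."""
    if isinstance(node, ActionNode):
        out.append(node.name)
    elif isinstance(node, ConditionNode):
        run(node.if_true if node.test(state) else node.if_false, state, out)
    elif isinstance(node, SequenceNode):
        for child in node.children:
            run(child, state, out)
```

A command like "attack from range when the opponent is far, otherwise tackle" could then translate to a `ConditionNode` over the opponent's distance, with `ActionNode("Thunderbolt")` and `ActionNode("Tackle")` as its two sub-branches.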

Logs and Evaluation

To assess the system's performance, all commands and their subsequent translations are logged in DynamoDB. The logs capture pertinent details like:

  • Session ID
  • Timestamp
  • Player's ID
  • Original Command
  • Translated Behavior Branch

These logs are invaluable for both debugging and iterative improvement, providing detailed insights into how commands are translated and executed.
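As one illustration of how such logs could feed an evaluation, the sketch below groups records by session and counts failed translations. The field names mirror the list above, but the exact DynamoDB schema and failure convention (a missing behavior branch) are assumptions.

```python
# Sketch of log-based evaluation: per-session command counts and
# translation-failure counts. Record fields mirror the logged details
# listed above; the failure convention (branch is None) is an assumption.
from collections import defaultdict


def summarize_logs(records: list) -> dict:
    """Aggregate command and failure counts per session."""
    summary = defaultdict(lambda: {"commands": 0, "failures": 0})
    for record in records:
        stats = summary[record["session_id"]]
        stats["commands"] += 1
        if record.get("behavior_branch") is None:  # translation failed
            stats["failures"] += 1
    return dict(summary)
```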

Demonstration

For practical validation, a live demonstration involves two players engaging in a battle using this system. The rules and commands are explained, and players type commands to control their agents, aiming to defeat their opponent's game agent. This interactive demo helps showcase the flexibility and responsiveness of the system in real-world scenarios.

Conclusions and Future Work

In conclusion, this paper demonstrates the feasibility of using a code-generation LLM to revolutionize player-agent interaction in games. By translating free-form text commands into sophisticated actions, this system offers a glimpse into the future of gaming where natural language could become the primary mode of interaction.

Future work will involve more comprehensive quantitative and qualitative analyses to fine-tune the system further. Enhancements may include reducing latency, expanding the range of possible commands, and improving the natural language understanding capabilities to make this technology even more practical for wider industry adoption.
