
Large Language Models are Pretty Good Zero-Shot Video Game Bug Detectors (2210.02506v1)

Published 5 Oct 2022 in cs.CL and cs.SE

Abstract: Video game testing requires game-specific knowledge as well as common sense reasoning about the events in the game. While AI-driven agents can satisfy the first requirement, it is not yet possible to meet the second requirement automatically. Therefore, video game testing often still relies on manual testing, and human testers are required to play the game thoroughly to detect bugs. As a result, it is challenging to fully automate game testing. In this study, we explore the possibility of leveraging the zero-shot capabilities of large language models (LLMs) for video game bug detection. By formulating the bug detection problem as a question-answering task, we show that LLMs can identify which event is buggy in a sequence of textual descriptions of events from a game. To this end, we introduce the GameBugDescriptions benchmark dataset, which consists of 167 buggy gameplay videos and a total of 334 question-answer pairs across 8 games. We extensively evaluate the performance of six models across the OPT and InstructGPT LLM families on our benchmark dataset. Our results are promising for employing LLMs to detect video game bugs. With the proper prompting technique, we could achieve an accuracy of 70.66%, and on some video games, up to 78.94%. Our code, evaluation data, and the benchmark can be found at https://asgaardlab.github.io/LLMxBugs
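To make the question-answering formulation in the abstract concrete, the following is a minimal sketch of how a sequence of textual event descriptions might be turned into a zero-shot prompt and scored for accuracy. It is not the authors' prompt format or evaluation code: the names `BugQuestion`, `build_prompt`, and `accuracy`, the prompt layout, and the exact-match scoring rule are all assumptions made for illustration, and `predict` stands in for any function that sends a prompt to a language model and returns its answer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BugQuestion:
    """One hypothetical benchmark item: event descriptions plus a QA pair."""
    events: List[str]   # textual descriptions of in-game events, in order
    question: str       # e.g. "Which event is buggy?"
    answer: str         # reference answer used for scoring


def build_prompt(item: BugQuestion) -> str:
    """Number the events and append the question (assumed layout)."""
    lines = [f"{i}. {event}" for i, event in enumerate(item.events, start=1)]
    return "\n".join(lines) + f"\nQuestion: {item.question}\nAnswer:"


def accuracy(items: List[BugQuestion], predict: Callable[[str], str]) -> float:
    """Fraction of items whose prediction matches the reference answer.

    Matching is case-insensitive exact match after trimming whitespace,
    which is an assumption of this sketch, not the paper's metric.
    """
    correct = sum(
        predict(build_prompt(item)).strip().lower() == item.answer.strip().lower()
        for item in items
    )
    return correct / len(items)
```

In use, `predict` would wrap a call to whichever model is being evaluated; swapping in a stub function makes it possible to check the prompt construction and scoring without any model access.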

Authors (5)
  1. Mohammad Reza Taesiri (17 papers)
  2. Finlay Macklon (5 papers)
  3. Yihe Wang (12 papers)
  4. Hengshuo Shen (1 paper)
  5. Cor-Paul Bezemer (24 papers)
Citations (12)
