
From Pixels to Titles: Video Game Identification by Screenshots using Convolutional Neural Networks (2311.15963v3)

Published 27 Nov 2023 in cs.CV and cs.NE

Abstract: This paper investigates video game identification through single screenshots, utilizing ten convolutional neural network (CNN) architectures (VGG16, ResNet50, ResNet152, MobileNet, DenseNet169, DenseNet201, EfficientNetB0, EfficientNetB2, EfficientNetB3, and EfficientNetV2S) and three transformer architectures (ViT-B16, ViT-L32, and SwinT) across 22 home console systems, spanning from Atari 2600 to PlayStation 5, totalling 8,796 games and 170,881 screenshots. Except for VGG16, all CNNs outperformed the transformers in this task. Using ImageNet pre-trained weights as initial weights, EfficientNetV2S achieves the highest average accuracy (77.44%) and the highest accuracy in 16 of the 22 systems. DenseNet201 is the best in four systems and EfficientNetB3 is the best in the remaining two systems. Employing alternative initial weights fine-tuned in an arcade screenshots dataset boosts accuracy for EfficientNet architectures, with the EfficientNetV2S reaching a peak accuracy of 77.63% and demonstrating reduced convergence epochs from 26.9 to 24.5 on average. Overall, the combination of optimal architecture and weights attains 78.79% accuracy, primarily led by EfficientNetV2S in 15 systems. These findings underscore the efficacy of CNNs in video game identification through screenshots.

Summary

  • The paper identifies video game titles from single screenshots, comparing ten CNN architectures and three transformer architectures across 22 home console systems (8,796 games, 170,881 screenshots).
  • With ImageNet pre-trained weights, EfficientNetV2S achieves the highest average accuracy (77.44%) and is the best model on 16 of the 22 systems; all CNNs except VGG16 outperform the transformers.
  • Initial weights fine-tuned on an arcade-screenshot dataset further improve the EfficientNet models and reduce convergence epochs, and the per-system best combination of architecture and weights reaches 78.79% accuracy, highlighting the practical benefits of transfer learning.

Introduction

The field of automated video game identification has gained traction due to its technical challenges and practical applications across various sectors within the gaming industry. It allows platforms to generate metadata from user-uploaded screenshots, improves cataloging efficiency, and enhances viewers’ experience on streaming platforms. Traditional game classification methods have mostly focused on genre, but this research shifts the focus to the identification of video game titles from single screenshots using Convolutional Neural Networks (CNNs).

Methodology

The research trains thirteen architectures on 170,881 screenshots from 8,796 games across 22 home console systems: ten CNNs (VGG16, ResNet50, ResNet152, MobileNet, DenseNet169, DenseNet201, EfficientNetB0, EfficientNetB2, EfficientNetB3, and EfficientNetV2S) and three transformers (ViT-B16, ViT-L32, and SwinT). The hypothesis is that these networks can extract image features that identify a game title from a single screenshot, without additional inputs. The dataset, sourced from the Moby Games database, spans consoles from the Atari 2600 to the PlayStation 5. Training relies on transfer learning: all models start from ImageNet pre-trained weights, and for the EfficientNet architectures the results are also compared against initial weights fine-tuned on a separate dataset of arcade game screenshots.
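
To make the transfer-learning setup concrete, the sketch below fine-tunes an ImageNet-initialized EfficientNetV2S classifier on a folder of labeled screenshots using Keras. The directory layout, input resolution, classification head, and training schedule are assumptions for illustration rather than the paper's exact configuration, and the paper reports results per console system rather than for a single pooled classifier.

```python
# Minimal transfer-learning sketch (assumed setup, not the paper's exact pipeline):
# screenshots/<game_title>/<image>.png, one class per game title.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 500          # hypothetical number of game titles for one console system
IMG_SIZE = (224, 224)      # assumed input resolution

# EfficientNetV2S backbone initialized with ImageNet weights (transfer learning).
backbone = tf.keras.applications.EfficientNetV2S(
    include_top=False, weights="imagenet",
    input_shape=IMG_SIZE + (3,), pooling="avg",
)

model = models.Sequential([
    backbone,
    layers.Dropout(0.2),
    layers.Dense(NUM_CLASSES, activation="softmax"),  # one output per game title
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Screenshots organized in per-title folders; labels are inferred from folder names.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "screenshots/", image_size=IMG_SIZE, batch_size=32
)
model.fit(train_ds, epochs=25)  # the paper reports convergence around 25-27 epochs on average
```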

Results

Every CNN except VGG16 outperforms the transformer architectures. With ImageNet initial weights, EfficientNetV2S achieves the highest average accuracy (77.44%) and the highest accuracy in 16 of the 22 systems; DenseNet201 is best in four systems and EfficientNetB3 in the remaining two. Initializing the EfficientNet models from weights fine-tuned on arcade screenshots raises their accuracy and shortens training, with EfficientNetV2S peaking at 77.63% while its average convergence drops from 26.9 to 24.5 epochs. Selecting the best architecture and initial weights per system yields an overall accuracy of 78.79%, led by EfficientNetV2S in 15 of the 22 systems.
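
One plausible reading of the "optimal combination" figure is a per-system selection: for each console system, pick the (architecture, initial weights) pair with the highest accuracy, then aggregate across systems. The sketch below illustrates that aggregation with invented placeholder numbers; only the selection logic is implied by the paper.

```python
# Pick the best (architecture, initial-weights) pair per console system and
# average the resulting accuracies. The accuracy values are placeholders.
accuracy = {
    ("EfficientNetV2S", "imagenet"): {"PlayStation 5": 0.71, "Atari 2600": 0.83},
    ("EfficientNetV2S", "arcade"):   {"PlayStation 5": 0.72, "Atari 2600": 0.82},
    ("DenseNet201",     "imagenet"): {"PlayStation 5": 0.69, "Atari 2600": 0.84},
}

best_per_system = {}
for combo, per_system in accuracy.items():
    for system, acc in per_system.items():
        if system not in best_per_system or acc > best_per_system[system][1]:
            best_per_system[system] = (combo, acc)

overall = sum(acc for _, acc in best_per_system.values()) / len(best_per_system)
print(best_per_system)                    # best combination chosen independently per system
print(f"overall accuracy: {overall:.2%}")
```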

Conclusions

This paper affirms the efficacy of CNNs for video game identification from screenshots. Since larger networks provide better accuracy, future research could investigate even larger CNN architectures and ensembles. Transferring weights from a related task (arcade screenshots) improves accuracy and speeds convergence, highlighting the potential of CNNs for other screenshot-based applications in the gaming sector, such as automated metadata generation and cataloging for game libraries and streaming platforms.
