
Not All the Same: Understanding and Informing Similarity Estimation in Tile-Based Video Games (2402.18728v1)

Published 28 Feb 2024 in cs.HC

Abstract: Similarity estimation is essential for many game AI applications, from the procedural generation of distinct assets to automated exploration with game-playing agents. While similarity metrics often substitute for human evaluation, their alignment with our judgement is unclear. Consequently, the result of their application can fail to meet human expectations, leading, for example, to unappreciated content or unbelievable agent behaviour. We address this gap through a multi-factorial study of two tile-based games in two representations, where participants (N=456) judged the similarity of level triplets. Based on this data, we construct domain-specific perceptual spaces, encoding similarity-relevant attributes. We compare 12 metrics to these spaces and evaluate their approximation quality through several quantitative lenses. Moreover, we conduct a qualitative labelling study to identify the features underlying human similarity judgement in this popular genre. Our findings inform the selection of existing metrics and highlight requirements for the design of new similarity metrics benefiting game development and research.
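
To make the idea of checking a metric against triplet judgements concrete, here is a minimal sketch. It is not the paper's method. The triplet format (an anchor level, two candidates, and the index of the candidate a participant judged closer to the anchor), the string-grid level encoding, and the per-tile Hamming distance are all assumptions chosen for illustration; the abstract does not say which 12 metrics were compared or how agreement was scored.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical encoding: a level is a grid of tile symbols, one string per row.
Level = Sequence[str]
# Hypothetical judgement format: (anchor, a, b, choice), where the first three
# are indices into a level list and choice is 0 if the participant judged
# level `a` closer to the anchor, or 1 if they judged level `b` closer.
Triplet = Tuple[int, int, int, int]


def hamming_distance(x: Level, y: Level) -> float:
    """Fraction of tile positions that differ (assumes equal-sized grids)."""
    total = 0
    differing = 0
    for row_x, row_y in zip(x, y):
        for tile_x, tile_y in zip(row_x, row_y):
            total += 1
            differing += tile_x != tile_y
    return differing / total


def triplet_agreement(
    levels: Sequence[Level],
    triplets: Sequence[Triplet],
    metric: Callable[[Level, Level], float],
) -> float:
    """Share of human triplet judgements that the metric reproduces."""
    agreeing = 0
    for anchor, a, b, choice in triplets:
        dist_a = metric(levels[anchor], levels[a])
        dist_b = metric(levels[anchor], levels[b])
        predicted = 0 if dist_a <= dist_b else 1  # ties go to candidate a
        agreeing += predicted == choice
    return agreeing / len(triplets)


if __name__ == "__main__":
    # Toy data, invented for this example only.
    toy_levels = [["##", ".."], ["##", ".#"], ["..", "##"]]
    toy_judgements = [(0, 1, 2, 0), (0, 1, 2, 1)]
    print(triplet_agreement(toy_levels, toy_judgements, hamming_distance))
```

Any other distance function with the same signature can be passed as `metric`, so several candidate metrics can be scored against the same set of judgements.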

