StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments (2401.04290v1)
Abstract: Spatial reasoning tasks in multi-agent environments such as event prediction, agent type identification, or missing data imputation are important for multiple applications (e.g., autonomous surveillance over sensor networks and subtasks for reinforcement learning (RL)). StarCraft II game replays encode intelligent (and adversarial) multi-agent behavior and could provide a testbed for these tasks; however, extracting simple and standardized representations for prototyping these tasks is laborious and hinders reproducibility. In contrast, MNIST and CIFAR10, despite their extreme simplicity, have enabled rapid prototyping and reproducibility of ML methods. Following the simplicity of these datasets, we construct a benchmark spatial reasoning dataset based on StarCraft II replays that exhibit complex multi-agent behaviors, while still being as easy to use as MNIST and CIFAR10. Specifically, we carefully summarize a window of 255 consecutive game states to create 3.6 million summary images from 60,000 replays, including all relevant metadata such as game outcome and player races. We develop three formats of decreasing complexity: Hyperspectral images that include one channel for every unit type (similar to multispectral geospatial images), RGB images that mimic CIFAR10, and grayscale images that mimic MNIST. We show how this dataset can be used for prototyping spatial reasoning methods. All datasets, code for extraction, and code for dataset loading can be found at https://starcraftdata.davidinouye.com
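As a rough illustration of how a window of consecutive game states might be collapsed into a single summary image, the sketch below builds a multi-channel ("hyperspectral") array with one channel per unit type and then reduces it to a single grayscale channel. The encoding is an assumption made for illustration only: each pixel stores the 1-based index of the last state in which a unit of that type occupied the cell. The abstract does not specify the encoding, and the function names `summarize_window` and `to_grayscale`, the `(unit_type, row, col)` tuple format, and the grid size are all hypothetical; this is not the authors' extraction code.

```python
import numpy as np

WINDOW = 255  # consecutive game states per summary image (from the abstract)


def summarize_window(frames, num_unit_types, height, width):
    """Collapse up to WINDOW game states into one multi-channel image.

    frames: list of game states; each state is a list of
            (unit_type, row, col) tuples (hypothetical input format).
    Returns a uint8 array of shape (num_unit_types, height, width).
    Assumed encoding: a pixel holds 1 + the index of the last state in
    which a unit of that type occupied the cell, or 0 if none ever did.
    """
    if len(frames) > WINDOW:
        raise ValueError("window is longer than 255 states")
    image = np.zeros((num_unit_types, height, width), dtype=np.uint8)
    for t, units in enumerate(frames):
        for unit_type, row, col in units:
            # Later states overwrite earlier ones; t + 1 is at most 255,
            # so the value always fits in uint8.
            image[unit_type, row, col] = t + 1
    return image


def to_grayscale(hyper):
    """Reduce over unit types, keeping the most recent index per cell."""
    return hyper.max(axis=0)


# Tiny example: three game states, two unit types, a 4x4 grid.
frames = [
    [(0, 1, 1)],
    [(1, 2, 3)],
    [(0, 1, 1), (1, 0, 0)],
]
hyper = summarize_window(frames, num_unit_types=2, height=4, width=4)
gray = to_grayscale(hyper)
```

Capping the window at 255 states lets every index fit in one 8-bit channel, which is one plausible reason for that window length, though the abstract does not state the motivation.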