
MSC: A Dataset for Macro-Management in StarCraft II (1710.03131v3)

Published 9 Oct 2017 in cs.AI

Abstract: Macro-management is an important problem in StarCraft, which has been studied for a long time. Various datasets together with assorted methods have been proposed in the last few years. But these datasets have some defects for boosting the academic and industrial research: 1) There're neither standard preprocessing, parsing and feature extraction procedures nor predefined training, validation and test set in some datasets. 2) Some datasets are only specified for certain tasks in macro-management. 3) Some datasets are either too small or don't have enough labeled data for modern machine learning algorithms such as deep neural networks. So most previous methods are trained with various features, evaluated on different test sets from the same or different datasets, making it difficult to be compared directly. To boost the research of macro-management in StarCraft, we release a new dataset MSC based on the platform SC2LE. MSC consists of well-designed feature vectors, pre-defined high-level actions and final result of each match. We also split MSC into training, validation and test set for the convenience of evaluation and comparison. Besides the dataset, we propose a baseline model and present initial baseline results for global state evaluation and build order prediction, which are two of the key tasks in macro-management. Various downstream tasks and analyses of the dataset are also described for the sake of research on macro-management in StarCraft II. Homepage: https://github.com/wuhuikai/MSC.

Citations (15)

Summary

  • The paper presents the MSC dataset, built from 36,619 high-quality StarCraft II replays, to support macro-management research.
  • The paper describes a PySC2-based feature extraction pipeline that produces both global feature vectors and spatial feature tensors for machine learning applications.
  • The paper reports baseline results of 61.1% accuracy on global state evaluation and a mean accuracy of 74.1% on build order prediction, giving later work a reference point.

Insights into the MSC Dataset for Macro-Management in StarCraft II

This paper introduces the MSC dataset, a resource for research on macro-management tasks in StarCraft II. StarCraft II is challenging for AI research because of its large state and action spaces, its partial observability, and its need for both micro- and macro-management. MSC is designed specifically for macro-management, the high-level strategic side of the game, which includes tasks such as build order prediction and global state evaluation.

Contributions and Methodology

  1. Dataset Construction: The MSC dataset is built on SC2LE and contains 36,619 high-quality replays. The replays are filtered during preprocessing so that only professional-standard matches remain. A standard feature extraction pipeline, implemented with PySC2, generates both global feature vectors and spatial feature tensors from each replay, giving structured data suitable for machine learning.
  2. Dataset Characteristics: MSC includes predefined actions, feature-action pairs, and a fixed division into training, validation, and test sets, so that different methods can be evaluated on the same data. The dataset also records observations of both the player's own units and enemy units, reflecting the partial observability inherent in StarCraft II.
  3. Baseline Models: The authors propose baseline models for two tasks: global state evaluation and build order prediction. Global state evaluation is the task of predicting the probability of winning the game from the current observations; the baseline uses RNN architectures to model the time-series nature of gameplay. Build order prediction is the task of predicting the next high-level action, and the authors measure it with top-1 accuracy.
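The global state evaluation task in item 3 can be illustrated with a very small recurrent model. This is only a sketch: it uses a plain Elman-style recurrent cell with untrained random weights, chosen here for brevity, and the class name, feature sizes, and weight initialization are all assumptions rather than details of the authors' baseline.

```python
import math
import random
from typing import List, Sequence


def matvec(matrix: Sequence[Sequence[float]], vector: Sequence[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyRecurrentEvaluator:
    """Hypothetical Elman-style RNN that maps a sequence of global
    feature vectors to one win probability per time step."""

    def __init__(self, n_features: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)

        def rand_matrix(rows: int, cols: int) -> List[List[float]]:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Untrained weights: a real model would learn these from the training set.
        self.w_in = rand_matrix(n_hidden, n_features)
        self.w_rec = rand_matrix(n_hidden, n_hidden)
        self.bias = [0.0] * n_hidden
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b_out = 0.0
        self.n_hidden = n_hidden

    def win_probabilities(self, frames: Sequence[Sequence[float]]) -> List[float]:
        hidden = [0.0] * self.n_hidden
        probs = []
        for features in frames:
            # h_t = tanh(W_in x_t + W_rec h_{t-1} + b)
            pre = [
                a + r + b
                for a, r, b in zip(
                    matvec(self.w_in, features), matvec(self.w_rec, hidden), self.bias
                )
            ]
            hidden = [math.tanh(p) for p in pre]
            # p_t = sigmoid(w_out . h_t + b_out)
            logit = sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out
            probs.append(1.0 / (1.0 + math.exp(-logit)))
        return probs


# Two time steps of a made-up 3-dimensional global feature vector.
model = TinyRecurrentEvaluator(n_features=3, n_hidden=4)
print(model.win_probabilities([[1.0, 0.0, 0.5], [0.2, 0.3, 0.1]]))
```

The hidden state carries information from earlier frames forward, which is what lets the prediction at each step depend on the history of the match rather than on a single observation.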

Experimentation and Results

The paper reports initial baseline results: the global state evaluation models reach up to 61.1% accuracy, and the build order prediction models reach a mean accuracy of 74.1%. These numbers show how difficult macro-strategic decision-making in StarCraft II remains, and they give later work a baseline to improve on.
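For readers unfamiliar with the metrics behind these figures, the following sketch shows how such accuracies could be computed. The function names, the 0.5 decision threshold, and the toy numbers are assumptions for illustration. The text above does not say what the "mean" is taken over, so the unweighted average over named groups is also an assumption.

```python
from typing import Dict, Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of samples whose highest-scoring action equals the true action."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


def win_accuracy(
    win_probs: Sequence[float], outcomes: Sequence[bool], threshold: float = 0.5
) -> float:
    """Fraction of frames where the thresholded win probability matches the result."""
    correct = sum((p >= threshold) == won for p, won in zip(win_probs, outcomes))
    return correct / len(outcomes)


def mean_accuracy(per_group: Dict[str, float]) -> float:
    """Unweighted mean of per-group accuracies (the grouping is an assumption)."""
    return sum(per_group.values()) / len(per_group)


# Toy data, not results from the paper.
action_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
true_actions = [1, 0, 1]
print(top1_accuracy(action_scores, true_actions))

print(win_accuracy([0.8, 0.4, 0.6, 0.3], [True, False, False, False]))
print(mean_accuracy({"group_a": 0.5, "group_b": 1.0}))
```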

Implications for AI Research

The MSC dataset provides a common basis for evaluating AI algorithms in RTS games, which makes results easier to compare and benchmark. It addresses problems with earlier datasets, including non-standardized preprocessing and insufficient size, and so offers a firmer foundation for developing more capable models. The methods involved may also apply to other domains that require sequential decision-making under uncertainty.
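A predefined split helps comparison because every method is scored on the same replays. One way to make such a split reproducible is to derive it from a hash of the replay identifier, as in the sketch below. This is not the procedure used to build MSC; the function names, the replay identifiers, and the 70/10/20 ratio are illustrative assumptions.

```python
import hashlib
from typing import Dict, Iterable, List


def assign_split(replay_id: str, train: float = 0.7, val: float = 0.1) -> str:
    """Map a replay to a split by hashing its id, so the assignment is
    deterministic and all frames of one replay land in the same split."""
    digest = hashlib.sha256(replay_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # roughly uniform in [0, 1)
    if bucket < train:
        return "train"
    if bucket < train + val:
        return "val"
    return "test"


def split_replays(replay_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group replay ids by their assigned split."""
    groups: Dict[str, List[str]] = {"train": [], "val": [], "test": []}
    for replay_id in replay_ids:
        groups[assign_split(replay_id)].append(replay_id)
    return groups


# Hypothetical replay ids; real ids would come from the replay files.
groups = split_replays(f"replay-{i}" for i in range(2000))
print({name: len(ids) for name, ids in groups.items()})
```

Splitting by replay rather than by frame avoids leaking frames from the same match into both training and test data.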

The implications extend into AI planning, reinforcement learning, and uncertainty modeling. The dataset's comprehensive design allows exploration into generative models and inverse reinforcement learning, given the sparse nature of game rewards and the partial observability of game states. Additionally, the MSC dataset facilitates the integration and assessment of tree search techniques within the RTS games framework.

Future Directions

Work with the MSC dataset may lead to advances in hierarchical learning frameworks that separate micro-management from macro-management and so improve the strategic capabilities of AI agents. Researchers are encouraged to use MSC to evaluate new algorithms and to contribute additional benchmarks, building a shared picture of how AI methods perform in real-time strategy games.

In conclusion, the paper provides a substantial contribution to the field of AI by offering a comprehensive dataset for macro-management in StarCraft II, along with foundational baselines for key tasks. This work not only facilitates direct comparison between algorithms but also fuels future exploration in strategic AI domains.
