Attribution of reported ML gains to model design versus data-generation choices
Determine whether the performance improvements reported for machine-learning methods in steady-state transmission grid analysis are attributable to the models’ architectural and design choices or to the assumptions made when generating the datasets (load profiles, generator dispatch, and network topology).
References
Limited incentives to generate diverse or complex scenarios also make it unclear whether reported gains arise from model design or data choices.
Source: gridfm-datakit-v1: A Python Library for Scalable and Realistic Power Flow and Optimal Power Flow Data Generation (2512.14658, Puech et al., 16 Dec 2025), Section 2 “Motivations”, bullet “Lack of reproducibility and benchmarking”.