Computational trade-offs in synthetic data generation and use
Determine optimal strategies for balancing synthetic data quality and quantity against computational cost to maximize downstream statistical utility, including guidance on how many synthetic samples to generate and how to weight them relative to real data.
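To make the weighting question concrete, here is a minimal toy sketch; it is not taken from the cited paper. It assumes the simplest possible setting: estimating a mean by combining an unbiased real-data sample mean with a possibly biased synthetic-data sample mean, with the two means independent. The function names and parameters (`synthetic_weight`, `combined_mse`, `bias`, and so on) are hypothetical and chosen only for illustration.

```python
def synthetic_weight(sigma2_real, n_real, sigma2_synth, m_synth, bias):
    """Weight w on the synthetic mean that minimizes the MSE of
    (1 - w) * real_mean + w * synth_mean.

    Toy assumptions (not from the source): the real mean is unbiased with
    variance sigma2_real / n_real, the synthetic mean has variance
    sigma2_synth / m_synth and a fixed bias, and the two are independent.
    """
    var_real = sigma2_real / n_real
    mse_synth = sigma2_synth / m_synth + bias ** 2
    return var_real / (var_real + mse_synth)


def combined_mse(w, sigma2_real, n_real, sigma2_synth, m_synth, bias):
    """MSE of the weighted estimator under the same toy assumptions."""
    var_real = sigma2_real / n_real
    mse_synth = sigma2_synth / m_synth + bias ** 2
    return (1 - w) ** 2 * var_real + w ** 2 * mse_synth


if __name__ == "__main__":
    # Fixed real sample; grow the synthetic sample and watch the gain flatten.
    for m in (100, 1_000, 10_000, 100_000):
        w = synthetic_weight(1.0, 100, 1.0, m, bias=0.1)
        mse = combined_mse(w, 1.0, 100, 1.0, m, bias=0.1)
        print(f"m={m:>7}  w={w:.3f}  mse={mse:.5f}")
```

In this toy model the optimal weight converges to `var_real / (var_real + bias**2)` as `m_synth` grows, so once the synthetic variance term is small relative to the squared bias, generating further samples adds compute cost with almost no gain in accuracy. Realistic settings, where the bias is unknown and generation cost depends on model quality, are exactly where the problem remains open.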
References
Understanding how to optimally balance data quality, quantity, and computational cost largely remains an open and practically relevant problem.
— Harnessing Synthetic Data from Generative AI for Statistical Inference
(2603.05396 - Abdel-Azim et al., 5 Mar 2026) in Section 4, Computational Considerations