The paper proposes a new method for generating mixed-type tabular data and imputing its missing values. It builds on score-based diffusion and conditional flow matching, with XGBoost, a gradient-boosted tree method, as the function approximator.
The method is reported to generate realistic synthetic data and diverse, plausible imputations, often outperforming deep-learning generative methods, and it can be trained on CPUs alone, with no GPU required.