Emma

Summary:

  • The paper proposes a new method for generating mixed-type tabular data and imputing its missing values. The method combines score-based diffusion and conditional flow matching with XGBoost, a gradient-boosted tree method.
  • The method has been shown to generate realistic synthetic data and diverse, plausible imputations, often outperforming deep-learning generative methods, and it can be trained on CPUs alone, without a GPU.
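
The summary does not spell out how training data for conditional flow matching is built, so the following is only a minimal sketch under the standard convention: noise x0 and a data row x1 are interpolated as x_t = (1 - t) * x0 + t * x1, and the regression target is the velocity x1 - x0. The function name `make_flow_matching_pairs` and the toy rows are hypothetical and not taken from the paper. Fitting the regressor itself (for example with `xgboost.XGBRegressor`) is omitted so the snippet needs only the standard library.

```python
import random


def make_flow_matching_pairs(data, t, rng):
    """Build (input, target) regression pairs for one time level t in [0, 1].

    Assumed convention (not confirmed by the summary): each data row x1 is
    paired with standard Gaussian noise x0, the regressor input is
    x_t = (1 - t) * x0 + t * x1, and the target is the velocity x1 - x0.
    """
    inputs, targets = [], []
    for x1 in data:
        # One independent noise draw per feature of the row.
        x0 = [rng.gauss(0.0, 1.0) for _ in x1]
        inputs.append([(1.0 - t) * a + t * b for a, b in zip(x0, x1)])
        targets.append([b - a for a, b in zip(x0, x1)])
    return inputs, targets


# Toy numeric rows, invented purely for illustration.
rng = random.Random(0)
data = [[0.5, -1.0], [1.5, 2.0], [-0.3, 0.7]]
inputs, targets = make_flow_matching_pairs(data, t=0.25, rng=rng)
```

A tree-based regressor would then be fit to map `inputs` to `targets` at each time level. Because the target is a plain regression label, no gradient-based network training is involved, which is consistent with the CPU-only training noted above.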

Tags:

Research