
An evaluation framework for synthetic data generation models (2404.08866v1)

Published 13 Apr 2024 in cs.LG and cs.AI

Abstract: Synthetic data has gained popularity as a cost-efficient strategy for data augmentation to improve machine learning model performance, as well as for addressing concerns related to sensitive data privacy. Ensuring the quality of generated synthetic data, in terms of accurate representation of the real data, is therefore of primary importance. In this work, we present a new framework for evaluating the ability of synthetic data generation models to produce high-quality synthetic data. The proposed approach provides strong statistical and theoretical information about the evaluation framework and the ranking of the compared models. Two use case scenarios demonstrate the applicability of the proposed framework for evaluating the ability of synthetic data generation models to generate high-quality data. The implementation code can be found at https://github.com/novelcore/synthetic_data_evaluation_framework.

Authors (4)
  1. Ioannis E. Livieris (3 papers)
  2. Nikos Alimpertis (1 paper)
  3. George Domalis (3 papers)
  4. Dimitris Tsakalidis (4 papers)
Citations (4)