
Distributed Conditional GAN (discGAN) For Synthetic Healthcare Data Generation (2304.04290v1)

Published 9 Apr 2023 in cs.LG and cs.AI

Abstract: In this paper, we propose distributed Generative Adversarial Networks (discGANs) to generate synthetic tabular data specific to the healthcare domain. While using GANs to generate images has been well studied, little to no attention has been given to the generation of tabular data. Modeling distributions of discrete and continuous tabular data is a non-trivial task with high utility. We applied discGAN to model non-Gaussian, multi-modal healthcare data. We generated 249,000 synthetic records from the original 2,027-record eICU dataset. We evaluated the performance of the model using machine learning efficacy, the Kolmogorov-Smirnov (KS) test for continuous variables, and the chi-squared test for discrete variables. Our results show that discGAN was able to generate data with distributions similar to those of the real data.
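
The abstract names two distribution checks: the KS test for continuous variables and the chi-squared test for discrete variables. The following is a minimal sketch of the statistics behind those two tests, using only the Python standard library. It is not the authors' evaluation code: the function names `ks_statistic` and `chi_squared_statistic` are hypothetical, the toy columns are invented for illustration, and the sketch computes test statistics only (no p-values).

```python
from bisect import bisect_right
from collections import Counter


def ks_statistic(real, synthetic):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    n, m = len(real_sorted), len(synth_sorted)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(real_sorted) | set(synth_sorted):
        cdf_real = bisect_right(real_sorted, x) / n
        cdf_synth = bisect_right(synth_sorted, x) / m
        gap = max(gap, abs(cdf_real - cdf_synth))
    return gap


def chi_squared_statistic(real, synthetic):
    """Pearson chi-squared statistic for a 2 x K table of category counts."""
    real_counts, synth_counts = Counter(real), Counter(synthetic)
    n, m = len(real), len(synthetic)
    total = n + m
    stat = 0.0
    for category in set(real_counts) | set(synth_counts):
        column_total = real_counts[category] + synth_counts[category]
        for observed, row_total in (
            (real_counts[category], n),
            (synth_counts[category], m),
        ):
            # Expected count if both samples shared one category distribution.
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Toy columns (invented for illustration, not taken from eICU).
real_age = [34, 45, 52, 61, 70]
synth_age = [33, 47, 50, 63, 72]
real_sex = ["F", "M", "F", "M", "F"]
synth_sex = ["F", "M", "M", "M", "F"]

print("KS statistic (age):", ks_statistic(real_age, synth_age))
print("Chi-squared statistic (sex):", chi_squared_statistic(real_sex, synth_sex))
```

For both statistics, values near zero indicate that the synthetic column closely tracks the real one. In practice a library routine such as SciPy's `ks_2samp` or `chi2_contingency` would also return a p-value.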

Authors (3)
  1. David Fuentes (21 papers)
  2. Diana McSpadden (7 papers)
  3. Sodiq Adewole (10 papers)
Citations (2)
