
Imputation of Missing Data with Class Imbalance using Conditional Generative Adversarial Networks (2012.00220v1)

Published 1 Dec 2020 in cs.LG

Abstract: Missing data is a common problem faced with real-world datasets. Imputation is a widely used technique to estimate the missing data. State-of-the-art imputation approaches, such as Generative Adversarial Imputation Nets (GAIN), model the distribution of observed data to approximate the missing values. Such an approach usually models a single distribution for the entire dataset, which overlooks the class-specific characteristics of the data. Class-specific characteristics are especially useful when there is a class imbalance. We propose a new method for imputing missing data based on its class-specific characteristics by adapting the popular Conditional Generative Adversarial Networks (CGAN). Our Conditional Generative Adversarial Imputation Network (CGAIN) imputes the missing data using class-specific distributions, which can produce the best estimates for the missing values. We tested our approach on benchmark datasets and achieved superior performance compared with the state-of-the-art and popular imputation approaches.
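The contrast the abstract draws, between one distribution for the whole dataset and class-specific distributions, can be illustrated with a much simpler estimator than a GAN. The sketch below is not the authors' CGAIN: it swaps in a per-class mean of the observed values as a stand-in for a learned class-conditional generator. The function name `class_conditional_impute`, the mask convention (1 = observed, 0 = missing), and the toy data are all assumptions made for illustration.

```python
import numpy as np


def class_conditional_impute(x, mask, labels):
    """Fill missing entries with the per-class mean of the observed values.

    Hypothetical helper, used here only as a stand-in for a learned
    class-conditional generator.

    x      : (n, d) float array; entries where mask == 0 are ignored (may be NaN)
    mask   : (n, d) array, 1 = observed, 0 = missing
    labels : (n,) integer class labels
    """
    x = np.asarray(x, dtype=float)
    mask = np.asarray(mask)
    labels = np.asarray(labels)

    # Zero out missing entries so NaN placeholders do not contaminate the sums.
    x_observed = np.where(mask == 1, x, 0.0)

    estimate = np.zeros_like(x_observed)
    for c in np.unique(labels):
        rows = labels == c
        observed_count = mask[rows].sum(axis=0)
        observed_sum = x_observed[rows].sum(axis=0)
        # If a feature has no observed value in this class, the estimate stays 0.
        class_mean = observed_sum / np.maximum(observed_count, 1)
        estimate[rows] = class_mean

    # Keep observed entries; use the class-specific estimate only where missing.
    return np.where(mask == 1, x, estimate)


# Toy data (assumed): two classes whose features live on very different scales.
x = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [10.0, 20.0],
              [np.nan, 40.0]])
mask = np.array([[1, 0],
                 [1, 1],
                 [1, 1],
                 [0, 1]])
labels = np.array([0, 0, 1, 1])

imputed = class_conditional_impute(x, mask, labels)
```

In this toy example the missing second feature of the first row is filled from class 0 alone, whereas a single dataset-wide mean would be pulled toward the much larger class 1 values. The final `np.where` step mirrors the usual imputation convention of keeping observed entries and substituting estimates only at missing positions.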

Authors (5)
  1. Saqib Ejaz Awan
  2. Mohammed Bennamoun
  3. Ferdous Sohel
  4. Girish Dwivedi
  5. Frank M Sanfilippo
Citations (54)
