A new PCA-based utility measure for synthetic data evaluation (2212.05595v1)

Published 26 Nov 2022 in cs.DB

Abstract: Data synthesis is a privacy-enhancing technology aiming to produce realistic and timely data when real data is hard to obtain. The utility of synthetic data generators (SDGs) has been investigated through different utility metrics. These metrics have been found to generate conflicting conclusions, making direct comparison of SDGs surprisingly difficult. Moreover, prior research found no correlation between popular metrics, concluding that they tackle different utility dimensions. This paper aggregates four popular utility metrics (representing different utility dimensions) into one using principal component analysis and checks whether the new measure can generate synthetic data that perform well in real life. The new measure is used to compare four well-recognized SDGs.
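The abstract does not say which four metrics are combined or exactly how the principal component analysis is applied, so the following is only a minimal sketch of one common way to do such an aggregation: standardize each metric, then use the projection onto the first principal component as the single composite score. The function name `pca_composite`, the example score matrix, and the choice of the first component are all assumptions made for illustration, not the paper's method.

```python
import numpy as np

def pca_composite(scores):
    """Collapse a (n_datasets, n_metrics) score matrix into one score per row.

    Hypothetical helper: assumes higher values mean better utility for every
    metric and that at least one metric varies across rows.
    """
    X = np.asarray(scores, dtype=float)
    # Standardize each metric so that no single scale dominates the PCA.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero after centering
    Z = (X - X.mean(axis=0)) / std
    # Rows of vt are the principal directions; vt[0] is the first component.
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    loadings = vt[0]
    # The sign of an SVD direction is arbitrary; orient it so that higher
    # metric values push the composite score up.
    if loadings.sum() < 0:
        loadings = -loadings
    explained = s[0] ** 2 / np.sum(s ** 2)  # variance share of component 1
    return Z @ loadings, loadings, explained

# Made-up scores: rows are synthetic datasets, columns are four metrics.
example = np.array([
    [0.61, 0.70, 0.55, 0.80],
    [0.72, 0.68, 0.60, 0.85],
    [0.40, 0.52, 0.45, 0.62],
    [0.90, 0.88, 0.79, 0.93],
])
composite, loadings, explained = pca_composite(example)
print(composite, loadings, explained)
```

If the metrics really are uncorrelated, as the abstract reports for prior work, the first component would explain only a modest share of the variance, so the `explained` value is worth inspecting before relying on a single composite score.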

Citations (2)
