Papers
Topics
Authors
Recent
Search
2000 character limit reached

Unfolding Data Quality Dimensions in Practice: A Survey

Published 23 Jul 2025 in cs.DB | (2507.17507v1)

Abstract: Data quality describes the degree to which data meet specific requirements and are fit for use by humans and/or downstream tasks (e.g., artificial intelligence). Data quality can be assessed across multiple high-level concepts called dimensions, such as accuracy, completeness, consistency, or timeliness. While extensive research and several attempts for standardization (e.g., ISO/IEC 25012) exist for data quality dimensions, their practical application often remains unclear. In parallel to research endeavors, a large number of tools have been developed that implement functionalities for the detection and mitigation of specific data quality issues, such as missing values or outliers. With this paper, we aim to bridge this gap between data quality theory and practice by systematically connecting low-level functionalities offered by data quality tools with high-level dimensions, revealing their many-to-many relationships. Through an examination of seven open-source data quality tools, we provide a comprehensive mapping between their functionalities and the data quality dimensions, demonstrating how individual functionalities and their variants partially contribute to the assessment of single dimensions. This systematic survey provides both practitioners and researchers with a unified view on the fragmented landscape of data quality checks, offering actionable insights for quality assessment across multiple dimensions.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.