Humans, Machine Learning, and Language Models in Union: A Cognitive Study on Table Unionability (2506.12990v1)
Abstract: Data discovery and table unionability in particular became key tasks in modern Data Science. However, the human perspective for these tasks is still under-explored. Thus, this research investigates the human behavior in determining table unionability within data discovery. We have designed an experimental survey and conducted a comprehensive analysis, in which we assess human decision-making for table unionability. We use the observations from the analysis to develop a machine learning framework to boost the (raw) performance of humans. Furthermore, we perform a preliminary study on how LLM performance is compared to humans indicating that it is typically better to consider a combination of both. We believe that this work lays the foundations for developing future Human-in-the-Loop systems for efficient data discovery.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.