Human-AI Collaboration in Data Science: Exploring Data Scientists' Perceptions of Automated AI (1909.02309v1)

Published 5 Sep 2019 in cs.HC, cs.AI, and cs.LG

Abstract: The rapid advancement of AI is changing our lives in many ways. One application domain is data science. New techniques in automating the creation of AI, known as AutoAI or AutoML, aim to automate the work practices of data scientists. AutoAI systems are capable of autonomously ingesting and pre-processing data, engineering new features, and creating and scoring models based on target objectives (e.g. accuracy or run-time efficiency). Though not yet widely adopted, we are interested in understanding how AutoAI will impact the practice of data science. We conducted interviews with 20 data scientists who work at a large, multinational technology company and practice data science in various business settings. Our goal is to understand their current work practices and how these practices might change with AutoAI. Reactions were mixed: while informants expressed concerns about the trend of automating their jobs, they also strongly felt it was inevitable. Despite these concerns, they remained optimistic about their future job security due to a view that the future of data science work will be a collaboration between humans and AI systems, in which both automation and human expertise are indispensable.

Authors (9)
  1. Dakuo Wang (87 papers)
  2. Justin D. Weisz (26 papers)
  3. Michael Muller (70 papers)
  4. Parikshit Ram (43 papers)
  5. Werner Geyer (20 papers)
  6. Casey Dugan (12 papers)
  7. Yla Tausczik (1 paper)
  8. Horst Samulowitz (29 papers)
  9. Alexander Gray (35 papers)
Citations (286)

Summary

  • The paper reports that interviewees expect AutoAI to accelerate data exploration and modeling tasks, improving workflow efficiency.
  • The paper uses semi-structured interviews with 20 data scientists in varied business settings to reveal mixed reactions to automation, including concern about skill dilution.
  • The paper implies that AutoAI acts as both a collaborator and educator, reshaping data science practices while preserving core human expertise.

An Analysis of Human-AI Collaboration in Data Science

The paper under review, "Human-AI Collaboration in Data Science: Exploring Data Scientists' Perceptions of Automated AI," was posted to arXiv in September 2019 and published in the Proceedings of the ACM on Human-Computer Interaction (CSCW). Authored by a team from IBM Research and the University of Maryland, it examines how automated AI (AutoAI) might affect data science professionals.

Overview and Key Findings

AutoAI, also known as AutoML, refers to systems that automate stages of the data science workflow: data ingestion, pre-processing, feature engineering, and the creation and scoring of models against a target objective such as accuracy or run-time efficiency. Although such systems are not yet widely adopted, the paper offers an early exploration of data scientists' perceptions of AutoAI, focusing on how it might integrate into or alter their current work practices.
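To make the workflow above concrete, the following is a minimal sketch of the kind of search an AutoAI system automates: try each combination of pre-processing step and model, score it against a target objective (here, held-out accuracy), and keep the best. It is not taken from the paper or from any AutoAI product. The toy data, the two pre-processors, the two models, and all function names are assumptions chosen for illustration, and only the Python standard library is used.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: ((feature_1, feature_2), label).
# feature_2 has a much larger scale than feature_1 and carries no signal.
ROWS = [
    ((1.0, 200.0), 0), ((1.2, 900.0), 0), ((0.9, 500.0), 0), ((1.1, 100.0), 0),
    ((3.0, 300.0), 1), ((3.2, 800.0), 1), ((2.9, 600.0), 1), ((3.1, 150.0), 1),
]
train, test = ROWS[0::2], ROWS[1::2]  # simple deterministic hold-out split


def identity(train_x):
    """Pre-processor that leaves features unchanged."""
    return lambda x: x


def minmax(train_x):
    """Pre-processor that rescales each feature using the training min and max."""
    lo = [min(col) for col in zip(*train_x)]
    hi = [max(col) for col in zip(*train_x)]
    return lambda x: tuple(
        (v - l) / ((h - l) or 1.0) for v, l, h in zip(x, lo, hi)
    )


def majority(train_x, train_y):
    """Baseline model: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbor(train_x, train_y):
    """1-nearest-neighbor model using squared Euclidean distance."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, t)) for t in train_x]
        return train_y[dists.index(min(dists))]
    return predict


PREPROCESSORS = {"identity": identity, "minmax": minmax}
MODELS = {"majority": majority, "1nn": nearest_neighbor}


def search(train, test):
    """Score every (pre-processor, model) pair by held-out accuracy."""
    train_x = [x for x, _ in train]
    train_y = [y for _, y in train]
    results = {}
    for (p_name, prep), (m_name, fit) in product(
        PREPROCESSORS.items(), MODELS.items()
    ):
        transform = prep(train_x)  # fit the pre-processor on training data only
        model = fit([transform(x) for x in train_x], train_y)
        correct = sum(model(transform(x)) == y for x, y in test)
        results[(p_name, m_name)] = correct / len(test)
    best = max(results, key=results.get)
    return best, results


best, results = search(train, test)
print(best, results[best])
```

On this toy data the search selects min-max scaling with the nearest-neighbor model, because the unscaled second feature otherwise dominates the distance computation. Real AutoAI systems search far larger spaces of features, models, and hyperparameters. The choice of objective and the interpretation of the winning pipeline still rest with the data scientist, which is the division of labor the interviewees describe.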

Through semi-structured interviews with 20 data scientists at a large, multinational technology company, the paper reveals mixed sentiments. Interviewees acknowledge the convenience and inevitability of increased automation, yet remain concerned that it could dilute the foundational skills needed for rigorous data science.

One central finding is that interviewees see potential for AutoAI to accelerate initial data exploration and modeling, giving data scientists a productive starting point. Automating routine, labor-intensive tasks could improve efficiency in meeting the high demand for data science insights.

Methodological Reflection

The methodology involved eliciting reflections from data scientists who held different roles across various domains. The authors emphasize the diversity of their informants' professional settings, which span healthcare, telecommunications, insurance, and beyond. This diversity supports a broader understanding of how AutoAI might intersect with distinct industrial data challenges.

Implications for Data Science Practices

The synthesis of the results indicates several potential directions for the future of data science. The main implication is that AutoAI may serve to augment rather than entirely replace the human data scientist, thus reshaping rather than eliminating their role. A collaboration model is envisaged wherein data scientists engage with AutoAI to streamline labor-intensive tasks while preserving the interpretative and judgmental components of data science.

There are potential roles for AutoAI beyond mere tool automation: acting as a collaborator or tutor. As a collaborator, it could enhance productivity by automating mundane aspects of the data science workflow, thereby allowing data scientists to focus on more complex analytical tasks. As a tutor, AutoAI might expose novices to best practices in model selection and data handling, thus serving an educational function for aspiring data scientists.

Future Directions

The research points to several future directions. Explainability is central to establishing trust in AutoAI systems, and further studies could develop frameworks for how these systems explain their outputs to human users. Moreover, as more stakeholders, such as executives and people in non-data-scientist roles, engage with these systems, understanding how they perceive and use AutoAI will further inform its design and implementation.

The augmentation versus automation debate remains critical in shaping the future roles of data scientists and the capabilities desired in AutoAI systems. These systems may need to balance automating routine tasks with augmenting data scientists' cognitive capacities for more strategic work.

Conclusion

This paper contributes to understanding the current and potential impact of AutoAI on the data science landscape. It sheds light on the nuanced views within the data science community regarding automation and the evolving role of data professionals. As AutoAI technologies mature, their integration into daily data science practice will likely reshape team dynamics and collaboration in analytics-driven enterprises.