How Much Automation Does a Data Scientist Want? (2101.03970v1)

Published 7 Jan 2021 in cs.LG and cs.HC

Abstract: Data science and machine learning (DS/ML) are at the heart of recent advances in many AI applications. An active research thread in AI, AutoML, aims to develop systems that automate the end-to-end DS/ML lifecycle. However, do DS and ML workers really want to automate their DS/ML workflow? To answer this question, we first synthesize a human-centered AutoML framework with 6 User Roles/Personas, 10 Stages and 43 Sub-Tasks, 5 Levels of Automation, and 5 Types of Explanation, by reviewing research literature and marketing reports. Second, we use the framework to guide the design of an online survey study with 217 DS/ML workers who had varying degrees of experience and whose user roles matched our 6 roles/personas. We found that different user personas participated in distinct stages of the lifecycle, but not in all stages. Their desired levels of automation and types of explanation for AutoML also varied significantly depending on the DS/ML stage and the user persona. Based on the survey results, we argue that user needs provide no rationale for complete automation of the end-to-end DS/ML lifecycle. We propose next steps for user-controlled DS/ML automation.

Authors (8)
  1. Dakuo Wang (87 papers)
  2. Q. Vera Liao (49 papers)
  3. Yunfeng Zhang (45 papers)
  4. Udayan Khurana (10 papers)
  5. Horst Samulowitz (29 papers)
  6. Soya Park (8 papers)
  7. Michael Muller (70 papers)
  8. Lisa Amini (7 papers)
Citations (48)
