How Much Automation Does a Data Scientist Want? (2101.03970v1)
Abstract: Data science and machine learning (DS/ML) are at the heart of recent advances in many AI applications. An active research thread in AI, AutoAI, aims to develop systems that automate the end-to-end DS/ML lifecycle. However, do DS and ML workers really want to automate their DS/ML workflow? To answer this question, we first synthesize a human-centered AutoML framework with 6 User Roles/Personas, 10 Stages and 43 Sub-Tasks, 5 Levels of Automation, and 5 Types of Explanation by reviewing research literature and marketing reports. Second, we use the framework to guide the design of an online survey study with 217 DS/ML workers who had varying degrees of experience and different user roles matching our 6 roles/personas. We found that different user personas participated in distinct stages of the lifecycle, but not in all stages. Their desired levels of automation and types of explanation for AutoML also varied significantly depending on the DS/ML stage and the user persona. Based on the survey results, we argue that there is no rationale from user needs for complete automation of the end-to-end DS/ML lifecycle. We propose next steps for user-controlled DS/ML automation.
- Dakuo Wang
- Q. Vera Liao
- Yunfeng Zhang
- Udayan Khurana
- Horst Samulowitz
- Soya Park
- Michael Muller
- Lisa Amini