AutoTask: Executing Arbitrary Voice Commands by Exploring and Learning from Mobile GUI (2312.16062v1)
Abstract: Voice command interfaces (VCIs) have gained increasing importance, enabling hands-free and eyes-free interaction with digital devices. However, the inherent complexity in constructing effective voice interfaces has limited the VCIs' functionalities to only a small fraction of GUI applications and tasks. This paper presents AutoTask, a VCI capable of automating any task in any mobile application without configuration or modification from developers or end users. The primary challenge for AutoTask is the lack of knowledge, as it needs to accomplish unknown tasks (e.g., user commands) within an unknown environment (e.g., GUI). To address this challenge, AutoTask employs two strategies: (1) trial and error: AutoTask explores the GUI, attempts potential operation sequences, and recovers from errors through backtracking; (2) learning from the environment: AutoTask accumulates experiences during exploration and summarizes correct knowledge from these experiences. We implemented AutoTask on Android devices and conducted an evaluation study, which proved the feasibility of AutoTask.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.