DroidBot-GPT: GPT-powered UI Automation for Android (2304.07061v5)

Published 14 Apr 2023 in cs.SE and cs.AI

Abstract: This paper introduces DroidBot-GPT, a tool that uses GPT-like large language models (LLMs) to automate interactions with Android mobile applications. Given a natural language description of a desired task, DroidBot-GPT automatically generates and executes actions that navigate the app to complete the task. It works by translating the app's GUI state and the actions available on the smartphone screen into natural language prompts and asking the LLM to choose an action. Because LLMs are typically trained on large amounts of data, including how-to manuals for diverse software applications, they can make reasonable action choices from the provided information. We evaluate DroidBot-GPT on a self-created dataset of 33 tasks collected from 17 Android applications spanning 10 categories. It successfully completes 39.39% of the tasks, and the average partial completion progress is about 66.76%. Given that our method is fully unsupervised (it requires no modification to either the app or the LLM), we believe there is great potential to enhance automation performance with better app development paradigms and/or custom model training.
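
The prompt-and-choose step described in the abstract can be illustrated with a rough sketch. This is not the DroidBot-GPT implementation: the `UiElement` type, the `build_prompt` and `parse_choice` helpers, and the prompt wording are all hypothetical, chosen only to show how a screen's available actions might be numbered in natural language and how a model's reply might be mapped back to one of them. The LLM call itself is left out.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UiElement:
    """Hypothetical, simplified view of one widget on the current screen."""
    kind: str                 # e.g. "button", "text field"
    text: str                 # visible label or content description
    actions: Tuple[str, ...]  # actions the widget supports, e.g. ("tap",)


def build_prompt(task: str, elements: List[UiElement]):
    """Describe the screen as a numbered list of candidate actions.

    Returns the prompt text and the list of (element, action) pairs,
    indexed by the numbers shown in the prompt.
    """
    choices = []
    lines = [f"Task: {task}", "The current screen offers these actions:"]
    for element in elements:
        for action in element.actions:
            lines.append(f'{len(choices)}. {action} the {element.kind} "{element.text}"')
            choices.append((element, action))
    lines.append("Reply with the number of the single best next action.")
    return "\n".join(lines), choices


def parse_choice(reply: str, n_choices: int) -> Optional[int]:
    """Extract the first integer in the reply; None if absent or out of range."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    index = int(match.group())
    return index if 0 <= index < n_choices else None


if __name__ == "__main__":
    screen = [
        UiElement("button", "New contact", ("tap",)),
        UiElement("text field", "Search", ("tap", "type into")),
    ]
    prompt, choices = build_prompt("Add a contact named Alice", screen)
    print(prompt)
    # A real system would send `prompt` to an LLM; here the reply is hard-coded.
    picked = parse_choice("I would pick action 0.", len(choices))
    print(picked)
```

Numbering the candidates keeps the model's answer easy to validate: a reply with no number, or a number outside the list, is rejected rather than executed.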

Authors (4)
  1. Hao Wen (52 papers)
  2. Hongming Wang (4 papers)
  3. Jiaxuan Liu (11 papers)
  4. Yuanchun Li (37 papers)
Citations (25)