Seeing is Believing: Vision-driven Non-crash Functional Bug Detection for Mobile Apps (2407.03037v2)

Published 3 Jul 2024 in cs.SE

Abstract: Mobile app GUI (Graphical User Interface) pages now contain rich visual information, with the visual semantics of each page helping users understand the application logic. However, this complex visual and functional logic presents new challenges to software testing. Existing automated GUI testing methods, constrained by the lack of reliable testing oracles, are limited to detecting crash bugs with obvious abnormal signals. Consequently, many non-crash functional bugs, ranging from unexpected behaviors to logical errors, often evade detection by current techniques. While these non-crash functional bugs can exhibit visual cues that serve as potential testing oracles, they often span a sequence of screenshots, and detecting them requires an understanding of the operational logic among GUI page transitions, which is challenging for traditional techniques. Considering the remarkable performance of Multimodal LLMs (MLLMs) in visual and language understanding, this paper proposes Trident, a novel vision-driven, multi-agent collaborative automated GUI testing approach for detecting non-crash functional bugs. It comprises three agents, Explorer, Monitor, and Detector, which guide the exploration, oversee the testing progress, and spot issues, respectively. We also address several challenges, namely aligning visual and textual information for MLLM input, achieving functionality-oriented exploration, and inferring test oracles for non-crash bugs, to enhance the performance of functional bug detection. We evaluate Trident on 590 non-crash bugs and compare it with 12 baselines; it achieves boosts of 14%-112% and 108%-147% in average recall and precision, respectively, compared with the best baseline. The ablation study further demonstrates the contribution of each module. Moreover, Trident identifies 43 new bugs on Google Play, of which 31 have been fixed.
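The division of roles among the three agents can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify how the agents interact, so the call order, the `Step` and `Session` types, `run_session`, and all `toy_*` functions are hypothetical names introduced here. In Trident the agents are MLLM-driven and work on screenshots; in this sketch plain strings stand in for screens and simple rules stand in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One GUI transition; strings stand in for screenshots (assumption)."""
    before: str
    action: str
    after: str


@dataclass
class Session:
    history: List[Step] = field(default_factory=list)
    reports: List[str] = field(default_factory=list)


def run_session(
    explorer: Callable[[str, List[Step]], Optional[str]],
    monitor: Callable[[List[Step]], bool],
    detector: Callable[[List[Step]], Optional[str]],
    apply_action: Callable[[str, str], str],
    start_screen: str,
    max_steps: int = 20,
) -> Session:
    """Hypothetical loop: explorer picks an action, detector inspects the
    transition history, monitor decides whether to stop."""
    session = Session()
    screen = start_screen
    for _ in range(max_steps):
        action = explorer(screen, session.history)
        if action is None:  # explorer has nothing left to try
            break
        new_screen = apply_action(screen, action)
        session.history.append(Step(screen, action, new_screen))
        finding = detector(session.history)
        if finding is not None:
            session.reports.append(finding)
        if monitor(session.history):
            break
        screen = new_screen
    return session


# Toy app (hypothetical): tapping "add_item" should raise the cart count,
# but the screen never changes. It does not crash, so only an oracle that
# reasons about the transition can notice it.
TRANSITIONS = {
    ("cart:0", "open_settings"): "settings",
    ("settings", "back"): "cart:0",
    ("cart:0", "add_item"): "cart:0",  # bug: expected "cart:1"
}
PLAN = ["open_settings", "back", "add_item"]


def toy_apply(screen: str, action: str) -> str:
    return TRANSITIONS.get((screen, action), screen)


def toy_explorer(screen: str, history: List[Step]) -> Optional[str]:
    # Follows a fixed plan; a real explorer would choose from the screenshot.
    return PLAN[len(history)] if len(history) < len(PLAN) else None


def toy_monitor(history: List[Step]) -> bool:
    # Stops once the last three actions all left the screen unchanged.
    recent = history[-3:]
    return len(recent) == 3 and all(s.before == s.after for s in recent)


def toy_detector(history: List[Step]) -> Optional[str]:
    # Rule-based stand-in for an inferred test oracle.
    last = history[-1]
    if last.action == "add_item" and last.before == last.after:
        return f"'add_item' did not change the screen ({last.after})"
    return None


session = run_session(toy_explorer, toy_monitor, toy_detector, toy_apply, "cart:0")
print(session.reports)
```

Passing the agents in as plain callables keeps the loop independent of how each one is realized, so a model-backed explorer or detector could replace the toy rules without changing `run_session`.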

Authors (9)

Zhe Liu
Cheng Li
Chunyang Chen
Junjie Wang
Boyu Wu
Yawen Wang
Jun Hu
Qing Wang
Mengzhuo Chen
