AURORA: Navigating UI Tarpits via Automated Neural Screen Understanding (2404.01240v1)

Published 1 Apr 2024 in cs.SE, cs.CL, cs.CV, and cs.HC

Abstract: Nearly a decade of research in software engineering has focused on automating mobile app testing to help engineers in overcoming the unique challenges associated with the software platform. Much of this work has come in the form of Automated Input Generation tools (AIG tools) that dynamically explore app screens. However, such tools have repeatedly been demonstrated to achieve lower-than-expected code coverage - particularly on sophisticated proprietary apps. Prior work has illustrated that a primary cause of these coverage deficiencies is related to so-called tarpits, or complex screens that are difficult to navigate. In this paper, we take a critical step toward enabling AIG tools to effectively navigate tarpits during app exploration through a new form of automated semantic screen understanding. We introduce AURORA, a technique that learns from the visual and textual patterns that exist in mobile app UIs to automatically detect common screen designs and navigate them accordingly. The key idea of AURORA is that there are a finite number of mobile app screen designs, albeit with subtle variations, such that the general patterns of different categories of UI designs can be learned. As such, AURORA employs a multi-modal, neural screen classifier that is able to recognize the most common types of UI screen designs. After recognizing a given screen, it then applies a set of flexible and generalizable heuristics to properly navigate the screen. We evaluated AURORA both on a set of 12 apps with known tarpits from prior work, and on a new set of five of the most popular apps from the Google Play store. Our results indicate that AURORA is able to effectively navigate tarpit screens, outperforming prior approaches that avoid tarpits by 19.6% in terms of method coverage. The improvements can be attributed to AURORA's UI design classification and heuristic navigation techniques.
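The abstract's core mechanism, a multi-modal classifier that fuses visual and textual screen features to predict a UI design category and then dispatches a category-specific navigation heuristic, can be illustrated with a minimal sketch. This is not the authors' implementation: the category names, embedding dimensions, fusion architecture, and heuristics below are illustrative assumptions only.

```python
# Minimal sketch (assumed, not the paper's code) of the two-stage idea described
# in the abstract: (1) fuse visual and textual screen embeddings to classify the
# screen design, (2) look up a per-category navigation heuristic.
import torch
import torch.nn as nn

# Hypothetical screen-design categories; the paper's actual taxonomy may differ.
SCREEN_CATEGORIES = ["login", "advertisement", "permission_dialog", "search", "other"]

class MultiModalScreenClassifier(nn.Module):
    def __init__(self, visual_dim=512, text_dim=384, hidden_dim=256,
                 num_classes=len(SCREEN_CATEGORIES)):
        super().__init__()
        # Concatenate a screenshot embedding (e.g., from a CNN) with a
        # screen-text embedding (e.g., from a sentence encoder) and classify.
        self.fusion = nn.Sequential(
            nn.Linear(visual_dim + text_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, visual_emb, text_emb):
        return self.fusion(torch.cat([visual_emb, text_emb], dim=-1))

# Hypothetical per-category heuristics: each returns a high-level action an
# automated input generation tool could execute on the recognized screen.
NAVIGATION_HEURISTICS = {
    "login": lambda: "fill credential fields, then tap the sign-in button",
    "advertisement": lambda: "tap the close/skip control or press back",
    "permission_dialog": lambda: "tap the allow button",
    "search": lambda: "enter a query and submit",
    "other": lambda: "fall back to the default exploration strategy",
}

if __name__ == "__main__":
    model = MultiModalScreenClassifier()
    visual_emb = torch.randn(1, 512)   # stand-in screenshot embedding
    text_emb = torch.randn(1, 384)     # stand-in screen-text embedding
    category = SCREEN_CATEGORIES[model(visual_emb, text_emb).argmax(dim=-1).item()]
    print(category, "->", NAVIGATION_HEURISTICS[category]())
```

In this sketch the classifier is untrained and the heuristics are plain strings; in a real pipeline the embeddings would come from the screenshot and extracted UI text, and each heuristic would issue concrete UI actions.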

Authors (8)
  1. Safwat Ali Khan (6 papers)
  2. Wenyu Wang (75 papers)
  3. Yiran Ren (1 paper)
  4. Bin Zhu (218 papers)
  5. Jiangfan Shi (2 papers)
  6. Alyssa McGowan (1 paper)
  7. Wing Lam (4 papers)
  8. Kevin Moran (66 papers)