LLM-CompDroid: Repairing Configuration Compatibility Bugs in Android Apps with Pre-trained Large Language Models (2402.15078v1)
Abstract: XML configurations are integral to the Android development framework, particularly for UI display. However, these configurations can introduce compatibility issues (bugs) that cause divergent visual outcomes and system crashes across Android API versions (levels). In this study, we systematically investigate LLM-based approaches for detecting and repairing configuration compatibility bugs. Our findings highlight limitations of LLMs in identifying and resolving these bugs, while also revealing their potential for addressing complex, hard-to-repair issues that traditional tools struggle with. Leveraging these insights, we introduce the LLM-CompDroid framework, which combines the strengths of LLMs and traditional tools for bug resolution. Our experimental results demonstrate a significant improvement in bug resolution performance: LLM-CompDroid-GPT-3.5 and LLM-CompDroid-GPT-4 surpass the state-of-the-art tool, ConfFix, by at least 9.8% and 10.4%, respectively, on both the Correct and Correct@k metrics. This approach holds promise for improving the reliability and robustness of Android applications.
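To make the notion of a configuration compatibility bug concrete, the following is a minimal sketch of a lint-style check over a layout file. It is not the detection or repair method of LLM-CompDroid or ConfFix. The function name `find_incompatible_attrs`, the table `ATTR_INTRODUCED_IN`, and the sample layout are assumptions made for illustration; the table lists only two attributes (`android:paddingStart`, added in API level 17, and `android:elevation`, added in API level 21) and is far from complete.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Illustrative subset only (assumption): the API level at which each
# android: attribute was introduced. A real tool would need a full table.
ATTR_INTRODUCED_IN = {
    "paddingStart": 17,
    "elevation": 21,
}


def find_incompatible_attrs(layout_xml, min_sdk):
    """Return (tag, attribute, introduced_in) for every android: attribute
    that was introduced at an API level above the app's minimum SDK."""
    prefix = "{" + ANDROID_NS + "}"
    findings = []
    for elem in ET.fromstring(layout_xml).iter():
        for qualified_name in elem.attrib:
            if not qualified_name.startswith(prefix):
                continue  # not an android: attribute
            attr = qualified_name[len(prefix):]
            introduced = ATTR_INTRODUCED_IN.get(attr)
            if introduced is not None and introduced > min_sdk:
                findings.append((elem.tag, attr, introduced))
    return findings


# Hypothetical layout used only to exercise the check.
LAYOUT = """<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent">
    <TextView
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:paddingStart="8dp"
        android:elevation="4dp" />
</LinearLayout>"""

if __name__ == "__main__":
    for tag, attr, level in find_incompatible_attrs(LAYOUT, min_sdk=16):
        print(f"<{tag}> uses android:{attr}, introduced in API {level}")
```

A check like this only flags candidate attributes. Deciding whether a flagged attribute actually changes rendering or crashes on an older API level, and how to repair it, is the harder problem the abstract describes.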
Authors:
- Zhijie Liu
- Yutian Tang
- Meiyun Li
- Xin Jin
- Yunfei Long
- Liang Feng Zhang
- Xiapu Luo