AndroLibZoo: A Reliable Dataset of Libraries Based on Software Dependency Analysis (2307.12609v3)
Abstract: Android app developers extensively employ code reuse, integrating many third-party libraries into their apps. While such integration is practical for developers, it can be challenging for static analyzers to achieve scalability and precision when libraries account for a large part of the code. As a direct consequence, it is common practice in the literature to consider only developer code during static analysis, on the assumption that the sought issues are in developer code rather than in the libraries. To do so, however, analysts need to distinguish between library and developer code. Currently, many static analyses rely on white lists of libraries, but these white lists are unreliable, inaccurate, and largely non-comprehensive. In this paper, we propose a new approach to address the lack of comprehensive and automated solutions for the production of accurate and "always up to date" sets of libraries. First, we demonstrate the continued need for a white list of libraries. Second, we propose an automated approach to produce an accurate and up-to-date set of third-party libraries in the form of a dataset called AndroLibZoo. Our dataset, which we make available to the community, contains 34,813 libraries to date and is meant to evolve.