The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks (2402.17679v1)
Abstract: The application of LLMs in software engineering, particularly in static analysis tasks, represents a paradigm shift in the field. In this paper, we investigate the role that current LLMs can play in improving callgraph analysis and type inference for Python programs. Using the PyCG, HeaderGen, and TypeEvalPy micro-benchmarks, we evaluate 26 LLMs, including OpenAI's GPT series and open-source models such as LLaMA. Our study reveals that LLMs show promising results in type inference, demonstrating higher accuracy than traditional methods, yet they exhibit limitations in callgraph analysis. This contrast emphasizes the need for specialized fine-tuning of LLMs to better suit specific static analysis tasks. Our findings provide a foundation for further research towards integrating LLMs for static analysis tasks.
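To make the two tasks concrete, the following is a minimal sketch of what a micro-benchmark case and its scoring could look like. The snippet, the ground-truth layout, the `exact_matches` helper, and the model answer are all hypothetical illustrations; they are not the actual schema or scoring code of PyCG, HeaderGen, or TypeEvalPy.

```python
# Hypothetical micro-benchmark case: a tiny program plus hand-written ground truth.
SNIPPET = '''
def make_id(n):
    return str(n)

def main():
    label = make_id(7)
    return label
'''

# Ground truth in an assumed format (not the benchmarks' real schema).
# Callgraph: caller -> set of callees. Types: (function, slot) -> type name.
EXPECTED_CALLS = {"main": {"make_id"}, "make_id": {"str"}}
EXPECTED_TYPES = {("make_id", "return"): "str", ("main", "label"): "str"}


def exact_matches(expected, predicted):
    """Count ground-truth entries that the prediction reproduces exactly."""
    return sum(1 for key, value in expected.items() if predicted.get(key) == value)


# A made-up model answer, used only to exercise the scoring function.
predicted_calls = {"main": {"make_id"}, "make_id": set()}
predicted_types = {("make_id", "return"): "str", ("main", "label"): "str"}

call_score = exact_matches(EXPECTED_CALLS, predicted_calls)
type_score = exact_matches(EXPECTED_TYPES, predicted_types)
print(f"callgraph: {call_score}/{len(EXPECTED_CALLS)}, types: {type_score}/{len(EXPECTED_TYPES)}")
```

In this sketch a callgraph entry counts only when the full callee set for a caller matches, so missing a single edge loses the whole entry. That is one possible strictness choice, not necessarily the one the paper uses.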