CodePori: Large-Scale System for Autonomous Software Development Using Multi-Agent Technology (2402.01411v2)

Published 2 Feb 2024 in cs.SE

Abstract:

Context: LLMs and Generative Pre-trained Transformers (GPTs) have transformed the field of Software Engineering (SE). Existing LLM-based multi-agent models have successfully addressed basic dialogue tasks, but the potential of LLMs for more challenging tasks, such as automated code generation for large and complex projects, has been investigated in only a few existing works.

Objective: This paper investigates the potential of LLM-based agents in the software industry, particularly for enhancing productivity and reducing time-to-market for complex software solutions. Our primary objective is to gain insight into how these agents can fundamentally transform the development of large-scale software.

Methods: We introduce CodePori, a novel system designed to automate code generation for large and complex software projects based on functional and non-functional requirements defined by stakeholders. To assess the proposed system's performance, we evaluated it on the HumanEval benchmark and also tested it manually, providing 20 different project descriptions as input and then checking the accuracy of the generated code by executing it.

Results: CodePori is able to generate running code for large-scale projects, aligned with the typical software development process. The HumanEval benchmark results indicate that CodePori improves code accuracy by 89%. A manual assessment conducted by the first author shows that the CodePori system achieved an accuracy rate of 85%.

Conclusion: Based on the results, we conclude that the proposed system demonstrates the transformative potential of LLM-based agents in SE, highlighting their practical applications and opening new opportunities for broader adoption in both industry and academia. Our project is publicly available at https://github.com/GPT-Laboratory/CodePori.
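The paper's core idea, multiple LLM agents collaborating to draft and refine code from a requirements description, can be sketched in a few lines of Python. Everything below is a hypothetical illustration for intuition: the two-role draft-and-review loop, the prompts, and the `llm` callable are assumptions, not CodePori's actual architecture.

```python
# Minimal sketch of a multi-agent code-generation loop (illustrative only;
# the roles, prompts, and fixed round count are assumptions, not CodePori's
# actual design).
from typing import Callable


def run_pipeline(project_description: str,
                 llm: Callable[[str], str],
                 rounds: int = 3) -> str:
    """Draft-and-review loop: a 'developer' agent writes code from the
    requirements, then a 'reviewer' agent critiques it and the developer
    revises, for a fixed number of rounds."""
    # Developer agent drafts an initial implementation.
    code = llm("You are a software developer. Write code implementing:\n"
               + project_description)
    for _ in range(rounds):
        # Reviewer agent critiques the current draft.
        review = llm("You are a code reviewer. List bugs and unmet "
                     "requirements in the following code:\n" + code)
        # Developer agent revises the draft based on the critique.
        code = llm("Revise the code below to address this review.\n"
                   "Review:\n" + review + "\n\nCode:\n" + code)
    return code
```

In CodePori's setting the input would be the stakeholders' functional and non-functional requirements, and the full pipeline would involve more specialized roles than the two shown here; the sketch captures only the agent-to-agent feedback loop.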

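On the evaluation side, HumanEval pairs each function-completion prompt with hidden unit tests, and a sample counts as correct only if the completed program passes them. A minimal harness in that style might look as follows; this is a simplified sketch, and a real evaluation (such as OpenAI's reference harness) sandboxes the untrusted code and enforces timeouts.

```python
# Simplified sketch of a HumanEval-style functional-correctness check.
# Real harnesses run candidates in an isolated process with a timeout;
# exec() on untrusted model output is shown here only for illustration.


def passes_task(prompt: str, completion: str,
                test_code: str, entry_point: str) -> bool:
    """Return True if the model's completion passes the task's unit tests.

    `prompt` holds the function signature and docstring, `completion` the
    model-generated body, `test_code` a check(candidate) test function, and
    `entry_point` the name of the function under test.
    """
    program = (prompt + completion + "\n"
               + test_code + "\n"
               + f"check({entry_point})\n")
    try:
        exec(program, {})  # untrusted code: sandbox this in practice
        return True
    except Exception:
        return False
```

Benchmark accuracy is then simply the fraction of tasks for which this check returns True (pass@1 when one sample is drawn per task).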
Authors (7)
  1. Zeeshan Rasheed (23 papers)
  2. Muhammad Waseem (66 papers)
  3. Mika Saari (9 papers)
  4. Kari Systä (11 papers)
  5. Pekka Abrahamsson (105 papers)
  6. Malik Abdul Sami (8 papers)
  7. Kai-Kristian Kemell (36 papers)