Using LLMs in Software Requirements Specifications: An Empirical Evaluation (2404.17842v1)
Abstract: The creation of a Software Requirements Specification (SRS) document is important for any software development project. Given the recent prowess of large language models (LLMs) in answering natural language queries and generating sophisticated textual outputs, our study explores their capability to produce accurate, coherent, and structured drafts of these documents to accelerate the software development lifecycle. We assess the performance of GPT-4 and CodeLlama in drafting an SRS for a university club management system and compare it against human benchmarks using eight distinct criteria. Our results suggest that LLMs can match the output quality of an entry-level software engineer in generating an SRS, delivering complete and consistent drafts. We also evaluate the capability of LLMs to identify and rectify problems in a given requirements document. Our experiments indicate that GPT-4 is capable of identifying issues and giving constructive feedback for rectifying them, while CodeLlama's validation results were not as encouraging. We repeated the generation exercise for four distinct use cases to study the time saved by employing LLMs for SRS generation. The experiment demonstrates that LLMs may facilitate a significant reduction in development time for entry-level software engineers. We conclude that LLMs can be gainfully used by software engineers to increase productivity by saving time and effort in generating, validating, and rectifying software requirements.
Authors: Madhava Krishna, Bhagesh Gaur, Arsh Verma, Pankaj Jalote