Redefining Developer Assistance: Through Large Language Models in Software Ecosystem (2312.05626v3)
Abstract: In this paper, we examine the advancement of domain-specific LLMs, focusing on their application in software development. We introduce DevAssistLlama, a model developed through instruction tuning to assist developers in processing software-related natural language queries. This instruction-tuned LLM variant is particularly adept at handling intricate technical documentation, enhancing developers' capability in software-specific tasks. Creating DevAssistLlama involved constructing an extensive instruction dataset from various software systems, enabling effective handling of Named Entity Recognition (NER), Relation Extraction (RE), and Link Prediction (LP). Our results demonstrate DevAssistLlama's superior capabilities on these tasks compared with other models, including ChatGPT. This research not only highlights the potential of specialized LLMs in software development but also presents a pioneering LLM for this domain.
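The abstract does not specify the instruction format used to build the dataset. As a rough illustration only, the sketch below assumes an Alpaca-style instruction/input/output record for the NER task; the field names, prompt wording, and entity labels are illustrative assumptions, not the paper's actual schema.

```python
# Minimal sketch (hypothetical format): converting an annotated
# software-domain sentence into an Alpaca-style instruction record
# for NER fine-tuning. All field names and labels are assumptions.
import json

def make_ner_record(text: str, entities: list[tuple[str, str]]) -> dict:
    """Build one instruction-tuning example from a sentence and its
    (span, entity_type) annotations."""
    return {
        "instruction": (
            "Identify the software-related named entities in the input "
            "and label each with its entity type."
        ),
        "input": text,
        # Serialize the gold annotations as the expected model output.
        "output": "; ".join(f"{span} -> {etype}" for span, etype in entities),
    }

record = make_ner_record(
    "Pandas raises a KeyError when the column is missing.",
    [("Pandas", "Library"), ("KeyError", "Error_Type")],
)
print(json.dumps(record, indent=2))
```

Analogous records could be written for the RE and LP tasks by changing the instruction text and serializing relation pairs or candidate links as the output; again, this is a sketch of the general recipe, not the authors' exact pipeline.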