
Test code generation at Ericsson using Program Analysis Augmented Fine Tuned LLMs (2506.11006v1)

Published 23 Apr 2025 in cs.SE

Abstract: We describe test code generation using LLMs at Ericsson. Our input is a test step in natural language (English) and our output is code (Java) that accomplishes the test step. We describe how straightforward prompting does not suffice and results in the LLM assuming functions and signatures that are not present in the code repository. We then show how we alleviate the problem by combining Retrieval Augmented Generation (RAG) with prompt engineering that expands the simple prompt with additional contextual information obtained through static program analysis. We then describe further improvements obtained by fine-tuning the underlying LLM. The fine-tuning is based on a custom-designed prompt template that includes the pre-dependent classes, their public methods, and two exemplar outputs obtained from RAG. Our results establish that our fine-tuned models improve conformity with the original developer-written test code, as measured by the traditional F1-score computed over the methods used in the generated code. Fine-tuning an 8x7b Mixture of Experts (MoE) model leads to an average improvement of 8% over the base model and is comparable to the scores of a much larger 8x22b MoE model.
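The abstract describes a prompt template that combines the natural-language test step with static-analysis context (dependent classes and their public methods) and two RAG-retrieved exemplars. The sketch below is a minimal, hypothetical illustration of assembling such a prompt; all names, fields, and wording are assumptions, not taken from the paper.

```python
# Hypothetical sketch of assembling an augmented prompt for a test step.
# The fields (dependent classes, their public methods, two RAG exemplars)
# mirror the abstract's description; every identifier here is illustrative.

from dataclasses import dataclass


@dataclass
class ClassContext:
    name: str
    public_methods: list[str]  # method signatures from static program analysis


def build_prompt(test_step: str,
                 dependent_classes: list[ClassContext],
                 exemplars: list[tuple[str, str]]) -> str:
    """Combine the test step with program-analysis context and RAG exemplars."""
    parts = [f"Test step: {test_step}", "", "Available classes and public methods:"]
    for cls in dependent_classes:
        parts.append(f"- {cls.name}")
        parts.extend(f"    {sig}" for sig in cls.public_methods)
    parts.append("")
    parts.append("Examples of similar test steps and their Java implementations:")
    for step, code in exemplars:
        parts.append(f"Step: {step}")
        parts.append(f"Code:\n{code}")
        parts.append("")
    parts.append("Write Java code that accomplishes the test step, "
                 "using only the classes and methods listed above.")
    return "\n".join(parts)
```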

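The evaluation metric is an F1-score over the methods used in the generated code relative to the developer-written test code. A minimal sketch of one plausible reading of that metric, assuming precision and recall are computed over the sets of invoked method names:

```python
# Minimal sketch of a method-level F1 score, assuming (as the abstract
# suggests) precision and recall are taken over the sets of methods
# invoked in the generated vs. the developer-written test code.

def method_f1(generated_methods: set[str], reference_methods: set[str]) -> float:
    """F1 over methods called in generated code vs. the reference code."""
    if not generated_methods or not reference_methods:
        return 0.0
    overlap = generated_methods & reference_methods
    precision = len(overlap) / len(generated_methods)
    recall = len(overlap) / len(reference_methods)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Example: two of three generated method calls match the reference.
print(method_f1({"connect", "sendRequest", "assertStatus"},
                {"connect", "sendRequest", "assertBody"}))  # ~0.667
```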