
Large-scale, Independent and Comprehensive study of the power of LLMs for test case generation (2407.00225v2)

Published 28 Jun 2024 in cs.SE

Abstract: Unit testing, crucial for ensuring the reliability of code modules, such as classes and methods, is often overlooked by developers due to time constraints. Automated test generation techniques have emerged to address this, but they frequently lack readability and require significant developer intervention. LLMs, such as GPT and Mistral, have shown promise in software engineering tasks, including test generation, but their overall effectiveness remains unclear. This study presents an extensive investigation of LLMs, evaluating the effectiveness of four models and five prompt engineering techniques for unit test generation. We analyze 216,300 tests generated by the selected advanced instruct-tuned LLMs for 690 Java classes collected from diverse datasets. Our evaluation considers correctness, understandability, coverage, and test smell detection in the generated tests, comparing them to a widely used automated testing tool, EvoSuite. While LLMs demonstrate potential, improvements in test quality, particularly in reducing common test smells, are necessary. This study highlights the strengths and limitations of LLM-generated tests compared to traditional methods, paving the way for further research on LLMs in test automation.

Authors (8)
  1. Wendkûuni C. Ouédraogo
  2. Kader Kaboré
  3. Haoye Tian
  4. Yewei Song
  5. Anil Koyuncu
  6. Jacques Klein
  7. David Lo
  8. Tegawendé F. Bissyandé
Citations (2)