Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
157 tokens/sec
GPT-4o
43 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Data Augmentation by Fuzzing for Neural Test Generation (2406.08665v2)

Published 12 Jun 2024 in cs.SE and cs.AI

Abstract: Testing is essential to modern software engineering for building reliable software. Given the high costs of manually creating test cases, automated test case generation, particularly methods utilizing LLMs, has become increasingly popular. These neural approaches generate semantically meaningful tests that are more maintainable compared with traditional automatic testing methods like fuzzing. However, the diversity and volume of unit tests in current datasets are limited. In this paper, we introduce a novel data augmentation technique, FuzzAug, that introduces the benefits of fuzzing to LLMs to preserve valid program semantics and provide diverse inputs. This enhances the model's ability to embed correct inputs that can explore more branches of the function under test. Our evaluations show that models trained with dataset augmented by FuzzAug increase assertion accuracy by 5%, improve compilation rate by more than 10%, and generate unit test functions with 5% more branch coverage. This technique demonstrates the potential of using dynamic software testing to improve neural test generation, offering significant enhancements in neural test generation.

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com