PRIMG: Efficient LLM-driven Test Generation Using Mutant Prioritization (2505.05584v1)

Published 8 May 2025 in cs.SE and cs.LG

Abstract: Mutation testing is a widely recognized technique for assessing and enhancing the effectiveness of software test suites by introducing deliberate code mutations. However, its application often results in overly large test suites, as developers generate numerous tests to kill specific mutants, increasing computational overhead. This paper introduces PRIMG (Prioritization and Refinement Integrated Mutation-driven Generation), a novel framework for incremental and adaptive test case generation for Solidity smart contracts. PRIMG integrates two core components: a mutation prioritization module, which employs a machine learning model trained on mutant subsumption graphs to predict the usefulness of surviving mutants, and a test case generation module, which utilizes LLMs to generate and iteratively refine test cases to achieve syntactic and behavioral correctness. We evaluated PRIMG on real-world Solidity projects from Code4Arena to assess its effectiveness in improving mutation scores and generating high-quality test cases. The experimental results demonstrate that PRIMG significantly reduces test suite size while maintaining high mutation coverage. The prioritization module consistently outperformed random mutant selection, enabling the generation of high-impact tests with reduced computational effort. Furthermore, the refining process enhanced the correctness and utility of LLM-generated tests, addressing their inherent limitations in handling edge cases and complex program logic.

Summary

The paper introduces PRIMG, a framework that combines ML-based mutant prioritization with iterative LLM-driven test generation to improve test effectiveness for Solidity contracts.
The mutation prioritization module uses Test Completeness Advancement Probability to target high-impact mutants, reducing redundancy and computational overhead.
Experimental results on real-world Solidity projects show increased mutation scores and efficient, compact test suite generation.

Efficient LLM-driven Test Generation: The PRIMG Approach

Introduction

The paper presents PRIMG (Prioritization and Refinement Integrated Mutation-driven Generation), a framework for generating test cases for Solidity smart contracts more efficiently. It targets the computational overhead and redundancy of mutation testing by combining mutant prioritization with LLM-based generation and refinement of test cases.

PRIMG Framework Components

PRIMG integrates two core modules: mutation prioritization and test case generation. The mutation prioritization module predicts the usefulness of surviving mutants using a machine learning model trained on mutant subsumption graphs, so that developers can target high-impact mutants first. The test case generation module uses LLMs to produce tests and iteratively refine them until they are syntactically and behaviorally correct.

Mutation Prioritization Module

The module employs Test Completeness Advancement Probability (TCAP) to prioritize mutants based on their potential to reveal additional errors when targeted. TCAP leverages the structural information of Dynamic Mutant Subsumption Graphs (DMSGs) to predict a mutant's usefulness, enhancing test effectiveness and minimizing test suite redundancy.
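The subsumption relation behind a DMSG can be illustrated with a small sketch. It assumes a kill matrix is available as a mapping from mutant id to the set of tests that kill it, and uses the standard dynamic definition: mutant `a` subsumes mutant `b` when `a` is killed and every test that kills `a` also kills `b`. The function names, the example data, and the score values are hypothetical; the scores stand in for whatever usefulness the trained model predicts, and the summary does not give the exact TCAP formula.

```python
from typing import Dict, List, Set


def dominator_mutants(kills: Dict[str, Set[str]]) -> List[str]:
    """Return killed mutants that no other killed mutant strictly subsumes.

    `kills` maps a mutant id to the set of test ids that kill it.
    A mutant with a strictly smaller, non-empty kill set is harder to kill,
    so it strictly subsumes any mutant whose kill set contains its own.
    """
    # Unkilled mutants carry no dynamic subsumption information.
    killed = {m: frozenset(tests) for m, tests in kills.items() if tests}
    dominators = []
    for mutant, tests in killed.items():
        strictly_subsumed = any(
            other_tests < tests  # proper subset: the other mutant is harder
            for other, other_tests in killed.items()
            if other != mutant
        )
        if not strictly_subsumed:
            dominators.append(mutant)
    return sorted(dominators)


def rank_by_score(scores: Dict[str, float]) -> List[str]:
    """Order mutants by predicted usefulness, highest first (ties by id)."""
    return sorted(scores, key=lambda m: (-scores[m], m))


# Hypothetical kill matrix: m2 is killed whenever m1 is, so m1 subsumes m2.
example_kills = {
    "m1": {"t1"},
    "m2": {"t1", "t2"},
    "m3": {"t3"},
    "m4": set(),  # survives every test
}
example_dominators = dominator_mutants(example_kills)

# Hypothetical model outputs for surviving mutants.
example_ranking = rank_by_score({"m4": 0.2, "m7": 0.9, "m9": 0.5})
```

Killing the dominator mutants (`m1` and `m3` in this example) necessarily kills the mutants they subsume, which is why a prioritizer that favours them can keep the test suite small.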

Test Case Generation Module

The generation process begins by creating an initial prompt that includes the Program Under Test (PUT), the mutant code, and a reference test file with diverse testing patterns. This prompt guides the LLM in generating an initial unit test, which is then refined through a syntax and behavior verification loop to ensure correctness.
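The generate-and-refine loop described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the prompt layout, the `CheckResult` type, and the three callables (`ask_llm`, `check_syntax`, `check_behavior`) are hypothetical placeholders, not the authors' implementation. Behavioral correctness is assumed to mean that the test passes on the original contract and fails on the mutant. The default of five refinement rounds follows the iteration count reported in the evaluation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    ok: bool
    feedback: str = ""  # compiler or test-runner output to feed back to the LLM


def build_prompt(put_source: str, mutant_source: str, reference_test: str) -> str:
    """Assemble the initial prompt from the three inputs named in the text."""
    return (
        "Program under test:\n" + put_source
        + "\n\nMutant:\n" + mutant_source
        + "\n\nReference test file:\n" + reference_test
        + "\n\nWrite one unit test that passes on the program and fails on the mutant."
    )


def generate_test(
    put_source: str,
    mutant_source: str,
    reference_test: str,
    ask_llm: Callable[[str], str],
    check_syntax: Callable[[str], CheckResult],
    check_behavior: Callable[[str], CheckResult],
    max_refinements: int = 5,
) -> Optional[str]:
    """Return a test that passes both checks, or None if refinement runs out."""
    prompt = build_prompt(put_source, mutant_source, reference_test)
    candidate = ask_llm(prompt)
    for attempt in range(max_refinements + 1):
        result = check_syntax(candidate)
        if result.ok:
            # Only run the behavior check on tests that compile.
            result = check_behavior(candidate)
            if result.ok:
                return candidate
        if attempt == max_refinements:
            break
        # Ask again, showing the failed attempt and the tool feedback.
        candidate = ask_llm(
            prompt
            + "\n\nPrevious attempt:\n" + candidate
            + "\n\nProblem reported:\n" + result.feedback
        )
    return None
```

Passing the checks in as callables keeps the loop independent of any particular LLM client or test runner, so the same control flow can be exercised with stubs.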

Figure 1: The proposed approach for generating a test case using LLMs.

Experimental Evaluation

The evaluation of PRIMG was conducted on real-world Solidity projects from Code4Arena, demonstrating significant improvements in mutation scores while maintaining a compact test suite size. The effectiveness of PRIMG in reducing computational overhead and generating high-quality tests was assessed using the following criteria:

Test Case Correctness: The refining process significantly increased the syntactic and behavioral correctness rates of generated tests compared to single-shot prompts. The paper found a marked improvement when using a refining loop of five iterations, with no substantial gain observed beyond this point.
Performance of Prioritization Module: The prioritization module outperformed random mutant selection by consistently targeting high-impact mutants, which increased the number of killed mutants and reduced redundant testing effort.

Figure 2: Overview of the dataset labeling process.
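For reference, the mutation score used to judge these results is conventionally the fraction of non-equivalent mutants that the test suite kills. The sketch below uses that conventional definition; the function name and the numbers in the example are invented for illustration and are not results from the paper.

```python
def mutation_score(killed: int, total: int, equivalent: int = 0) -> float:
    """Fraction of non-equivalent mutants killed by the test suite."""
    scorable = total - equivalent
    if scorable <= 0:
        raise ValueError("need at least one non-equivalent mutant")
    if not 0 <= killed <= scorable:
        raise ValueError("killed must be between 0 and the number of scorable mutants")
    return killed / scorable


# Illustrative numbers only: 45 of 60 mutants killed.
example_score = mutation_score(killed=45, total=60)
```

Comparing this score against the number of tests in the suite gives a simple view of the trade-off the evaluation reports: a prioritized suite aims for a similar score with fewer tests.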

Implementation Details

The LLM used in this framework is a fine-tuned version of Llama 3.1 configured for effective test case generation and refinement. The mutation testing employed SuMo, a dedicated mutation testing tool for Solidity smart contracts, to generate and test mutants across various operators. Testing environments were established using the Hardhat framework and Ganache for deploying and executing contracts.
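As a rough illustration of how a generated test could be executed in such an environment, the sketch below shells out to the Hardhat CLI (`npx hardhat compile` and `npx hardhat test <file>`) from Python. The helper names, the example test path, and the timeout are hypothetical, and the sketch assumes a project directory where Hardhat is already installed; it does not cover SuMo or Ganache.

```python
import subprocess
from typing import List

# Compiles all contracts in the Hardhat project.
HARDHAT_COMPILE: List[str] = ["npx", "hardhat", "compile"]


def hardhat_test_command(test_file: str) -> List[str]:
    """Build the command that runs a single test file with Hardhat."""
    return ["npx", "hardhat", "test", test_file]


def run_in_project(
    command: List[str], project_dir: str, timeout_s: int = 600
) -> subprocess.CompletedProcess:
    """Run a command inside the project directory and capture its output."""
    return subprocess.run(
        command,
        cwd=project_dir,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,  # a failing test is an expected outcome, not an error
    )


def passed(result: subprocess.CompletedProcess) -> bool:
    """Treat a zero exit code as success."""
    return result.returncode == 0
```

A caller would typically run `HARDHAT_COMPILE` first and, if it succeeds, run `hardhat_test_command("test/Generated.test.js")` against both the original contract and the mutant, using the captured output as feedback for refinement.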

Developers running the framework need correctly configured testing and deployment environments. Training the prioritization model on project-specific data may also improve the precision of mutant prioritization.

Conclusion

PRIMG combines ML-based mutant prioritization with automated LLM-driven test generation and refinement. According to the reported results, this combination keeps test suites compact and reduces computational effort while maintaining high mutation coverage, and the refinement loop improves the correctness of the generated tests.

Figure 3: Number of test cases by trial and projects.
