DeepMutants: Training neural bug detectors with contextual mutations (2107.06657v1)

Published 14 Jul 2021 in cs.SE and cs.LG

Abstract: Learning-based bug detectors promise to find bugs in large code bases by exploiting natural hints such as names of variables and functions or comments. Still, existing techniques tend to underperform when presented with realistic bugs. We believe that bug detector learning currently suffers from a lack of realistic defective training examples. In fact, real-world bugs are scarce, which has driven existing methods to train on artificially created and mostly unrealistic mutants. In this work, we propose a novel contextual mutation operator which incorporates knowledge about the mutation context to dynamically inject natural and more realistic faults into code. Our approach employs a masked LLM to produce a context-dependent distribution over feasible token replacements. The evaluation shows that sampling from an LLM not only produces mutants which more accurately represent real bugs but also leads to better-performing bug detectors, both on artificial benchmarks and on real-world source code.
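
The sketch below illustrates the general idea of a contextual mutation operator: mask a single token in a piece of code and let a masked language model rank plausible, context-dependent replacements. This is a minimal illustration only, not the authors' implementation; the model choice (microsoft/codebert-base), the helper contextual_mutants, and the example snippet are assumptions for demonstration.

from transformers import pipeline

# Masked language model over code; "microsoft/codebert-base" is an
# assumption for illustration, not necessarily the model used in the paper.
fill_mask = pipeline("fill-mask", model="microsoft/codebert-base")

def contextual_mutants(code_line, target, top_k=5):
    # Mask one occurrence of `target` and let the model rank context-dependent
    # replacements; predictions that differ from the original token become
    # candidate mutants.
    masked = code_line.replace(target, fill_mask.tokenizer.mask_token, 1)
    predictions = fill_mask(masked, top_k=top_k)
    return [p["token_str"].strip() for p in predictions
            if p["token_str"].strip() != target]

# Example: propose realistic replacements for the comparison operand.
print(contextual_mutants("if len(items) > 0: return items[0]", "0"))

Sampling mutants from such a context-dependent distribution, rather than applying fixed syntactic mutation rules, is what the paper argues yields more realistic training data for bug detectors.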

Authors (2)
  1. Cedric Richter (5 papers)
  2. Heike Wehrheim (27 papers)
Citations (3)