Automated Variable Name Repair
- Variable name repair is the process of automatically recovering or generating meaningful identifiers in code, essential for clarity and maintenance.
- Techniques span heuristic static analysis, n-gram models, and advanced transformer-based neural architectures to predict suitable variable names.
- Empirical studies show that tailored models and reranking methods significantly improve exact and partial match rates, enhancing developer productivity.
Variable name repair is the automated task of recovering or generating meaningful variable identifiers in program source code where such names are missing, ambiguous, generic, or replaced by placeholders. The problem is central in software engineering because expressive names matter for code comprehension, maintenance, tool support, and downstream machine learning models. The task spans multiple settings, including minified or obfuscated code, code refactoring, automated program repair, and decompilation, and has motivated a diverse range of techniques, from statistical language models and heuristic static analysis to transformer-based neural architectures and search-based inference with reranking.
1. Task Formulations and Evaluation Paradigms
Variable name repair is typically posed as follows: a code fragment is presented with one or more variable names missing or replaced (e.g., all uses of a target variable x substituted with a placeholder token <ID₁> or [MASK]), and the system must predict a suitable replacement. The correctness of a repair is defined in several ways:
- Exact Match (EM): The predicted name exactly matches the original or developer-chosen identifier.
- Top-k Hit: The reference identifier appears among the k most probable candidates.
- Partial Match (PM): Embedding similarity or token-level overlap credits near-synonyms and variants (e.g., jsonValue vs. json).
- Developer Ground Truth: Some studies rely on identifiers created during real development (from rename commits or code reviews) rather than just lexical equivalence.
For example, "Neural Variable Name Repair" defines the replacement task at function level: given a C++ function with a single identifier masked, generate a natural, descriptive name using only the code context (Yousuf et al., 30 Nov 2025). Other settings focus on minified JavaScript (Bavishi et al., 2018), variable-misuse bugs (Vasic et al., 2019), and decompiled binaries where no semantic names survive compilation (Banerjee et al., 2021, Xu et al., 2023).
2. Data Construction and Problem Scenarios
Empirical studies demonstrate that low-quality identifiers are widespread and detrimental to code comprehension. Datasets are created by mining large, real-world code corpora and employing the following strategies:
- AST-guided Masking: Parsers such as Tree-sitter locate local and parameter variables, and each occurrence is systematically replaced by a placeholder (Yousuf et al., 30 Nov 2025); see the sketch below.
- Minification and Obfuscation: All original names in minified or obfuscated code are replaced systematically, providing a controlled environment for name recovery (Bavishi et al., 2018).
- Refactoring Logs: True variable names are extracted from rename refactoring commits or code review changes (Wang et al., 1 Jul 2025, Mastropaolo et al., 2022).
- Decompilation: Aligning original source-level identifiers with decompiler output, using large-scale binaries with debug info (Banerjee et al., 2021, Xu et al., 2023).
These datasets may contain hundreds of thousands to millions of code examples, spanning multiple programming languages and code ecosystems.
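As an illustration of AST-guided masking (the first strategy above), the following sketch uses Python's built-in ast module to rewrite every occurrence of one variable to a placeholder; the cited work applies the same idea to C++ functions via Tree-sitter, and the placeholder name here is chosen only to keep the output parseable:

```python
import ast

class Masker(ast.NodeTransformer):
    """Rewrite every occurrence of a target variable to a placeholder."""
    def __init__(self, target: str, placeholder: str):
        self.target, self.placeholder = target, placeholder

    def visit_Name(self, node: ast.Name) -> ast.AST:
        if node.id == self.target:
            node.id = self.placeholder
        return node

def mask_variable(source: str, target: str, placeholder: str = "MASKED_ID") -> str:
    tree = Masker(target, placeholder).visit(ast.parse(source))
    return ast.unparse(tree)   # masked code; the model must recover the name

src = "def area(w, h):\n    result = w * h\n    return result"
print(mask_variable(src, "result"))
# def area(w, h):
#     MASKED_ID = w * h
#     return MASKED_ID
```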
3. Modeling Approaches
3.1 Statistical and Heuristic Methods
Early approaches employ n-gram language models with local caches and backoff smoothing, capitalizing on token-level regularities in code (Mastropaolo et al., 2022). Static analysis and pattern mining (as in VarNamer) use context extraction (e.g., initialization expressions, data types, homogeneity) and association-rule mining via FP-growth to recommend names based on code structure and project-wide conventions (Wang et al., 1 Jul 2025). These methods are lightweight and integrate into IDE workflows at low computational cost, but their vocabulary coverage and compositionality are limited compared to neural models.
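A minimal flavor of the statistical approach: a bigram scorer with unigram interpolation, assuming pre-tokenized code and arbitrarily chosen mixing weights (real systems add local caches and proper backoff smoothing):

```python
from collections import Counter, defaultdict

class BigramNamer:
    """Toy bigram model that suggests the token following a given context token."""
    def __init__(self):
        self.bigrams = defaultdict(Counter)
        self.unigrams = Counter()

    def train(self, token_streams):
        for toks in token_streams:
            self.unigrams.update(toks)
            for prev, cur in zip(toks, toks[1:]):
                self.bigrams[prev][cur] += 1

    def score(self, prev: str, cand: str) -> float:
        total = sum(self.bigrams[prev].values())
        p_bigram = self.bigrams[prev][cand] / total if total else 0.0
        p_unigram = self.unigrams[cand] / max(sum(self.unigrams.values()), 1)
        return 0.7 * p_bigram + 0.3 * p_unigram   # ad hoc interpolation weights

    def suggest(self, prev: str, k: int = 3) -> list[str]:
        return sorted(self.unigrams, key=lambda c: -self.score(prev, c))[:k]

model = BigramNamer()
model.train([["for", "i", "in", "range"], ["for", "idx", "in", "items"]])
print(model.suggest("for", k=2))   # ['i', 'idx'] -- both observed after "for"
```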
3.2 Deep Learning and Transformer Architectures
Neural models for variable name repair leverage both sequence and context. Canonical designs include:
- Encoder-Decoder Models: T5 (Mastropaolo et al., 2022) and LLMs such as Llama 3.1-8B (Yousuf et al., 30 Nov 2025) generate identifier suggestions conditioned on masked source code.
- Contextual Embedding & Reranking: Dual-encoder systems embed (a) the code context and (b) candidate identifiers, score the fit via cosine similarity trained with a contrastive loss, and augment generative models with reranking for higher selection quality (Yousuf et al., 30 Nov 2025); a minimal reranking sketch follows this list.
- Pointer Networks: In variable-misuse repair, multi-headed pointer architectures localize misuse and select replacement variables by attending over token positions (Vasic et al., 2019).
- Mask-Prediction with Subwords: Techniques such as VarBERT combine byte-pair encoding with constrained masked language modeling (CMLM), enabling open-vocabulary name generation by predicting subword sequences for each masked variable slot (Banerjee et al., 2021).
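To illustrate the reranking step from the dual-encoder design above, here is a minimal cosine-similarity reranker over precomputed vectors; the trained context and name encoders are assumed, and the toy embeddings below merely stand in for their outputs:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rerank(context_emb: np.ndarray, candidates: list[str],
           cand_embs: list[np.ndarray], k: int = 5) -> list[str]:
    """Order generated candidate names by embedding fit with the code context."""
    scored = sorted(zip(candidates, cand_embs),
                    key=lambda pair: cosine(context_emb, pair[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

ctx = np.array([1.0, 0.0, 0.2])                       # context embedding (toy)
names = ["jsonValue", "tmp", "result"]
embs = [np.array([0.9, 0.1, 0.3]),
        np.array([0.0, 1.0, 0.0]),
        np.array([0.5, 0.5, 0.1])]
print(rerank(ctx, names, embs, k=2))                  # ['jsonValue', 'result']
```

In the cited system the two encoders are trained jointly with a contrastive loss so that true (context, name) pairs score higher than mismatched ones.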
Some models incorporate external information for improved disambiguation—such as propagating predicted names from callers and callees in decompiled code (GenNm), or aligning output distributions with empirical developer naming patterns through KL-divergence regularization (Xu et al., 2023).
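As a sketch of the distribution-alignment idea (not GenNm's exact formulation), a KL term can penalize divergence between the model's distribution over names and an empirical developer-naming distribution:

```python
import numpy as np

def kl_alignment_penalty(model_probs: np.ndarray, dev_probs: np.ndarray,
                         eps: float = 1e-9) -> float:
    """KL(model || developer) over a shared name vocabulary, to be added to
    the task loss with some weight; the paper's regularizer may differ in
    direction and weighting."""
    p = np.clip(model_probs, eps, None); p /= p.sum()
    q = np.clip(dev_probs, eps, None); q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

# Model over-weights a generic name ("tmp") relative to developer usage.
print(kl_alignment_penalty(np.array([0.7, 0.2, 0.1]),    # model
                           np.array([0.2, 0.5, 0.3])))   # developers; ~0.58
```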
3.3 Static, Hybrid, and Data-Mining Recipes
VarNamer combines static analysis filters (homogeneous variables, structural and literal context similarity), data-mined association rules for naming, and a selection process that validates candidates by context congruence. This pipeline demonstrates strong gains over standard IDE heuristics, with approaches generalizing to languages beyond Java (notably C++) (Wang et al., 1 Jul 2025).
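A toy stand-in for the mining stage: counting how often a context feature co-occurs with a chosen name and keeping high-confidence rules (VarNamer uses FP-growth over richer contexts; the feature strings here are invented for illustration):

```python
from collections import Counter

def mine_naming_rules(examples, min_support=2, min_conf=0.6):
    """examples: (context_features, chosen_name) pairs. Returns the most
    confident name rule per feature that clears both thresholds."""
    feat_counts, pair_counts = Counter(), Counter()
    for feats, name in examples:
        for f in feats:
            feat_counts[f] += 1
            pair_counts[(f, name)] += 1
    rules = {}
    for (f, name), n in pair_counts.items():
        conf = n / feat_counts[f]
        if n >= min_support and conf >= min_conf:
            if f not in rules or conf > rules[f][1]:
                rules[f] = (name, conf)
    return rules

data = [(("type:int", "init:size()"), "size"),
        (("type:int", "init:size()"), "size"),
        (("type:int", "init:0"), "count")]
print(mine_naming_rules(data))
# {'type:int': ('size', 0.666...), 'init:size()': ('size', 1.0)}
```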
4. Experimental Results and Key Findings
Quantitative experiments consistently show that specialized neural architectures outperform statistical or naive approaches. Selected results:
| Approach | Exact Match (%) | Partial Match (%) | Notes |
|---|---|---|---|
| Zero-shot Llama 3.1-8B (Yousuf et al., 30 Nov 2025) | 6.1 | 37.4 | C++, 200 ex. |
| LoRA-tuned Llama + rerank | 46.0 | 84.5 | C++, 200 ex. |
| Context2Name (Bavishi et al., 2018) | 47.5 | — | JS minified |
| VarBERT-Base (Banerjee et al., 2021) | 86.4 | — | C, binaries |
| GenNm Llama-34B (Xu et al., 2023) | 61.2* | — | C, unseen split |
| CugLM (Mastropaolo et al., 2022) | 63.5 | — | Java, 400k ex. |
| VarNamer (Wang et al., 1 Jul 2025) | 41.5 | — | Java, refactoring |
| Eclipse (baseline) | 27.2 | — | Java, refactoring |
* "Not-in-train" (unseen) split.
Key findings from these studies:
- Task-specific fine-tuning, adapters (e.g., LoRA), and reranking consistently improve exact-match and partial-match metrics over zero-shot LLMs and plain prompting.
- Static analysis plus mined naming conventions yields large precision gains in IDE-centric variable repair and substantially reduces time and edits required in user studies (Wang et al., 1 Jul 2025).
- Subword modeling and post-hoc length search address open-vocabulary identifier generation, critical for minified, obfuscated, and decompiled scenarios (Banerjee et al., 2021); a sketch follows this list.
- Context injection and output distribution alignment (GenNm) further improve generalization to unseen code bodies and reduce spurious or bias-prone name generation (Xu et al., 2023).
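The post-hoc length search mentioned above can be sketched as follows; the scoring callback, which would run a masked LM with a given number of subword slots, is assumed (its signature is invented, and VarBERT's internals differ):

```python
def search_name_length(score_slots, max_pieces: int = 4) -> str:
    """score_slots(k) is assumed to return (decoded_name, total_log_prob)
    for a prediction with k masked subword slots. Length-normalizing the
    log-probability avoids a systematic bias toward one-piece names."""
    best_name, best_score = None, float("-inf")
    for k in range(1, max_pieces + 1):
        name, log_prob = score_slots(k)
        if log_prob / k > best_score:
            best_name, best_score = name, log_prob / k
    return best_name

# Toy callback standing in for a masked-LM decode.
fake = {1: ("buf", -2.0), 2: ("bufLen", -3.0), 3: ("bufLenMax", -7.5)}
print(search_name_length(lambda k: fake[k], max_pieces=3))   # bufLen (-1.5/slot)
```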
5. Limitations and Failure Modes
Despite substantial progress, several failure patterns recur:
- Semantic Ambiguity: Contexts with ambiguous semantics (e.g., generic loop indices) depress exact-match rates.
- Overfitting: Over-specialization to patterns in training corpora can misfire on out-of-distribution samples (Yousuf et al., 30 Nov 2025).
- Vocabulary Gaps: Neural models with fixed vocabularies or capped output dictionaries struggle with rare or compositional identifiers (Mastropaolo et al., 2022).
- Scope and Local Uniqueness: Some approaches may predict names already assigned elsewhere in the method, leading to collisions (Mastropaolo et al., 2022).
- Token-Count Mismatch: Subword-based models struggle to concatenate the correct number of pieces to form multi-token names (Banerjee et al., 2021).
- Decompiled Code Bias: Recovery performance is affected by the decompiler used and the alignment of debug symbols; generalization to other architectures or heavily obfuscated binaries is not guaranteed (Banerjee et al., 2021, Xu et al., 2023).
6. Practical Impact and Integration
Automated variable name repair has demonstrated clear benefits for code readability, comprehension, and software maintenance:
- Code assistants and IDE plugins (e.g., VS Code extensions, VarNamer in Eclipse) integrate these methods to provide real-time or refactoring-aware name suggestions, significantly improving developer productivity (27.8% speedup, 49.3% fewer edits) (Wang et al., 1 Jul 2025).
- Name repair is essential for deobfuscating code—from minification reversal in JavaScript (Bavishi et al., 2018) to restoration in security-critical decompiled binaries (Banerjee et al., 2021, Xu et al., 2023).
- Empirical studies on code review and refactoring logs show that current recommendations are often misaligned with developer preferences, motivating hybrid static + data-mining approaches (Wang et al., 1 Jul 2025, Mastropaolo et al., 2022).
- Neural rerankers and LLM adapters are resource-efficient and can be deployed in lightweight form for large corpora, benefiting automated analysis, summarization, and vulnerability detection (Yousuf et al., 30 Nov 2025, Xu et al., 2023).
Emerging directions include multi-identifier renaming, graph-based reranking, project- or language-specific convention modeling, program-wide context integration, and robust cross-language generalization.
7. Outlook and Future Research
Current research underscores several promising avenues:
- Multi-token, Multi-variable Contexts: Expanding repair beyond a single identifier or name slot to the simultaneous discovery and renaming of multiple variables.
- Graph Neural Architectures: Enhanced flow- and type-aware models that can better capture semantic relationships within and across functions (Vasic et al., 2019).
- User-in-the-loop Systems: Integrating suggestions in interactive development environments to capture realistic usage patterns and incremental improvements (Wang et al., 1 Jul 2025, Mastropaolo et al., 2022).
- Cross-Language and Multilingual Modeling: Adapting repair systems to C++, Rust, and other ecosystems, requiring specialized frontends for AST and data-flow extraction (Wang et al., 1 Jul 2025, Xu et al., 2023).
- Contextual Calibration and Uncertainty Estimation: Leveraging model confidence to focus automation on high-precision scenarios, while involving humans in ambiguous cases (Mastropaolo et al., 2022).
Variable name repair remains a crucial component in automated code understanding pipelines, bridging human-centric naming conventions with large-scale, data-driven software systems and LLMs.