NARRepair Model – Non-Autoregressive APR
- The paper demonstrates that NARRepair’s non-autoregressive approach significantly reduces inference latency while maintaining state-of-the-art patch fidelity across multiple benchmarks.
- The model integrates a repair action predictor and AST-based dependency extraction to ensure syntactic and semantic consistency in code repairs.
- It employs a two-stage decoding process to iteratively refine low-confidence token predictions, enhancing accuracy and practical repair performance.
The NARRepair Model is a non-autoregressive deep learning framework custom-built for automatic program repair (APR). Unlike conventional autoregressive approaches that generate patches token-by-token in sequence, NARRepair constructs program repairs in parallel, thereby achieving considerable reductions in inference latency while delivering state-of-the-art repair accuracy. The architecture integrates a repair action predictor, an inter-token dependency extractor based on source code ASTs, and a specialized two-stage decoder to address inherent challenges in parallel code generation. NARRepair has been validated across multiple APR benchmarks, demonstrating substantial speed advantages and high patch fidelity compared to both autoregressive and LLM-based systems (Yang et al., 2 Oct 2025, Yang et al., 24 Jun 2024).
1. Architectural Innovations
NARRepair's design centers on three core modules:
1. Repair Action Predictor
Rather than directly modeling the final code token at each position—a high-dimensional, error-prone task—NARRepair predicts a repair action for every token in the buggy code. The repair actions are classified as “keep” (retain the token), “replace” (substitute with a new token), “insert” (add tokens), and “delete” (remove the token). This prediction occurs jointly with the expected repair length for each location (e.g., “keep” and “replace” actions typically receive length 1, “delete” receives length 0). The predictor leverages a convolutional network atop encoded token features, followed by fully connected layers. Formally, for encoder features $h_i$ of token $i$,

$$a_i = \mathrm{softmax}\big(\mathrm{FC}_a(\mathrm{Conv}(h_i))\big), \qquad l_i = \mathrm{softmax}\big(\mathrm{FC}_l(\mathrm{Conv}(h_i))\big).$$

The categorical (cross-entropy) losses for training are

$$\mathcal{L}_{\mathrm{act}} = -\sum_i \log P(a_i \mid X;\, \theta), \qquad \mathcal{L}_{\mathrm{len}} = -\sum_i \log P(l_i \mid X;\, \theta),$$

where $X$ is the buggy input sequence and $\theta$ denotes the model parameters.
By reducing prediction space to four discrete actions and associated lengths, NARRepair maximally preserves correct segments and sharply limits unnecessary modifications.
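To make the action and length prediction concrete, below is a minimal PyTorch-style sketch of such a predictor head; the class name `ActionPredictor`, the layer sizes, the length cap, and the single 1-D convolution are illustrative assumptions, not the paper's reference implementation.

```python
# Minimal sketch of a repair-action predictor head (illustrative; layer sizes,
# names, and the 1-D convolution configuration are assumptions, not the
# reference implementation).
import torch
import torch.nn as nn

NUM_ACTIONS = 4   # keep / replace / insert / delete
MAX_LEN = 8       # assumed cap on the predicted repair length per position

class ActionPredictor(nn.Module):
    def __init__(self, d_model: int = 512):
        super().__init__()
        # 1-D convolution over the sequence mixes local token context
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)
        self.action_head = nn.Linear(d_model, NUM_ACTIONS)
        self.length_head = nn.Linear(d_model, MAX_LEN + 1)  # lengths 0..MAX_LEN

    def forward(self, h: torch.Tensor):
        # h: (batch, seq_len, d_model) encoder features of the buggy code
        x = self.conv(h.transpose(1, 2)).transpose(1, 2).relu()
        return self.action_head(x), self.length_head(x)

# Usage: cross-entropy over the two heads gives the action and length losses
h = torch.randn(2, 16, 512)                      # dummy encoder output
action_logits, length_logits = ActionPredictor()(h)
```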
2. Inter-Token Dependency Extraction
Traditional non-autoregressive decoders lack explicit modeling of sequential dependencies, a critical weakness for syntax-sensitive domains like program repair. NARRepair augments token-level representations with structural dependency information derived from the program's Abstract Syntax Tree (AST). Using a program analysis tool (e.g., Tree-sitter), the nearest common parent node in the AST for every token pair is encoded as a dependency matrix. This structural information is integrated into token features using cross-attention:

$$\tilde{H} = \mathrm{softmax}\!\left(\frac{(H W_Q)(D W_K)^{\top}}{\sqrt{d_k}}\right)(D W_V),$$

where $H$ denotes the token representations, $D$ the embedded AST dependency matrix, and $W_Q$, $W_K$, $W_V$ learned projection matrices.
This mechanism introduces AST-derived dependencies into the representations powering both action prediction and decoding, substantially enhancing syntactic and semantic consistency.
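The following self-contained sketch shows how a pairwise dependency matrix can be derived from an AST by looking up the nearest common parent of each token pair; the `Node` class and the use of node labels as dependency features are simplifying assumptions standing in for a Tree-sitter-based pipeline.

```python
# Sketch: build a pairwise dependency matrix from an AST by looking up the
# nearest common ancestor of every token pair. The Node class and the use of
# ancestor labels as dependency features are simplifying assumptions.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

def ancestors(node: Node) -> List[Node]:
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path  # node ... root

def nearest_common_ancestor(a: Node, b: Node) -> Node:
    seen = {id(n) for n in ancestors(a)}
    for n in ancestors(b):
        if id(n) in seen:
            return n
    raise ValueError("nodes are not in the same tree")

def dependency_matrix(leaves: List[Node]) -> List[List[str]]:
    # Entry (i, j) holds the label of the nearest common AST parent of
    # tokens i and j; a real system would map labels to embedding ids.
    n = len(leaves)
    return [[nearest_common_ancestor(leaves[i], leaves[j]).label
             for j in range(n)] for i in range(n)]

# Tiny example: `x = y + 1` -> assignment(x, binary_operator(y, 1))
root = Node("assignment")
x = root.add(Node("identifier:x"))
op = root.add(Node("binary_operator"))
y = op.add(Node("identifier:y"))
one = op.add(Node("literal:1"))
print(dependency_matrix([x, y, one]))
```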
3. Two-Stage Decoding
To alleviate deficits in contextual representation intrinsic to parallel decoding, NARRepair employs a masked two-stage decoder:
- Stage 1: All tokens are decoded in parallel, producing a patch and associated confidence scores.
- Stage 2: Low-confidence tokens (either by prediction probability or repair action) are masked, and the decoder is rerun in a masked language-model fashion to refine uncertain regions using available context.
Let $P^{(1)}$ denote the token probabilities from the first, fully parallel pass. The mask function $\mathrm{Mask}(\cdot)$ replaces low-confidence entries with a [MASK] token, and the second pass computes $P^{(2)} = \mathrm{Decoder}\big(\mathrm{Mask}(\hat{Y}^{(1)}), H\big)$. The final loss used in training is

$$\mathcal{L} = \mathcal{L}_{\mathrm{dec}} + \alpha\,\mathcal{L}_{\mathrm{act}} + \beta\,\mathcal{L}_{\mathrm{len}},$$

where $\alpha$ and $\beta$ are module-balancing coefficients (typically 0.1).
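The two-pass refinement can be sketched as follows; the decoder interface, `MASK_ID`, and the 0.9 confidence threshold are illustrative assumptions rather than values from the paper.

```python
# Schematic two-stage decoding: a fully parallel first pass, then masking of
# low-confidence positions and a second, masked-LM-style refinement pass.
# The decoder callable, MASK_ID, and the threshold are assumptions.
import torch

MASK_ID = 0          # assumed id of the [MASK] token
CONF_THRESHOLD = 0.9 # positions below this confidence get re-predicted

def two_stage_decode(decoder, encoder_states: torch.Tensor,
                     draft_ids: torch.Tensor) -> torch.Tensor:
    # Stage 1: fully parallel pass over the initial draft
    logits_1 = decoder(draft_ids, encoder_states)        # (B, T, vocab)
    probs_1 = logits_1.softmax(dim=-1)
    conf, tokens_1 = probs_1.max(dim=-1)                 # per-token confidence

    # Mask low-confidence predictions with [MASK]
    masked = torch.where(conf < CONF_THRESHOLD,
                         torch.full_like(tokens_1, MASK_ID), tokens_1)

    # Stage 2: refine masked positions using the surviving context
    logits_2 = decoder(masked, encoder_states)
    tokens_2 = logits_2.argmax(dim=-1)

    # Keep confident tokens from pass 1, take pass-2 tokens elsewhere
    return torch.where(conf < CONF_THRESHOLD, tokens_2, tokens_1)
```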
2. Integration of Non-Autoregressive Generation and APR Constraints
In contrast to autoregressive models that generate target patches token-by-token in left-to-right order (with latent representations accumulating sequential context), NARRepair's non-autoregressive approach supports fully parallel generation. This is only feasible through the action prediction strategy, dependency injection from AST structures, and subsequent contextual refinement. Such integration is required because program repair tasks are extremely sensitive to syntactic correctness, semantic preservation, and localized patching. Each module is architected to directly address characteristic deficits of naive fully-parallel generative models.
3. Empirical Evaluation
NARRepair has been thoroughly evaluated on three benchmark datasets:
| Dataset | # Bugs | Correct fixes: NARRepair (3-min limit) | Correct fixes: AR-based SOTA | GPU Speedup |
|---|---|---|---|---|
| Defects4J v1.2 | 395 | Up to 35 | Lower | 1.4–6.4× |
| Defects4J v2.0 | 420+ | Higher | Lower | |
| QuixBugs | 40 | Higher | Lower | |
NARRepair outperforms baselines—including autoregressive and sequence-to-sequence LLM-based systems—in both repair speed and patch success rate under realistic time constraints (Yang et al., 2 Oct 2025, Yang et al., 24 Jun 2024). Its accuracy remains robust under imperfect fault localization and short inference intervals, a critical requirement in practical deployment settings.
4. Impact on Program Repair Research and Practice
The introduction of NARRepair addresses previously limiting trade-offs in APR research between patch quality and inference latency. The approach enables rapid, high-quality bug fixing suitable for latency-sensitive applications such as interactive development environments, CI/CD pipelines, and embedded system maintenance. The modular innovation—action-based patching, AST-driven dependency representation, and staged decoding—sets a new direction for integrating explicit syntactic knowledge and flexible generative architectures in program repair. Speed improvements of 1.4–6.4× are typical, with evidence of even greater gains on larger models and hardware acceleration (Yang et al., 2 Oct 2025).
5. Relationship to Related Model Repair Frameworks
NARRepair differs fundamentally from methods such as static analysis-driven repair, graph transformation approaches, or neuron-level LM editing. The focus is on non-autoregressive generative paradigms within the APR domain. While abstraction-driven model repair (e.g., KMTS/CTL repair) (Chatzieleftheriou et al., 2015) and rule-based graph repair (Sandmann et al., 2019) offer guarantees and systematic synthesis for generalized system models, they do not address the specific computational and accuracy constraints of code patch synthesis via deep learning, nor do they scale to real-time patch generation tasks. NARRepair’s use of ASTs as dependency scaffolding is tailored for syntactic programming languages and is orthogonal to approaches centered on semantic subtyping or symbolic reasoning.
6. Future Directions
The core innovations of NARRepair—parallelized patch synthesis via action prediction, AST-guided dependency extraction, and iterative context refinement—suggest several research opportunities. These include extending the framework for multi-hunk and cross-file repairs, integrating upstream fault localization signals, and merging with semantics-based neuron editing or symbolic verification for correctness guarantees. Additionally, the approach may generalize to repair tasks in compiled, low-level, or dynamically typed languages where localized patching is essential.
7. Key Formula Table
| Component | Formula / Expression | Description |
|---|---|---|
| Encoder | $H = \mathrm{Encoder}(X)$ | Token encoding |
| Repair Action Prediction | $a_i = \mathrm{softmax}(\mathrm{FC}_a(\mathrm{Conv}(h_i)))$, $\;l_i = \mathrm{softmax}(\mathrm{FC}_l(\mathrm{Conv}(h_i)))$ | Predicting repair action and length |
| Dependency Extraction | $\tilde{H} = \mathrm{softmax}\big((H W_Q)(D W_K)^{\top}/\sqrt{d_k}\big)(D W_V)$ | AST-based inter-token dependency |
| Two-Stage Decoding | $P^{(2)} = \mathrm{Decoder}\big(\mathrm{Mask}(\hat{Y}^{(1)}), H\big)$ | Parallel preliminary pass + masked contextual pass |
| Loss | $\mathcal{L} = \mathcal{L}_{\mathrm{dec}} + \alpha\,\mathcal{L}_{\mathrm{act}} + \beta\,\mathcal{L}_{\mathrm{len}}$ | Joint module loss with balancing coefficients |
These architectural components collectively deliver state-of-the-art repair accuracy with substantially reduced inference latency in APR.
References
- "Towards Speeding up Program Repair with Non-Autoregressive Model" (Yang et al., 2 Oct 2025)
- "NARRepair: Non-Autoregressive Code Generation Model for Automatic Program Repair" (Yang et al., 24 Jun 2024)