SRI-Coder Series: Single-Pass Code Infilling
- SRI-Coder Series is a family of large language models that use a search-and-replace infilling mechanism to generate explicit ‘search’ and ‘replace’ blocks in a single pass.
- The series leverages a curated 200K-example dataset from The Stack v2 with patch-style supervision, providing balanced, context-aware training for practical code editing.
- The models deliver significant improvements in accuracy and efficiency with reduced inference latency and enhanced safety compared to traditional Fill-in-the-Middle approaches.
The SRI-Coder Series designates a family of LLMs for source code infilling and editing, based on the Search-and-Replace Infilling (SRI) framework. SRI-Coder models advance beyond the prevalent Fill-in-the-Middle (FIM) paradigm by operationalizing a single-pass, context-aware verification-and-editing process. This is enabled by a structured output of explicit “search” and “replace” blocks, harmonizing instruction-following capabilities of chat-oriented models with the efficiency and practicality required in automated code completion and assisted development tasks (Zhang et al., 19 Jan 2026).
1. Formalization of Search-and-Replace Infilling
Classic Fill-in-the-Middle (FIM) infilling generates a missing code segment $m$ by maximizing its conditional likelihood given the prefix $p$ and suffix $s$:

$$\hat{m} = \arg\max_{m} P_\theta(m \mid p, s).$$
SRI recasts infilling as an integrated “search” and “replace” operation executed in a single generative pass. Given context $(p, s)$, the model emits a sequence of the form:

$$y = \texttt{<<<<<<< SEARCH}\;\; b_{\text{search}} \;\; \texttt{=======} \;\; b_{\text{replace}} \;\; \texttt{>>>>>>> REPLACE},$$

where $b_{\text{search}}$ designates the lines to be edited (the “search block”) and $b_{\text{replace}}$ is the corrected or completed code (the “replacement block”). The formal objective is:

$$\hat{y} = \arg\max_{y} P_\theta(b_{\text{search}}, b_{\text{replace}} \mid p, s),$$

implicitly reducible to a single $\arg\max$ over the joint output sequence. This explicit diff-style formalism enables SRI-Coder models to identify and correct context-sensitive errors while maintaining the efficiency of single-pass inference (Zhang et al., 19 Jan 2026).
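To make the emitted format concrete, the following sketch parses a hypothetical single-pass emission into its search and replace blocks. The delimiter strings follow the paper's conventions; the parser itself (`parse_diff`) and the toy emission are illustrative, not the authors' implementation:

```python
import re

# Delimiters as specified in the SRI output format.
SEARCH_TAG, SEP_TAG, REPLACE_TAG = "<<<<<<< SEARCH", "=======", ">>>>>>> REPLACE"

def parse_diff(diff_output: str) -> tuple[str, str]:
    """Split a single-pass SRI emission into (search_block, replace_block)."""
    pattern = re.compile(
        re.escape(SEARCH_TAG) + r"\n(.*?)\n" + re.escape(SEP_TAG)
        + r"\n(.*?)\n" + re.escape(REPLACE_TAG),
        re.DOTALL,
    )
    m = pattern.search(diff_output)
    if m is None:
        raise ValueError("model output is not a well-formed SRI diff")
    return m.group(1), m.group(2)

# Hypothetical model emission for a small buggy snippet.
emission = (
    "<<<<<<< SEARCH\n"
    "    return a - b  # bug: should add\n"
    "=======\n"
    "    return a + b\n"
    ">>>>>>> REPLACE"
)
search, replace = parse_diff(emission)
```

Non-greedy matching with `re.DOTALL` keeps the parse robust to multi-line blocks while stopping at the first separator.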
2. Dataset Construction and Patch-Style Supervision
The SRI-200K dataset underpins SRI-Coder pretraining. It is derived from The Stack v2 via tree-sitter, which extracts logical AST blocks such as function bodies, loops, single lines, and random spans. Curation yields 200,000 examples with a balanced 2:1:1:1 distribution across function bodies, multi-line blocks, random spans, and single lines. Fine-tuning uses a 20,000-sample subset drawn from high-star repositories, with the remainder employed for ablation studies.
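The 2:1:1:1 curation ratio translates directly into per-category example counts for the 200K budget; a minimal arithmetic sketch (the helper below is hypothetical, not part of any released tooling):

```python
def category_counts(total: int, weights: dict[str, int]) -> dict[str, int]:
    """Split a dataset budget across block categories by integer weights."""
    denom = sum(weights.values())
    return {name: total * w // denom for name, w in weights.items()}

# Weights reflect the 2:1:1:1 mix described for SRI-200K.
weights = {"function_body": 2, "multi_line_block": 1, "random_span": 1, "single_line": 1}
counts = category_counts(200_000, weights)
# Function bodies receive 2/5 of the budget (80,000 examples);
# each remaining category receives 1/5 (40,000 examples).
```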
Each sample consists of a code file featuring a single `/* MIDDLE CODE TO COMPLETE */` marker. The label is a unified patch:

```
<<<<<<< SEARCH
// 10 lines of context including the marker
=======
// the same 10 lines with the ground-truth code filled in
>>>>>>> REPLACE
```
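A label of this shape can be assembled mechanically from the marked context window and the ground-truth middle. The sketch below assumes the search block is the window containing the marker and the replace block substitutes the ground truth in the marker's place; the helper name is illustrative:

```python
MARKER = "/* MIDDLE CODE TO COMPLETE */"

def make_patch_label(context_lines: list[str], ground_truth: str) -> str:
    """Build a unified SEARCH/REPLACE patch from the marked context window.

    `context_lines` is the line window containing MARKER; `ground_truth`
    is the code that should stand in the marker's place.
    """
    search = "\n".join(context_lines)
    replace = "\n".join(
        ground_truth if line.strip() == MARKER else line for line in context_lines
    )
    return f"<<<<<<< SEARCH\n{search}\n=======\n{replace}\n>>>>>>> REPLACE"

window = ["def add(a, b):", f"    {MARKER}", ""]
label = make_patch_label(window, "    return a + b")
```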
3. Architectural Design and Training Procedure
SRI-Coder builds on Qwen2.5-Coder and Qwen3-Coder checkpoints, covering a wide parameter range: 0.5B, 1.5B, 3B, 7B, 14B, and 32B (Qwen2.5-Coder), plus 30B, 235B, and 480B (Qwen3-Coder). No additional layers or adapters are introduced; the Transformer backbone with SwiGLU activations and RMSNorm remains unchanged. The tokenizer is inherited from Qwen-Coder.
Key training features include:
- Data mixture: 20,000 SRI patches + 60,000 general code-assistant instructions (Glaive-Code-Assistant) + 100 safety alignments.
- Instruction-tuning by minimizing the standard cross-entropy loss:

$$\mathcal{L}(\theta) = -\sum_{t=1}^{|y|} \log P_\theta(y_t \mid y_{<t}, p, s),$$

with $y$ as the ground-truth search-and-replace diff.
- Optimization: Adam (weight decay 0.1, gradient clipping 1.0), BF16 precision, 16×A100 GPUs, initial LR with scheduled decay.
- Prompt conventions: input marker `/* MIDDLE CODE TO COMPLETE */`; output delimiters `<<<<<<< SEARCH`, `=======`, and `>>>>>>> REPLACE` (Zhang et al., 19 Jan 2026).
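The instruction-tuning objective above is ordinary token-level cross-entropy over the target diff; a minimal numeric sketch in pure Python (no training framework assumed, toy probabilities for illustration):

```python
import math

def cross_entropy(token_probs: list[float]) -> float:
    """Negative log-likelihood summed over the target tokens.

    Each entry is the model's probability for the ground-truth token at
    that position of the search-and-replace diff.
    """
    return -sum(math.log(p) for p in token_probs)

# Toy example: three target tokens assigned probabilities 0.5, 0.25, 1.0.
loss = cross_entropy([0.5, 0.25, 1.0])
# loss = -(ln 0.5 + ln 0.25 + ln 1.0) = ln 8 ≈ 2.079
```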
4. Evaluation Methodology and Performance Analysis
SRI-Coder models are benchmarked against base models (standard FIM), natural-language FIM variants (standard, dialogue, template prompts), and a spectrum of proprietary/open LLMs (GPT-4, Claude, Gemini, DeepSeek, Qwen3, Grok):
- Similarity-based benchmarks: Exact Match (EM) and Edit Similarity (ES) on CrossCodeEval, RepoEval, CrossCodeLongEval.
- Execution-based: Pass@1 and unit-test success on ExecRepoBench and SAFIM.
- Latency: average inference time.
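Edit Similarity (ES) is commonly defined as one minus the Levenshtein distance normalized by the longer string's length; the benchmarks above may use a variant, so the following is a reference sketch under that common definition:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def edit_similarity(pred: str, ref: str) -> float:
    """ES = 1 - Levenshtein(pred, ref) / max(len(pred), len(ref))."""
    if not pred and not ref:
        return 1.0
    return 1.0 - levenshtein(pred, ref) / max(len(pred), len(ref))
```

Exact Match is then simply the indicator `pred == ref`, which ES upper-bounds.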
Empirical findings:
- SRI-Coder achieves up to +20 EM/ES over NL-FIM chat baselines.
- SRI-Coder-32B surpasses its Qwen2.5-Coder-32B base by +46.3 EM (CrossCodeEval) and +49.3 ES (CrossCodeLongEval).
- Single-pass SRI-Coder matches FIM latency to within a few percent, remaining significantly faster than agentic multi-step tools.
- SRI-Coder retains general coding competencies, with no observed pass@1 drop on MBPP, HumanEval, BigCodeBench, or LiveCodeBench—a marked improvement over NL-FIM fine-tuning which causes a 5–10 point loss (Zhang et al., 19 Jan 2026).
5. Algorithmic Workflow of SRI Inference
The SRI paradigm consists of four principal steps executed in a single-shot generative manner, as shown in the following high-level pseudocode:
```
def SRI_Infill(file_contents):
    # 1. Locate marker and extract the 10-line window around it
    p, marker, s = slice_around_marker(file_contents, window=10)
    prompt = format_system_prompt() + p + marker + s
    # 2. Model generates the diff in one shot
    diff_output = Model.generate(prompt)
    # 3. Parse SEARCH and REPLACE blocks
    search_block, replace_block = parse_diff(diff_output)
    # 4. Optionally convert to a standard git patch, or apply directly
    patched_file = apply_search_replace(p, s, search_block, replace_block)
    return patched_file
```
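Step 4 reduces to an exact substring replacement. Below is a simplified sketch of `apply_search_replace` that operates on the full file contents rather than the (p, s) split used in the pseudocode, under the assumption that the search block occurs verbatim exactly once; the authors' implementation may be more robust:

```python
def apply_search_replace(file_contents: str, search_block: str,
                         replace_block: str) -> str:
    """Apply one SRI edit by swapping the search block for the replace block."""
    occurrences = file_contents.count(search_block)
    if occurrences != 1:
        raise ValueError(f"search block found {occurrences} times; expected 1")
    return file_contents.replace(search_block, replace_block, 1)

source = "def add(a, b):\n    return a - b\n"
patched = apply_search_replace(source, "    return a - b", "    return a + b")
```

Requiring a unique match guards against applying the edit at the wrong site, one practical advantage of emitting several lines of surrounding context in the search block.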
6. Security Alignment, Limitations, and Prospective Directions
SRI-Coder inherits safety benefits from instruction-following chat models: the SAL benchmark attack success rate drops from approximately 100% (base, unaligned) to below 20% (Table 1), significantly improving over FIM models. Inference latency remains competitive, and SRI-Coder exhibits robust generalization on unseen coding tasks.
Limitations include the offline-only nature of the current evaluation (no real-world IDE or user studies yet) and modest gains on models at or below 1.5B parameters, suggesting that effective transfer of the SRI paradigm to lightweight architectures may require knowledge distillation or curriculum learning. The SRI-Coder series is slated for open-source release and integration into code editors for iterative, community-driven enhancement (Zhang et al., 19 Jan 2026).