ENCRUST: Encapsulated Substitution and Agentic Refinement on a Live Scaffold for Safe C-to-Rust Translation

Published 6 Apr 2026 in cs.SE, cs.AI, and cs.PL | (2604.04527v1)

Abstract: We present Encrust (Encapsulated Substitution and Agentic Refinement on a Live Scaffold for Safe C-to-Rust Translation), a two-phase pipeline for translating real-world C projects to safe Rust. Existing approaches either produce unsafe output without memory-safety guarantees or translate functions in isolation, failing to detect cross-unit type mismatches or handle unsafe constructs requiring whole-program reasoning. Furthermore, function-level LLM pipelines require coordinated caller updates when type signatures change, while project-scale systems often fail to produce compilable output under real-world dependency complexity. Encrust addresses these limitations by decoupling boundary adaptation from function logic via an Application Binary Interface (ABI)-preserving wrapper pattern and validating each intermediate state against the integrated codebase. Phase 1 (Encapsulated Substitution) translates each function using an ABI-preserving wrapper that splits it into two components: a caller-transparent shim retaining the original raw-pointer signature, and a safe inner function targeted by the LLM with a clean, scope-limited prompt. This enables independent per-function type changes with automatic rollback on failure, without coordinated caller updates. A deterministic, type-directed wrapper elimination pass then removes wrappers after successful translation. Phase 2 (Agentic Refinement) resolves unsafe constructs beyond per-function scope, including static mut globals, skipped wrapper pairs, and failed translations, using an LLM agent operating on the whole codebase under a baseline-aware verification gate. We evaluate Encrust on 7 GNU Coreutils programs and 8 libraries from the Laertes benchmark, showing substantial unsafe-construct reduction across all 15 programs while maintaining full test-vector correctness.


Authors (5)

Summary

The paper introduces a two-phase LLM translation pipeline that encapsulates function-level safety through ABI-preserving wrappers and type-directed rewriting.
It leverages an agentic refinement phase with a 17-tool loop to address global unsafety, achieving significant reductions in pointer dereferences and unsafe code.
Empirical evaluation on GNU Coreutils and Laertes shows 100% test-vector pass rates and improved idiomatic Rust quality over previous approaches.

ENCRUST: An LLM-Centric, Scaffolded Framework for Safe C-to-Rust Translation

Introduction and Motivation

Translating legacy C systems code to memory-safe Rust is a critical open problem, because C offers no built-in memory-safety guarantees and is prevalent in security-sensitive domains. Traditional C-to-Rust transpilers, such as C2Rust, emit Rust that simply mirrors C's pointer- and memory-unsafe constructs, yielding functionally correct but unsafe code. Rule-based post-processing and incremental transformation tools, e.g., Laertes and Crown, only partially address pointer-related unsafety, fall short on semantic translation, and do not generalize to project-wide cross-cutting changes. LLM-based approaches have shown promise but have so far run into scaling problems, particularly in call-site adaptation and cross-unit dependency management.

ENCRUST introduces a fundamentally decoupled two-phase pipeline that orchestrates LLM translation and codebase refinement under continuous behavioral verification, addressing both local and global translation failures endemic in prior work.

ENCRUST Architecture: Two-Phase Translation Pipeline

ENCRUST is structured as a two-phase pipeline, each phase explicitly targeting different classes of translation obstacles.

Phase 1: Encapsulated Substitution

The first phase addresses function translation modularity and ABI stability by introducing an ABI-preserving wrapper pattern. For each function, the system separates the function's logic from boundary adaptation concerns:

Wrapper/Safe Pair Generation: Every C function $f$ with signature $T_1 \times \dots \times T_n \rightarrow T_r$ is replaced with a pair $f$ (the outer wrapper, preserving external naming and ABI) and $f_\mathit{safe}$ (the LLM-generated safe function using idiomatic Rust types).
Compile-Test–Gated Loop: An LLM is invoked with contextually scoped prompt material (original C source, prior translations, callee signatures), producing candidate wrapper/safe pairs. Every candidate undergoes compilation and test-vector validation before being committed, which enforces the "Live Scaffold Invariant": the workspace compiles and passes its tests throughout translation.
Type-Directed Wrapper Elimination (TDWE): Once function translation completes, an automated pass rewrites all call sites to invoke safe inner functions directly, eliminating the wrapper indirection except in a small set of structurally ambiguous cases, thus converging towards an idiomatic, unsafe-free interface.
Figure 1: ENCRUST's two-phase pipeline. Phase 1 employs function-level wrappers and automatic, type-directed call-site rewriting; Phase 2 applies program-wide refinement using a 17-tool agentic loop to close the remaining safety gaps.
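The wrapper/safe pair and the call-site rewrite described above can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`sum_array`, `sum_array_safe`, `total_before`, `total_after`) and the choice of a slice as the idiomatic type are assumptions made purely for illustration.

```rust
use std::slice;

/// Safe inner function (the role the text calls f_safe): idiomatic types,
/// no raw pointers, no `unsafe`. Hypothetical example, not from the paper.
fn sum_array_safe(values: &[i32]) -> i32 {
    // Wrapping addition avoids an overflow panic in debug builds.
    values.iter().fold(0i32, |acc, &v| acc.wrapping_add(v))
}

/// Outer wrapper: keeps the original raw-pointer signature so existing
/// callers compile unchanged, and only adapts the boundary.
///
/// # Safety
/// `ptr` must be null or point to `len` readable, initialized `i32` values.
pub unsafe extern "C" fn sum_array(ptr: *const i32, len: usize) -> i32 {
    if ptr.is_null() || len == 0 {
        return 0;
    }
    // SAFETY: guaranteed by the caller contract documented above.
    let values = unsafe { slice::from_raw_parts(ptr, len) };
    sum_array_safe(values)
}

/// A call site before wrapper elimination: it still goes through the wrapper.
fn total_before(data: &[i32]) -> i32 {
    unsafe { sum_array(data.as_ptr(), data.len()) }
}

/// The same call site after a type-directed rewrite: it calls the safe
/// function directly, so the `unsafe` block disappears.
fn total_after(data: &[i32]) -> i32 {
    sum_array_safe(data)
}
```

Because the wrapper keeps the old signature, the inner function's types can change without touching any caller; callers are only rewritten later, once the safe version has passed compilation and tests.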

Phase 2: Agentic Refinement

The second phase is designed to resolve unsafety that inherently escapes per-function scope:

Task Discovery and Tool-Equipped Agentic Loop: Using static analysis, ENCRUST identifies static mut globals, skipped wrappers, failed struct translations, and functions outside retry budgets. Each unresolved unsafe pattern is converted to an explicit translation or migration task.
17-Tool Agent Suite: The system equips an LLM with 17 navigation, modification, analysis, and verification tools, including source traversal, batch rewriting, semantic linkage, and a compile-and-test verification gate. For each task, an agentic loop operates until behavioral correctness (relative to a pre-recorded test baseline) is achieved.
Checkpointing and Safe Rollback: The workspace is snapshotted automatically before every agentic task; the system rolls back automatically when the agent fails or exhausts its iteration budget, so a failed task cannot corrupt the workspace.
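The snapshot, verify, and rollback cycle above can be sketched as follows. This is an illustrative model only: the in-memory `Workspace`, the `TestReport` map, and the functions `no_regression` and `try_commit` are hypothetical names, and the reading of "baseline-aware" as "no test that passed in the baseline may fail afterwards" is an assumption rather than the paper's stated definition.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for a crate: file path -> contents.
type Workspace = BTreeMap<String, String>;

/// Hypothetical test report: test name -> whether it passed.
type TestReport = BTreeMap<String, bool>;

/// Baseline-aware check (assumed semantics): every test that passed in the
/// baseline must still pass; tests that already failed are not held
/// against the edit.
fn no_regression(baseline: &TestReport, current: &TestReport) -> bool {
    baseline
        .iter()
        .filter(|(_, passed)| **passed)
        .all(|(name, _)| current.get(name).copied().unwrap_or(false))
}

/// Apply `edit`, run `verify` (standing in for compile + test), and restore
/// the snapshot if verification fails or regresses. Returns true on commit.
fn try_commit<E, V>(ws: &mut Workspace, baseline: &TestReport, edit: E, verify: V) -> bool
where
    E: FnOnce(&mut Workspace),
    V: FnOnce(&Workspace) -> Option<TestReport>, // None models a compile failure
{
    let snapshot = ws.clone(); // checkpoint taken before the task starts
    edit(ws);
    match verify(ws) {
        Some(report) if no_regression(baseline, &report) => true,
        _ => {
            *ws = snapshot; // roll back: the workspace is exactly as before
            false
        }
    }
}
```

A real pipeline would presumably snapshot files on disk and invoke the compiler and test runner; the closure parameters here simply keep the sketch self-contained.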

Safety Preservation and Code Quality

ENCRUST’s most distinctive feature is that translation is always checked at the codebase level, not in isolation. This overcomes the semantic drift and dependency mismatches characteristic of prior LLM and rule-based pipelines. Struct migration is supported via a dual-struct abstraction, sidestepping pointer-based aliasing pitfalls and eliminating use-after-free errors endemic to naive pointer ownership translation.

The evaluation tracks the following safety and quality metrics:

Raw pointer declarations and dereferences: ENCRUST achieves up to a 57% reduction in pointer dereferences and a 44% reduction in declarations on Coreutils.
Unsafe lines of code and casts: Over both Coreutils and Laertes, ENCRUST reduces unsafe code lines by ~38% and unsafe casts by up to 60% relative to the C2Rust baseline.
Idiomaticity: As measured by Clippy warning count, ENCRUST's final code is less noisy and more idiomatic than that of prior LLM-based approaches; the remaining gap to best-effort manual Rust code is attributed to residual complex pointer idioms and FFI vestiges.
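For readers who want to reproduce such percentages from raw counts, a relative reduction is the drop in count divided by the baseline count. The helper below is a minimal sketch; `percent_reduction` is a hypothetical name and the counts in the usage note are made-up illustrative values, not figures from the paper.

```rust
/// Relative reduction of a count, as a percentage of the baseline count.
/// Returns None when the baseline is zero, where the ratio is undefined.
/// A negative result means the count went up.
fn percent_reduction(before: u64, after: u64) -> Option<f64> {
    if before == 0 {
        return None;
    }
    Some(100.0 * (before as f64 - after as f64) / before as f64)
}
```

With hypothetical counts of 200 raw-pointer dereferences in the baseline and 150 after translation, `percent_reduction(200, 150)` yields a 25% reduction.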

Empirical Evaluation

ENCRUST is evaluated on 197,706 lines of code across 7 GNU Coreutils programs and 8 libraries from the Laertes benchmark (2,366 functions in total):

Correctness (Test-Vector Pass Rate): Maintains a strict 100% pass rate on all covered inputs for all benchmarks, assured by compile-and-test gates at every execution stage.
Function Translation Scalability: Achieves a function-level compile pass rate of up to 99.2% on certain Coreutils targets, with remaining failures handled via agentic Phase 2 or safely retained as legacy stubs.
Completeness vs Prior Work: Unlike EvoC2Rust and similar LLM baselines, which routinely fail to produce compiling whole-project outputs, ENCRUST guarantees a compiling and test-passing crate at every intermediate milestone.

Implications, Limitations, and Future Directions

The ENCRUST framework demonstrates the viability of structured, agentic LLM translation for migrating legacy C to safe Rust with project-scale behavioral preservation. Practically, it provides a reproducible pipeline that supports safe incremental migration without downtime, enables auditability through persistent behavioral correctness, and confines LLMs to tractable, well-scoped tasks, where they are known to perform best.

Key limitations include verification coverage: the pipeline relies entirely on the test-vector suite, so behaviors the tests do not exercise are not guaranteed to be preserved. TDWE in Phase 1 is best-effort, so residual unsafety that verification does not surface may remain. Some highly pointer-centric patterns and complex ABI-dependent signatures remain outside fully automated safe translation. The agentic loop's completion rate is ~70% over all tasks, leaving room for robustness improvements.
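The coverage limitation can be made concrete with a small, purely hypothetical example: two functions that agree on every input in a test-vector suite yet differ on an input the suite never exercises. The function names and the test vectors are assumptions for illustration and do not come from the benchmarks.

```rust
/// Stand-in for the original behaviour: two's-complement absolute value,
/// where the most negative input wraps to itself.
fn abs_original(x: i32) -> i32 {
    x.wrapping_abs()
}

/// A plausible-looking translation that saturates instead of wrapping.
fn abs_translated(x: i32) -> i32 {
    x.saturating_abs()
}

/// True when both versions agree on every input in the test-vector suite.
fn passes_test_vectors(inputs: &[i32]) -> bool {
    inputs.iter().all(|&x| abs_original(x) == abs_translated(x))
}
```

A suite such as `[-5, 0, 7]` passes, so a test-gated pipeline would accept the translation, even though the two functions disagree on `i32::MIN`. This is the kind of divergence that coverage-oriented test generation would aim to expose.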

Potential future advances include enhanced coverage-oriented test generation, hybrid symbolic-execution-augmented verification, extension to inline assembly handling, and the substitution of open-weights LLMs for improved reproducibility.

Conclusion

ENCRUST provides a modular, test-driven migration framework, based on LLM translation and agentic refinement, capable of translating large real-world C projects to behaviorally equivalent, memory-safe Rust. By separating per-function encapsulated translation from program-wide verification and refinement, and enforcing correctness at every step, ENCRUST closes critical automation gaps in safe systems language migration. The pipeline constitutes a practical and theoretically sound approach for organizations seeking to eliminate entire classes of memory safety bugs from legacy code with the assistance of automated program transformation.
