Scaling REMODEL-LLM from isolated files to full interdependent C projects

Determine how to scale the REMODEL-LLM pipeline from isolated single-file translations to full, interdependent C codebases, which span multiple files and depend on header files, macros, external definitions, and build systems, while preserving semantic correctness and functional equivalence in the translated Java code.

Background

The study uses a single-file micro-benchmark to isolate specific C idioms. The authors note that real-world C projects involve multi-file repositories with complex build systems and cross-file dependencies, which their current pipeline does not yet handle.

They explicitly frame the extension from isolated tasks to end-to-end project-scale translation as a significant, unresolved challenge.
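
To make the cross-file dependencies mentioned above concrete, the following Python sketch scans a small in-memory C project for project-local `#include "..."` directives and orders the files so that each one comes after the headers it depends on. It is only an illustration of information a single-file pipeline never sees. It is not part of REMODEL-LLM, and the helper names (`local_includes`, `translation_order`) and the three-file project are hypothetical.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Matches project-local includes such as: #include "buffer.h"
# System includes in angle brackets (<stdio.h>) are deliberately ignored.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def local_includes(source: str) -> list[str]:
    """Return the quoted (project-local) headers that a C source text includes."""
    return INCLUDE_RE.findall(source)


def translation_order(files: dict[str, str]) -> list[str]:
    """Order files so that each appears after the project files it includes."""
    graph = {
        name: {inc for inc in local_includes(text) if inc in files}
        for name, text in files.items()
    }
    # TopologicalSorter expects a mapping of node -> predecessors.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical three-file project: main.c cannot be understood without
# buffer.h (extern declaration) and config.h (macro definition).
project = {
    "config.h": "#define BUF_SIZE 64\n",
    "buffer.h": '#include "config.h"\nextern char shared_buf[BUF_SIZE];\n',
    "main.c": (
        '#include <stdio.h>\n'
        '#include "buffer.h"\n'
        "int main(void) { return (int)sizeof shared_buf; }\n"
    ),
}

print(translation_order(project))  # ['config.h', 'buffer.h', 'main.c']
```

Even this toy case shows that translating `main.c` alone would leave `shared_buf` and `BUF_SIZE` undefined. A real project adds conditional compilation, build-system flags, and include paths, none of which this sketch attempts to model.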

References

We have established a clear performance ceiling on these isolated problems, but scaling this approach to a full, interdependent C project remains a significant and open challenge.

REMODEL-LLM: Transforming C code to Java using LLMs (2512.11402 - Gupta et al., 12 Dec 2025) in Conclusion (Threats to External Validity paragraph, Section 6)