Can small, quantized open-source LLMs effectively perform C-to-Java translation?

Determine whether small, quantized, open-source large language models with fewer than 20 billion parameters can effectively perform automated translation of C code to Java code, producing runnable and semantically correct Java across challenging C idioms (e.g., pointers, unions, and goto refactoring) in edge and on-premise settings.

Background

The paper motivates the need for local, privacy-preserving models by noting that many organizations cannot use large proprietary cloud models. This raises the question of whether small, quantized models can handle the complex semantic gap between C and Java.

To probe this, the authors evaluate 19 sub-20B-parameter LLMs on a 20-case benchmark focused on difficult C idioms, using a hybrid AST-plus-guardrail prompting pipeline. The results are sharply tiered: most models fail completely and only three pass more than half of the tests, which indicates partial but limited capability.
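To make the semantic gap concrete, the sketch below shows one plausible hand-written Java counterpart for each of the three idiom families named in the problem statement (pointers, unions, and goto). It is an illustration only: the class name `CIdiomSketch`, its methods, and the C snippets in the comments are hypothetical and are not taken from the paper's benchmark or pipeline.

```java
// Illustrative sketch only: hand-written Java counterparts for three C idioms.
// All names are hypothetical and do not come from the paper's benchmark.
class CIdiomSketch {

    // C: void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }
    // Java has no pointers to primitives, so each "pointee" is modeled
    // as a one-element array that the callee can mutate.
    static void swap(int[] a, int[] b) {
        int t = a[0];
        a[0] = b[0];
        b[0] = t;
    }

    // C: union { float f; uint32_t u; } v; v.f = x; return v.u;
    // Java has no unions; reinterpreting the bits needs an explicit call.
    static int floatBits(float x) {
        return Float.floatToRawIntBits(x);
    }

    // C: for (...) { if (v < 0) goto fail; sum += v; } return sum; fail: return -1;
    // Java has no goto; a labeled block with break covers forward jumps.
    static int sumOrFail(int[] values) {
        int sum = 0;
        boolean ok = false;
        body: {
            for (int v : values) {
                if (v < 0) {
                    break body; // plays the role of "goto fail"
                }
                sum += v;
            }
            ok = true;
        }
        return ok ? sum : -1;
    }

    public static void main(String[] args) {
        int[] x = {1};
        int[] y = {2};
        swap(x, y);
        System.out.println(x[0] + " " + y[0]);
        System.out.println(Integer.toHexString(floatBits(1.0f)));
        System.out.println(sumOrFail(new int[] {3, 4}));
        System.out.println(sumOrFail(new int[] {3, -1}));
    }
}
```

Each mapping changes the shape of the program rather than just its syntax, which is why a translation can compile and still fail on semantic correctness. Other encodings are equally defensible (for example, a small mutable holder class instead of a one-element array), so this should be read as one option among several.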

References

This reality leads to a critical, unanswered question, can small, quantized, open source LLMs (e.g., Edge LLMs with fewer than 20 billion parameters) effectively handle this complex translation task?

— REMODEL-LLM: Transforming C code to Java using LLMs (2512.11402 - Gupta et al., 12 Dec 2025) in Introduction (Section 1)
