Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees (2506.14606v1)

Published 17 Jun 2025 in cs.CL, cs.AR, cs.LG, cs.PL, and cs.SE

Abstract: The hardware ecosystem is rapidly evolving, with increasing interest in translating low-level programs across different instruction set architectures (ISAs) in a quick, flexible, and correct way to enhance the portability and longevity of existing code. A particularly challenging class of this transpilation problem is translating between complex- (CISC) and reduced- (RISC) hardware architectures, due to fundamental differences in instruction complexity, memory models, and execution paradigms. In this work, we introduce GG (Guaranteed Guess), an ISA-centric transpilation pipeline that combines the translation power of pre-trained LLMs with the rigor of established software testing constructs. Our method generates candidate translations using an LLM from one ISA to another, and embeds such translations within a software-testing framework to build quantifiable confidence in the translation. We evaluate our GG approach over two diverse datasets, enforce high code coverage (>98%) across unit tests, and achieve functional/semantic correctness of 99% on HumanEval programs and 49% on BringupBench programs, respectively. Further, we compare our approach to the state-of-the-art Rosetta 2 framework on Apple Silicon, showcasing 1.73x faster runtime performance, 1.47x better energy efficiency, and 2.41x better memory usage for our transpiled code, demonstrating the effectiveness of GG for real-world CISC-to-RISC translation tasks. We will open-source our codes, data, models, and benchmarks to establish a common foundation for ISA-level code translation research.

Summary

Overview of "Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees"

The paper presents an innovative approach to address the complexities inherent in transpiling code from Complex Instruction Set Computing (CISC) architectures to Reduced Instruction Set Computing (RISC) architectures. As the hardware landscape transitions towards more efficient and performance-oriented designs, particularly with the increasing prevalence of RISC architectures like ARM in data centers, there exists a critical need to accurately and efficiently translate legacy CISC code into RISC formats. This need is intensified by the constraints of existing runtime emulation solutions, such as Apple's Rosetta 2, which introduce performance and memory overheads.

Guaranteed Guess Approach

The authors propose "Guaranteed Guess" (GG), a methodology that uses LLMs, such as custom-trained versions of DeepSeek and Qwen, to generate assembly language translations from one ISA to another. The approach does not stop at generating candidate translations: it embeds each translation within a software testing framework. This integration aims to establish both syntactic and semantic correctness, so that confidence in the transpiled code is quantified by test results and code coverage rather than assumed.
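The guess-then-test idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: Python callables stand in for assembled candidate translations, and the names `passes_all` and `guaranteed_guess` are hypothetical. Whether the real system retries across several candidates is an assumption made here for illustration.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

# A test case is (arguments, expected result). In a real pipeline the
# candidate would be assembly text that is assembled and executed; here a
# Python callable stands in for it (an assumption for illustration only).
TestCase = Tuple[tuple, object]


def passes_all(candidate: Callable, tests: Sequence[TestCase]) -> bool:
    """Return True only if the candidate matches every expected output."""
    for args, expected in tests:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            # A crashing candidate counts as a failed translation.
            return False
    return True


def guaranteed_guess(
    candidates: Iterable[Callable], tests: Sequence[TestCase]
) -> Optional[Callable]:
    """Return the first candidate that passes the whole test suite, else None."""
    for candidate in candidates:
        if passes_all(candidate, tests):
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical reference behaviour: integer addition.
    tests = [((2, 3), 5), ((0, 0), 0)]
    wrong = lambda a, b: a * b   # a "guess" that fails the tests
    right = lambda a, b: a + b   # a "guess" that passes
    accepted = guaranteed_guess([wrong, right], tests)
    print("accepted:", "right" if accepted is right else "none")
```

The point of the design is that the model's output is never trusted directly: a translation is accepted only when the test suite, not the model, says it behaves like the original.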

Methodology and Results

The paper details a robust data collection and model training process. Leveraging large-scale datasets from AnghaBench and The Stack, the authors train their LLMs with architectural extensions to understand and predict assembly code semantics effectively. The introduction of an enhanced tokenizer, tuned to recognize common opcodes and register names from targeted ISA families, further aids the model's predictive accuracy.
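To make the tokenizer idea concrete, here is a toy lexer that keeps opcodes and register names as single tokens. It is only a sketch: the paper's tokenizer extends an LLM subword vocabulary, whereas this example uses a regular expression and two small, hypothetical vocabularies (`OPCODES`, `REGISTERS`) chosen for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical, deliberately tiny vocabularies; a real tokenizer would cover
# the full opcode and register sets of the targeted ISA families.
OPCODES = {"mov", "add", "sub", "ldr", "str", "ret"}
REGISTERS = {"eax", "ebx", "rax", "rsp", "x0", "x1", "w0", "sp"}

# Identifiers, signed integers, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_.][A-Za-z0-9_.]*|-?\d+|\S")


def tokenize(line: str) -> List[Tuple[str, str]]:
    """Split one assembly line into (kind, text) pairs.

    Opcodes and registers are emitted as whole tokens instead of being
    broken into smaller pieces.
    """
    tokens = []
    for piece in TOKEN_RE.findall(line):
        lowered = piece.lower()
        if lowered in OPCODES:
            tokens.append(("OPCODE", lowered))
        elif lowered in REGISTERS:
            tokens.append(("REG", lowered))
        else:
            tokens.append(("OTHER", piece))
    return tokens


if __name__ == "__main__":
    for kind, text in tokenize("add x0, x1, #4"):
        print(kind, text)
```

Treating `x0` or `ldr` as one unit means the model does not have to reassemble them from fragments, which is the intuition behind tuning the vocabulary to the ISA.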

Evaluation results show improvements over existing models and over runtime emulation. Guaranteed Guess achieves 99.39% functional correctness on ARMv8 targets for the HumanEval benchmark. Compared to Rosetta 2 on Apple Silicon, the transpiled code runs 1.73x faster, is 1.47x more energy efficient, and uses 2.41x less memory. On the BringupBench dataset, which reflects real-world program complexity, the approach attains 49.23% accuracy. The gap highlights the difficulty of heavily optimized binaries (-O2), whose intricate data-flow and control-flow transformations obscure the correspondence that direct transpilation relies on.
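The headline numbers are simple ratios, and a short sketch shows how they can be read. The helper names below are hypothetical. The 164-problem size of HumanEval is the benchmark's standard size; that the authors evaluated all 164 is an assumption, under which 99.39% corresponds to 163 passing programs. The timing values in the usage example are placeholders, not measurements from the paper.

```python
def pass_rate(passed: int, total: int) -> float:
    """Fraction of programs whose translation passes all of its tests."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return passed / total


def relative_gain(baseline_cost: float, candidate_cost: float) -> float:
    """How many times lower the candidate's cost is than the baseline's.

    Cost can be runtime, energy, or memory; lower cost is better, so a
    value above 1.0 means the candidate improves on the baseline.
    """
    if candidate_cost <= 0:
        raise ValueError("candidate_cost must be positive")
    return baseline_cost / candidate_cost


if __name__ == "__main__":
    # Assumes the full 164-problem HumanEval set was used.
    print(f"{pass_rate(163, 164) * 100:.2f}%")  # prints 99.39%

    # Placeholder timings (seconds), chosen only to illustrate the ratio.
    print(f"{relative_gain(1.73, 1.00):.2f}x")
```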

Implications and Speculations

The implications of this work are multi-fold. Practically, this approach provides a scalable means of converting legacy binaries into efficient RISC formats, directly addressing industry needs where source code recompilation is infeasible. Theoretically, the paper opens new avenues in applying LLMs to assembly-level code transformation tasks, bridging architectural execution model differences through learned representations.

Moving forward, this work paves the way for further research into integrating symbolic reasoning with neural approaches to handle aggressive compiler optimizations. Future developments may explore greater context window usage or hybrid symbolic-neural models to enhance semantic preservation across transformations.

Conclusion

In summary, this paper offers a promising solution to the enduring challenge of CISC-to-RISC transpilation. By combining LLM-based translation with rigorous software testing, Guaranteed Guess sets a precedent for future work on ISA-centric code translation and contributes to both language modeling research and practical software engineering.
