Self-Constructed Context Decompilation (sc²dec)
- Self-Constructed Context Decompilation (sc²dec) is a neural method that reconstructs high-level source code from binaries by leveraging in-context learning with recompilable code segments.
- It employs an iterative pipeline that includes initial decompilation, recompilation, and fine-grained alignment using compiler-generated metadata to bridge assembly with source code.
- Empirical evaluations demonstrate that sc²dec and its extension, SALT4Decompile, significantly improve re-compilability, re-executability, and test-case pass rates over traditional decompilers.
Self-Constructed Context Decompilation (sc²dec) is a neural decompilation methodology designed to enhance the process of recovering high-level source code from compiled binaries or assembly, particularly under circumstances where neither source code nor rich symbol information is available. Distinct from approaches that rely on scaling model size or training data, sc²dec improves functional and semantic accuracy by leveraging the intrinsic properties of decompilation: the partial recompilability of LLM-generated code and the ability to use compiler-generated metadata to align assembly with source statements. The methodology has demonstrated state-of-the-art empirical performance and serves as a foundation for advances in LLM-guided reverse engineering, including structured logic abstraction and robust error correction mechanisms (Feng et al., 2024, Wang et al., 18 Sep 2025).
1. Motivation and Core Problem in Decompilation
Decompilation seeks to reconstruct readable, functionally equivalent high-level code from assembly or binaries—a task fundamentally hindered by the loss of symbolic names, comments, and high-level structures due to compiler optimizations, register allocation, and code rewriting. Traditional rule-based decompilers such as IDA Pro and Ghidra apply control-flow graph (CFG) recovery and heuristic rewriting rules, but their outputs are frequently low in readability and often fail to generalize over heavily optimized input (Feng et al., 2024).
Neural decompilation methods have been advanced by scaling LLMs or generating massive synthetic corpora, but these achieve diminishing returns due to the absence of effective mechanisms for incorporating problem-intrinsic structural information. sc²dec was developed to directly exploit two salient properties:
- LLM-decompiled code is often compilable, at least locally,
- Debugging metadata produced during (re-)compilation supports granular statement-level alignment between assembly and source.
This enables an in-context learning approach, orthogonal to traditional scaling and fine-tuning, which dynamically constructs relevant demonstration pairs at inference time.
2. Methodological Foundations of sc²dec
2.1. In-Context Self-Construction Pipeline
The sc²dec pipeline operates as follows:
- Initial Decompilation: Input assembly $a$ is decompiled via an LLM, producing a candidate source program $s_0$.
- Compilation and Disassembly: If $s_0$ compiles successfully, it is recompiled with the matching toolchain and flags and disassembled, yielding re-generated assembly $a_0$.
- Context Demo Construction: A demo pair $(a_0, s_0)$ is formed.
- Refined Decompilation: The LLM is re-invoked on the concatenated context, $[(a_0, s_0); a]$, producing a refined output $s_1$.
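When multiple recompilable pairs are available, a scoring function selects the most relevant demos using a convex combination of BLEU score and re-execution match: $\mathrm{score} = \lambda \cdot \mathrm{BLEU} + (1 - \lambda) \cdot \mathrm{exec}$, where $\lambda \in [0, 1]$ is the mixing weight and $\mathrm{exec} \in \{0, 1\}$ indicates binary success on unit tests. The top-$k$ pairs by score are prepended as demonstrations (Feng et al., 2024).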
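The demo selection rule can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the `Demo` record, the `lam` weight, and the function names are hypothetical, and the BLEU value is assumed to be computed elsewhere and normalized to [0, 1].

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Demo:
    asm: str       # re-generated assembly of the candidate
    src: str       # candidate source that compiled
    bleu: float    # BLEU score in [0, 1], computed elsewhere (assumption)
    passed: bool   # True if the candidate passed its unit tests


def demo_score(demo: Demo, lam: float = 0.5) -> float:
    """Convex combination of BLEU and binary re-execution success."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * demo.bleu + (1.0 - lam) * float(demo.passed)


def select_top_k(demos: List[Demo], k: int, lam: float = 0.5) -> List[Demo]:
    """Return the k highest-scoring demos, best first."""
    return sorted(demos, key=lambda d: demo_score(d, lam), reverse=True)[:k]
```

With `lam = 0.5`, a demo that passes its tests with a modest BLEU score outranks a textually closer demo that fails them. Whether that trade-off is desirable depends on the chosen weight.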
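The four pipeline steps can be sketched as follows. This is a hedged illustration under stated assumptions: `llm` and `recompile` are hypothetical callables supplied by the caller (a model client and a compile-then-disassemble wrapper), and the prompt template in `build_prompt` is invented for the example and is not the format used by Feng et al.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces: `LLM` maps a prompt to source text; `Recompiler`
# returns re-generated assembly, or None when compilation fails.
LLM = Callable[[str], str]
Recompiler = Callable[[str], Optional[str]]


def build_prompt(asm: str, demo: Optional[Tuple[str, str]] = None) -> str:
    """Format the query, optionally preceded by one (assembly, source) demo."""
    parts = []
    if demo is not None:
        demo_asm, demo_src = demo
        parts.append(f"# Assembly:\n{demo_asm}\n# Source:\n{demo_src}\n")
    parts.append(f"# Assembly:\n{asm}\n# Source:\n")
    return "\n".join(parts)


def sc2dec(asm: str, llm: LLM, recompile: Recompiler) -> str:
    candidate = llm(build_prompt(asm))      # 1. initial decompilation
    demo_asm = recompile(candidate)         # 2. compile and disassemble
    if demo_asm is None:
        return candidate                    # no demo possible: keep first candidate
    demo = (demo_asm, candidate)            # 3. self-constructed demo pair
    return llm(build_prompt(asm, demo))     # 4. refined decompilation
```

Passing the model and toolchain in as callables keeps the control flow testable without a compiler or model on hand. The early return makes the dependence on recompilability explicit.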
2.2. Fine-Grained Alignment Enhancement (FAE)
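FAE improves LLM fine-tuning by enforcing statement-level correspondence between source code and assembly. During fine-tuning, DWARF debug metadata and objdump -S are used to align assembly blocks with the source statements that produced them. This alignment enables two decompilation objectives: an end-to-end objective, which maps a whole assembly function to its source, and a step-by-step objective, which decompiles the function one aligned block at a time.
The overall fine-tuning objective combines the two.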
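As a sketch only, assuming standard autoregressive negative log-likelihood losses and an unweighted sum, the two objectives and their combination could take the following form. The symbols are introduced here for illustration and are not taken from the original work: $a$ is the full assembly, $s$ the full source, $(a_i, s_i)$ the $i$-th aligned block pair out of $n$, and $\theta$ the model parameters. The exact formulation, including any weighting between the terms, may differ.

```latex
\begin{align}
\mathcal{L}_{\text{e2e}}(\theta) &= -\log P_\theta(s \mid a) \\
\mathcal{L}_{\text{sbs}}(\theta) &= -\sum_{i=1}^{n} \log P_\theta\bigl(s_i \mid a_1, s_1, \dots, a_{i-1}, s_{i-1}, a_i\bigr) \\
\mathcal{L}(\theta) &= \mathcal{L}_{\text{e2e}}(\theta) + \mathcal{L}_{\text{sbs}}(\theta)
\end{align}
```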
Training uses 10,000 C functions sampled from ExeBench, compiled at four optimization levels with DWARF debug information and disassembled to obtain aligned statement pairs (Feng et al., 2024).
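To make the alignment step concrete, the sketch below groups instructions under the source line that precedes them in an `objdump -S`-style listing (source interleaved with disassembly, which requires a binary built with debug information). The embedded listing is a hand-written illustration, and the `align` helper and its regular expression are assumptions for this example. Real listings also contain section headers and continuation lines that a production parser would have to handle.

```python
import re

# Illustrative `objdump -S`-style listing (hand-written, not tool output):
# each source line is followed by the instructions compiled from it.
LISTING = "\n".join([
    "int add(int a, int b) {",
    "   0:\t55                   \tpush   %rbp",
    "   1:\t48 89 e5             \tmov    %rsp,%rbp",
    "  return a + b;",
    "   4:\t8d 04 37             \tlea    (%rdi,%rsi,1),%eax",
    "}",
    "   7:\t5d                   \tpop    %rbp",
    "   8:\tc3                   \tret",
])

# address ':' TAB hex bytes TAB instruction text
INSN = re.compile(r"^\s*[0-9a-f]+:\t[0-9a-f ]+\t(.+)$")


def align(listing: str):
    """Return a list of (source_line, [instructions]) pairs."""
    pairs = []
    for line in listing.splitlines():
        match = INSN.match(line)
        if match:
            if not pairs:
                pairs.append(("", []))  # instructions seen before any source line
            pairs[-1][1].append(" ".join(match.group(1).split()))
        elif line.strip():
            pairs.append((line.strip(), []))
    return pairs


for source_line, instructions in align(LISTING):
    print(source_line, "<-", instructions)
```

Each resulting pair is one candidate training example for a statement-level objective: the instruction list is the input and the source line is the target.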
3. Advanced sc²dec Realizations: Structured Logic-Driven Integration
SALT4Decompile exemplifies a structured extension of the self-constructed context paradigm via the Source-level Abstract Logic Tree (SALT) (Wang et al., 18 Sep 2025). SALT abstracts essential control-flow and data-access features from assembly, mapping CFG elements and instruction motifs to tree-structured logic nodes (e.g., LOOP, IF, CALL, LOAD_CONST, SEQ).
The algorithmic sequence for SALT extraction includes CFG construction, address normalization, loop/branch inference, and recursive tree building:
- Back-edges in the CFG denote LOOP nodes,
- Conditional jumps establish IF nodes,
- Function call instructions are mapped to CALL nodes,
- Straight-line instruction groups become SEQ leaves.
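The loop and branch inference rules above can be illustrated on a toy CFG. This is a simplified sketch, not the SALT4Decompile algorithm: the CFG is a hypothetical dictionary of block names to successor lists, a back-edge is approximated as an edge to a block still on the depth-first search stack, and CALL detection is omitted because it needs instruction-level data.

```python
def find_back_edges(cfg, entry):
    """Return edges (u, v) where v is still on the DFS stack when reached from u."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in cfg}
    back_edges = []

    def dfs(u):
        color[u] = GRAY
        for v in cfg.get(u, []):
            state = color.get(v, WHITE)
            if state == GRAY:
                back_edges.append((u, v))   # v is an ancestor: loop detected
            elif state == WHITE:
                dfs(v)
        color[u] = BLACK

    dfs(entry)
    return back_edges


def classify_blocks(cfg, entry):
    """Label each block LOOP (back-edge target), IF (two-way branch), or SEQ."""
    headers = {v for _, v in find_back_edges(cfg, entry)}
    kinds = {}
    for node, succs in cfg.items():
        if node in headers:
            kinds[node] = "LOOP"
        elif len(succs) == 2:
            kinds[node] = "IF"
        else:
            kinds[node] = "SEQ"
    return kinds


# Toy CFG: a loop whose body contains one conditional.
cfg = {
    "entry": ["head"],
    "head": ["body", "exit"],   # loop condition
    "body": ["then", "latch"],  # conditional jump inside the loop
    "then": ["latch"],
    "latch": ["head"],          # back-edge to the loop header
    "exit": [],
}
print(classify_blocks(cfg, "entry"))
```

The loop header also ends in a conditional jump, so the sketch checks for back-edge targets first. Otherwise the header would be mislabeled as an IF node.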
The linearized SALT tree is serialized as an LLM input template, and the LLM is fine-tuned to map SALT representations to source code under a standard cross-entropy loss. This process is further buttressed by post-processing modules, including automated compilation error repair, loop boundary fixes, and variable/comment recovery, collectively improving both functional recovery and output readability (Wang et al., 18 Sep 2025).
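Linearization can be pictured as a pre-order traversal that emits one token per node. The bracketed format, the `SaltNode` class, and the labels below are assumptions made for illustration. The actual SALT serialization template is not specified here and may look quite different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SaltNode:
    kind: str                     # e.g. "LOOP", "IF", "CALL", "LOAD_CONST", "SEQ"
    label: str = ""               # optional payload (hypothetical), e.g. a callee name
    children: List["SaltNode"] = field(default_factory=list)


def linearize(node: SaltNode) -> str:
    """Serialize the tree in pre-order; leaves are bare tokens, inner nodes are bracketed."""
    head = f"{node.kind}:{node.label}" if node.label else node.kind
    if not node.children:
        return head
    inner = " ".join(linearize(child) for child in node.children)
    return f"({head} {inner})"


tree = SaltNode("LOOP", children=[
    SaltNode("SEQ", "b1"),
    SaltNode("IF", children=[SaltNode("CALL", "puts")]),
])
print(linearize(tree))  # (LOOP SEQ:b1 (IF CALL:puts))
```

A flat string of this kind can be placed directly in a prompt, and the nesting still conveys which statements sit inside which loop or branch.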
4. Empirical Performance and Evaluation
Performance is primarily benchmarked on the Decompile-Eval suite, adapted from HumanEval, with metrics including:
- Re-Compilability: Fraction of decompiled outputs that compile.
- Re-Executability: Fraction that compiles and passes all functional unit tests.
- Test-Case Pass Rate (TCP): Fraction of a function's test cases that the decompiled output passes.
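The three metrics can be computed from per-function outcomes as sketched below. The `Result` record is hypothetical. Two choices are assumptions rather than part of the benchmark definition: outputs that fail to compile are scored as passing zero test cases, and TCP is averaged over functions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Result:
    compiled: bool      # did the decompiled output compile?
    tests_passed: int   # number of unit tests passed (0 if it did not compile)
    tests_total: int    # number of unit tests for this function (assumed > 0)


def evaluate(results: List[Result]) -> Dict[str, float]:
    n = len(results)
    recompilable = sum(r.compiled for r in results) / n
    re_executable = sum(
        r.compiled and r.tests_passed == r.tests_total for r in results
    ) / n
    # Per-function pass fraction, averaged over functions (assumption).
    tcp = sum(
        r.tests_passed / r.tests_total if r.compiled else 0.0 for r in results
    ) / n
    return {
        "re_compilability": recompilable,
        "re_executability": re_executable,
        "tcp": tcp,
    }


results = [Result(True, 4, 4), Result(True, 1, 4), Result(False, 0, 4), Result(True, 4, 4)]
print(evaluate(results))
```

Re-executability is all-or-nothing per function, while TCP gives partial credit. This is why TCP is never lower than re-executability under these definitions.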
Key quantitative results (averaged across optimization levels) (Feng et al., 2024, Wang et al., 18 Sep 2025):
| Method | Re-Compilability | Re-Executability | TCP |
|---|---|---|---|
| llm4decompile-6.7b | 0.928 | 0.498 | 0.598 |
| + sc²dec | 0.952 | 0.515 | 0.632 |
| + FAE | 0.951 | 0.523 | 0.650 |
| + sc²dec & FAE | 0.955 | 0.550 | 0.684 |
| SALT4Decompile | 0.968 | 0.587 | 0.704 |
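Relative to the llm4decompile-6.7b baseline in the table, sc²dec alone improves re-executability by 1.7 percentage points (0.498 to 0.515) and FAE alone by 2.5 pp (to 0.523). Combined, they gain 5.2 pp (to 0.550), which exceeds the sum of the individual gains. SALT4Decompile reaches a test-case pass rate of 0.704, an uplift of +10.6 pp over the baseline and +2.0 pp over sc²dec+FAE.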
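The percentage-point differences implied by the table rows can be checked with a few lines of arithmetic. The values below are copied from the table. The `delta_pp` helper and the column indices are illustrative.

```python
# Columns: (re-compilability, re-executability, TCP), copied from the table.
rows = {
    "llm4decompile-6.7b": (0.928, 0.498, 0.598),
    "+ sc²dec": (0.952, 0.515, 0.632),
    "+ FAE": (0.951, 0.523, 0.650),
    "+ sc²dec & FAE": (0.955, 0.550, 0.684),
    "SALT4Decompile": (0.968, 0.587, 0.704),
}
RE_EXEC, TCP = 1, 2


def delta_pp(method: str, reference: str, column: int) -> float:
    """Difference between two table rows in percentage points, rounded to 0.1."""
    return round((rows[method][column] - rows[reference][column]) * 100, 1)


base = "llm4decompile-6.7b"
print(delta_pp("+ sc²dec", base, RE_EXEC))                  # 1.7
print(delta_pp("+ FAE", base, RE_EXEC))                     # 2.5
print(delta_pp("+ sc²dec & FAE", base, RE_EXEC))            # 5.2
print(delta_pp("SALT4Decompile", base, TCP))                # 10.6
print(delta_pp("SALT4Decompile", "+ sc²dec & FAE", TCP))    # 2.0
```

The combined re-executability gain (5.2 pp) is larger than the sum of the two individual gains (1.7 + 2.5 = 4.2 pp).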
5. Robustness, Failure Modes, and User Study Insights
sc²dec’s improvements are robust to moderate changes in context-building (e.g., compiler variants and optimization mismatches cost only 1–2 pp). Its reliance on recompilability constitutes a known point of failure: if the initial LLM output fails to compile, the method cannot progress beyond the first candidate.
SALT4Decompile is explicitly resilient to code obfuscation: it outperforms prior LLM baselines under four common LLVM-based transformations, with a TCP margin of at least 5 pp for each function and obfuscation combination. In user studies, professional and novice reverse engineers consistently rate logic-driven decompiled outputs as more comprehensible, and as carrying richer semantic cues, than the outputs of baseline LLMs or traditional decompilers (Wang et al., 18 Sep 2025).
6. Limitations and Prospective Enhancements
sc²dec is constrained by:
- Dependence on LLM output recompilability,
- Data limitations in FAE (a 10,000-function corpus and four optimization levels),
- Potential decline in one-shot capabilities with aggressive statement-level fine-tuning.
Potential directions for enhancement include:
- Expanding the set of self-constructed contexts, including partially recompilable code,
- Scaling training to more diverse code bases,
- Jointly optimizing the formatting of in-context demonstrations to maintain model generality,
- Dynamically adapting context scoring parameters according to input complexity (Feng et al., 2024).
7. Broader Impact and Related Work
Self-Constructed Context Decompilation (sc²dec) advances neural decompilation by combining inference-time demonstration construction with alignment-based LLM tuning. Its principles underpin later systems such as SALT4Decompile, which uses logic-structured abstraction for semantic recovery and functional robustness and has been validated on benchmarks and in user studies. The reported evidence suggests that this paradigm not only outperforms the prior state of the art on quantitative metrics but also improves analyst comprehension, which makes it relevant to both automated and human-in-the-loop reverse engineering (Feng et al., 2024, Wang et al., 18 Sep 2025).