EFACT: an External Function Auto-Completion Tool to Strengthen Static Binary Lifting
Abstract: Static binary lifting is essential in binary rewriting frameworks. Existing tools overlook the impact of External Function Completion (EXFC) in static binary lifting. EXFC recovers the prototypes of External Functions (EXFs, functions defined in standard shared libraries) using only the available function symbols. Incorrect EXFC can lead to misinterpretation of the source binary or to memory overflows in static binary translation, which eventually results in program crashes. Notably, existing tools struggle to recover the prototypes of mangled EXFs in binaries compiled from C++. Moreover, they require time-consuming manual processing to support new libraries. This paper presents EFACT, an External Function Auto-Completion Tool for static binary lifting. Our EXF recovery algorithm better recovers the prototypes of mangled EXFs, particularly addressing the template specialization mechanism in C++. EFACT is designed as a lightweight plugin that strengthens EXFC in other static binary rewriting frameworks. Our evaluation shows that EFACT outperforms RetDec and McSema in mangled EXF recovery by 96.4% and 97.3%, respectively, on SPEC CPU 2017. Furthermore, we delve deeper into static binary translation and address several cross-ISA EXFC problems. When integrated with McSema, EFACT correctly translates 36.7% more benchmarks from x86-64 to x86-64 and 93.6% more from x86-64 to AArch64 than McSema alone on EEMBC.
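To make the idea of recovering a prototype from a symbol alone more concrete, the following is a minimal sketch, not EFACT's algorithm. It assumes the Itanium C++ ABI mangling scheme and handles only a tiny subset of it: unscoped function names whose parameters are builtin types, pointers, or const-qualified types. The function names `parse_type` and `recover_prototype` are hypothetical and chosen only for illustration. Nested names, templates, and substitutions are deliberately rejected.

```python
# Illustrative sketch only: a tiny subset of Itanium C++ ABI demangling.
# It is NOT the EFACT recovery algorithm; names here are hypothetical.

# Single-letter builtin type codes defined by the Itanium C++ ABI.
BUILTIN_TYPES = {
    "v": "void", "b": "bool", "c": "char", "s": "short",
    "i": "int", "j": "unsigned int", "l": "long",
    "m": "unsigned long", "f": "float", "d": "double",
}


def parse_type(symbol, pos):
    """Parse one type code starting at symbol[pos]; return (type, next_pos)."""
    code = symbol[pos]
    if code == "P":  # pointer to the type that follows
        inner, pos = parse_type(symbol, pos + 1)
        return inner + "*", pos
    if code == "K":  # const-qualified version of the type that follows
        inner, pos = parse_type(symbol, pos + 1)
        return inner + " const", pos
    if code in BUILTIN_TYPES:
        return BUILTIN_TYPES[code], pos + 1
    raise ValueError(f"unsupported type code {code!r} at offset {pos}")


def recover_prototype(symbol):
    """Return (name, params); params is None when the symbol is unmangled."""
    if not symbol.startswith("_Z"):
        # Plain C symbol: the name carries no parameter information at all.
        return symbol, None

    # An unscoped <source-name> is a decimal length followed by the identifier.
    pos = 2
    start = pos
    while pos < len(symbol) and symbol[pos].isdigit():
        pos += 1
    if pos == start:
        raise ValueError("only unscoped names are supported in this sketch")
    length = int(symbol[start:pos])
    name = symbol[pos:pos + length]
    if len(name) != length:
        raise ValueError("truncated symbol name")
    pos += length

    # The remaining characters encode the parameter types in order.
    # Return types are not encoded for non-template functions.
    params = []
    while pos < len(symbol):
        param, pos = parse_type(symbol, pos)
        params.append(param)
    if params == ["void"]:  # a lone 'v' means an empty parameter list
        params = []
    return name, params
```

For example, `recover_prototype("_Z3barPKc")` yields the name `bar` with one `char const*` parameter, while a plain C symbol such as `printf` yields no parameter information, which is why unmangled EXFs need external prototype knowledge. A template-heavy symbol such as `_ZNSt6vectorIiE9push_backEOi` is rejected by this sketch, hinting at the extra work needed for the template specialization cases the abstract mentions.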