A System-Level Dynamic Binary Translator using Automatically-Learned Translation Rules (2402.09688v1)
Abstract: System-level emulators have been used extensively for system design, debugging and evaluation. They work by providing a system-level virtual machine that supports a guest operating system (OS) running on a platform with the same or a different native OS and the same or a different instruction-set architecture. Dynamic binary translation (DBT) is one of the core technologies for such system-level emulation. A recently proposed learning-based DBT approach has shown significantly improved performance and higher-quality translated code by using automatically learned translation rules. However, it has only been applied to user-level emulation, not yet to system-level emulation. In this paper, we explore the feasibility of applying this approach to improve system-level emulation, and use QEMU to build a prototype. ... To achieve better performance, we leverage several optimizations: coordination overhead reduction, which lowers the cost of each coordination, and coordination elimination and code scheduling, which lower the coordination frequency. Experimental results show that the prototype achieves an average 1.36X speedup over QEMU 6.1, with negligible coordination overhead, in system emulation mode using SPEC CINT2006 as application benchmarks, and 1.15X on real-world applications.