mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis (2310.04196v1)

Published 6 Oct 2023 in cs.PL, cs.CL, cs.DC, and cs.PF

Abstract: MLIR is an emerging compiler infrastructure for modern hardware, but existing programs cannot take advantage of MLIR's high-performance compilation if they are described in lower-level general-purpose languages. To avoid rewriting such programs manually, there have been efforts to automatically raise lower-level dialects to higher-level dialects in MLIR. However, current methods rely on manually defined raising rules, which limit their applicability and make them challenging to maintain as MLIR dialects evolve. We present mlirSynth, a novel approach that translates programs from lower-level MLIR dialects to high-level ones without manually defined rules. Instead, it uses available dialect definitions to construct a program space and searches it effectively using type constraints and equivalences. We demonstrate its effectiveness by raising C programs to two distinct high-level MLIR dialects, which enables us to use existing high-level dialect-specific compilation flows. On Polybench, we show a greater coverage than previous approaches, resulting in geomean speedups of 2.5x (Intel) and 3.4x (AMD) over state-of-the-art compilation flows for the C programming language. mlirSynth also enables retargetability to domain-specific accelerators, resulting in a geomean speedup of 21.6x on a TPU.
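
To make the search idea in the abstract concrete, here is a minimal Python sketch of bottom-up enumerative synthesis pruned by type constraints and by equivalence of candidates on test inputs. It is not mlirSynth's implementation: the three-operation toy "dialect" in OPS, the `reference` loop, the `synthesize` function, and the test inputs are all hypothetical, and agreement on a handful of inputs is only evidence of equivalence, not a proof.

```python
from itertools import product

# Hypothetical toy "dialect": op name -> (operand types, result type, semantics).
OPS = {
    "add": (("vec", "vec"), "vec", lambda a, b: tuple(x + y for x, y in zip(a, b))),
    "mul": (("vec", "vec"), "vec", lambda a, b: tuple(x * y for x, y in zip(a, b))),
    "sum": (("vec",), "scalar", lambda a: sum(a)),
}


def reference(a, b):
    """Low-level loop to raise (assumed example): a dot product."""
    acc = 0
    for i in range(len(a)):
        acc += a[i] * b[i]
    return acc


def synthesize(reference, arg_names, arg_types, result_type, tests, max_rounds=3):
    """Return a high-level expression matching `reference` on `tests`, or None."""
    expected = tuple(reference(*t) for t in tests)
    pool = []     # candidates: (expression, type, outputs on every test input)
    seen = set()  # (type, outputs) pairs already represented in the pool
    for i, (name, ty) in enumerate(zip(arg_names, arg_types)):
        outs = tuple(t[i] for t in tests)
        seen.add((ty, outs))
        pool.append((name, ty, outs))

    for _ in range(max_rounds):
        new = []
        for op, (in_types, out_type, fn) in OPS.items():
            # Type constraint: each operand slot only accepts matching candidates.
            choices = [[c for c in pool if c[1] == ty] for ty in in_types]
            for args in product(*choices):
                outs = tuple(fn(*vals) for vals in zip(*(a[2] for a in args)))
                key = (out_type, outs)
                if key in seen:
                    continue  # behaves like an existing candidate: prune it
                seen.add(key)
                expr = f"{op}({', '.join(a[0] for a in args)})"
                if out_type == result_type and outs == expected:
                    return expr
                new.append((expr, out_type, outs))
        pool.extend(new)
    return None


TESTS = [((1, 2, 3), (4, 5, 6)), ((0, 1, 0), (7, 8, 9)), ((2, 2), (3, 5))]
result = synthesize(reference, ["a", "b"], ["vec", "vec"], "scalar", TESTS)
print(result)  # sum(mul(a, b))
```

In this sketch, `mul(b, a)` is never kept because it produces the same outputs as `mul(a, b)` on every test input, which is the kind of pruning that keeps the enumerated program space small.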

Authors (4)
  1. Alexander Brauckmann (4 papers)
  2. Elizabeth Polgreen (20 papers)
  3. Tobias Grosser (21 papers)
  4. Michael F. P. O'Boyle (14 papers)
Citations (1)