Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation (2306.15121v1)

Published 27 Jun 2023 in cs.AI, cs.ET, and cs.PL

Abstract: We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes across a variety of language-supported programming models, including (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SYCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offload] and OpenACC), (3) Python (e.g., numpy, Numba, cuPy, and pyCUDA), and (4) Julia (e.g., Threads, CUDA.jl, AMDGPU.jl, and KernelAbstractions.jl). We use the GitHub Copilot capabilities powered by OpenAI Codex, as available in Visual Studio Code in April 2023, to generate a large number of implementations from simple <kernel> + <programming model> + <optional hints> prompt variants. To quantify and compare the results, we propose a proficiency metric based on the initial 10 suggestions given for each prompt. Results suggest that OpenAI Codex outputs for C++ correlate with the adoption and maturity of programming models: for example, OpenMP and CUDA score highly, whereas HIP is still lacking. We found that prompts in either a targeted language such as Fortran or the more general-purpose Python benefit from added code keywords, while Julia prompts perform acceptably well for its mature programming models (e.g., Threads and CUDA.jl). We expect these benchmarks to provide a point of reference for each programming model's community. Overall, understanding the convergence of LLMs, AI, and HPC is crucial given its rapidly evolving nature and how it is redefining human-computer interaction.
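For context, the benchmark kernels are short numerical routines, and the prompts pair a kernel name with a programming model. Below is a minimal sketch of what an OpenMP AXPY implementation in C++ might look like, illustrating the kind of output a <kernel> + <programming model> prompt elicits; the function name and signature here are illustrative assumptions, not the paper's exact generated code.

#include <cstddef>

// AXPY: y = a*x + y, the simplest of the paper's benchmark kernels.
// A plausible OpenMP (CPU-threaded) C++ implementation of the kind
// Codex is prompted to produce; the signature is an assumption.
void axpy(std::size_t n, double a, const double* x, double* y) {
    // The loop is embarrassingly parallel, so a single worksharing
    // pragma distributes iterations across CPU threads.
    #pragma omp parallel for
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

Compiled with OpenMP enabled (e.g., g++ -fopenmp), this runs multithreaded; without the flag the pragma is ignored and the code still runs serially. The paper's OpenMP offload variants would instead target GPU execution, e.g., via a target-style directive, which is one of the prompt variants evaluated.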
