
OMPGPT: A Generative Pre-trained Transformer Model for OpenMP (2401.16445v3)

Published 28 Jan 2024 in cs.SE, cs.DC, and cs.LG

Abstract: LLMs such as ChatGPT have significantly advanced the field of NLP. This trend led to the development of code-based LLMs such as StarCoder, WizardCoder, and CodeLlama, which are trained extensively on vast repositories of code and programming languages. While the generic abilities of these code LLMs are useful for many programmers in tasks like code generation, the area of high-performance computing (HPC) has a narrower set of requirements that make a smaller and more domain-specific model a smarter choice. This paper presents OMPGPT, a novel domain-specific model meticulously designed to harness the inherent strengths of LLMs for OpenMP pragma generation. Furthermore, we leverage prompt engineering techniques from the NLP domain to create Chain-of-OMP, an innovative strategy designed to enhance OMPGPT's effectiveness. Our extensive evaluations demonstrate that OMPGPT outperforms existing LLMs specialized in OpenMP tasks and maintains a notably smaller size, aligning it more closely with the typical hardware constraints of HPC environments. We consider our contribution a pivotal bridge, connecting the advantages of LLMs with the specific demands of HPC tasks.

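To make the task concrete, the following is a minimal, illustrative C example of the kind of input/output pair a pragma-generation model such as OMPGPT targets: a serial loop, and an OpenMP pragma that such a model might emit for it. The loop and the chosen pragma are assumptions for illustration only, not an example taken from the paper, and the Chain-of-OMP prompt format itself is not shown.

  #include <stdio.h>

  #define N 1000000

  int main(void) {
      static double a[N], b[N], c[N];

      /* Initialize the input arrays. */
      for (int i = 0; i < N; i++) {
          b[i] = i * 0.5;
          c[i] = i * 2.0;
      }

      /* A serial loop like this is the kind of code region fed to the model;
         the pragma below is one plausible model output (illustrative only).
         The program also compiles and runs serially if OpenMP is disabled. */
      #pragma omp parallel for
      for (int i = 0; i < N; i++) {
          a[i] = b[i] + c[i];
      }

      printf("a[N-1] = %f\n", a[N - 1]);
      return 0;
  }
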
References (30)
  1. OpenMP Offload Features and Strategies for High Performance across Architectures and Compilers. In 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 564–573.
  2. GPT-NeoX-20B: An Open-Source Autoregressive Language Model. arXiv:2204.06745 [cs.CL]
  3. GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715
  4. CompCodeVet: A Compiler-guided Validation and Enhancement Approach for Code Dataset. https://doi.org/10.48550/arXiv.2311.06505 arXiv:2311.06505 [cs]
  5. Lm4hpc: Towards effective language model application in high-performance computing. In International Workshop on OpenMP. Springer, 18–33.
  6. Learning to Parallelize with OpenMP by Augmented Heterogeneous AST Representation. Proceedings of Machine Learning and Systems 5 (2023).
  7. Evaluating Large Language Models Trained on Code. arXiv:2107.03374 [cs.LG]
  8. L. Dagum and R. Menon. 1998. OpenMP: an industry standard API for shared-memory programming. IEEE Computational Science and Engineering 5, 1 (1998), 46–55. https://doi.org/10.1109/99.660313
  9. HPC-GPT: Integrating Large Language Model for High-Performance Computing. In Proceedings of the SC ’23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (Denver CO USA, 2023-11-12). ACM, 951–960. https://doi.org/10.1145/3624062.3624172
  10. Performance Optimization using Multimodal Modeling and Heterogeneous GNN. In Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing. 45–57.
  11. Power Constrained Autotuning using Graph Neural Networks. arXiv preprint arXiv:2302.11467 (2023).
  12. Sidong Feng and Chunyang Chen. [n. d.]. Prompting Is All You Need: Automated Android Bug Replay with Large Language Models. arXiv. https://doi.org/10.48550/arXiv.2306.01987 arXiv:2306.01987 [cs]
  13. The Pile: An 800GB Dataset of Diverse Text for Language Modeling. arXiv preprint arXiv:2101.00027 (2020).
  14. Quantifying OpenMP: Statistical Insights into Usage and Adoption. In 2023 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 1–7.
  15. Domain-Specific Code Language Models: Unraveling the Potential for HPC Codes and Tasks. arXiv preprint arXiv:2312.13322 (2023).
  16. Advising OpenMP Parallelization via a Graph-Based Approach with Transformers. arXiv preprint arXiv:2305.11999 (2023).
  17. The Stack: 3 TB of permissively licensed source code. Preprint (2022).
  18. Structured Chain-of-Thought Prompting for Code Generation. arXiv:2305.06599 [cs] http://arxiv.org/abs/2305.06599
  19. StarCoder: may the source be with you! arXiv:2305.06161 [cs.CL]
  20. Improving ChatGPT Prompt for Code Generation. https://doi.org/10.48550/arXiv.2305.08360 arXiv:2305.08360 [cs]
  21. WizardCoder: Empowering Code Large Language Models with Evol-Instruct. arXiv:2306.08568 [cs.CL]
  22. Modeling Parallel Programs using Large Language Models. arXiv:2306.17281 [cs.DC]
  23. Code Llama: Open Foundation Models for Code. arXiv:2308.12950 [cs.CL]
  24. Achieving High-Level Software Component Summarization via Hierarchical Chain-of-Thought Prompting and Static Code Analysis. In 2023 IEEE International Conference on Data and Software Engineering (ICoDSE). IEEE, 7–12.
  25. Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv:2307.09288 [cs.CL]
  26. Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation. arXiv:2309.07103 [cs.SE]
  27. Attention is all you need. Advances in neural information processing systems 30 (2017).
  28. Ben Wang and Aran Komatsuzaki. 2021. GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model. https://github.com/kingoflolz/mesh-transformer-jax.
  29. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv:2201.11903 [cs.CL]
  30. A systematic evaluation of large language models of code. In Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming (San Diego, CA, USA) (MAPS 2022). Association for Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/3520312.3534862
Authors (7)
  1. Arijit Bhattacharjee (4 papers)
  2. Nesreen Ahmed (18 papers)
  3. Niranjan Hasabnis (21 papers)
  4. Gal Oren (38 papers)
  5. Vy Vo (12 papers)
  6. Ali Jannesari (56 papers)
  7. Le Chen (71 papers)
Citations (8)
