HPC-Coder: Modeling Parallel Programs using Large Language Models (2306.17281v2)
Abstract: Parallel programs in high performance computing (HPC) continue to grow in complexity and scale in the exascale era. The diversity in hardware and parallel programming models makes developing, optimizing, and maintaining parallel software even more burdensome for developers. One way to alleviate some of these burdens is with automated development and analysis tools. Such tools can perform complex and/or remedial tasks for developers that increase their productivity and decrease the chance for error. Until recently, such tools for code development and performance analysis have been limited in the complexity of tasks they can perform, especially for parallel programs. However, with recent advancements in language modeling, and the availability of large amounts of open-source code-related data, these tools have started to utilize predictive language models to automate more complex tasks. In this paper, we show how LLMs can be applied to tasks specific to high performance and scientific codes. We introduce a new dataset of HPC and scientific codes and use it to fine-tune several pre-trained models. We compare several pre-trained LLMs on HPC-related tasks and introduce a new model, HPC-Coder, fine-tuned on parallel codes. In our experiments, we show that this model can auto-complete HPC functions where generic models cannot, decorate for loops with OpenMP pragmas, and model performance changes in scientific application repositories as well as programming competition solutions.
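The loop-decoration task mentioned in the abstract is concrete enough to illustrate with a minimal sketch. The daxpy-style kernel below is hypothetical and not taken from the paper's dataset; the idea is that the model receives the undecorated loop as input and is asked to generate the `#pragma omp parallel for` line that parallelizes it.

```c
#include <stddef.h>

/* Illustrative OpenMP decoration example (assumed, not from the paper):
   the plain for loop is the model's input, and the pragma line is the
   kind of output the decoration task expects. Compile with -fopenmp. */
void daxpy(size_t n, double a, const double *x, double *y) {
    #pragma omp parallel for
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```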
- Daniel Nichols (10 papers)
- Aniruddha Marathe (5 papers)
- Harshitha Menon (12 papers)
- Todd Gamblin (13 papers)
- Abhinav Bhatele (33 papers)