Performance Evaluation of General Purpose Large Language Models for Basic Linear Algebra Subprograms Code Generation (2507.04697v1)

Published 7 Jul 2025 in cs.LG, cs.DC, and cs.MS

Abstract: Generative AI technology based on large language models (LLMs) has been developed and applied to assist with or automatically generate program code. In this paper, we evaluate the capability of existing general-purpose LLMs for Basic Linear Algebra Subprograms (BLAS) code generation for CPUs. We use two LLMs provided by OpenAI: GPT-4.1, a Generative Pre-trained Transformer (GPT) model, and o4-mini, one of the o-series reasoning models. Both were released in April 2025. For routines from level-1 to level-3 BLAS, we tried to generate (1) C code without optimization from the routine name only, (2) C code with basic performance optimizations (thread parallelization, SIMD vectorization, and cache blocking) from the routine name only, and (3) C code with basic performance optimizations based on the Fortran reference code. As a result, we found that correct code can be generated in many cases even when only the routine name is given. We also confirmed that thread parallelization with OpenMP, SIMD vectorization, and cache blocking can be implemented to some extent, and that the generated code is faster than the reference code.
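The three optimizations the abstract names (OpenMP thread parallelization, SIMD vectorization, and cache blocking) come together most naturally in a level-3 routine. The sketch below is illustrative only and is not the paper's generated code: it shows the kind of optimized output this evaluation targets for a DGEMM-like routine (row-major C = alpha*A*B + beta*C, no transpose). The function name `dgemm_blocked` and the block size `BS` are assumptions made for this example.

```c
/* Illustrative sketch (not the paper's generated code) of a level-3 BLAS
 * routine combining the three optimizations the paper asks the LLMs to
 * apply: OpenMP thread parallelization, SIMD vectorization, and cache
 * blocking. Row-major layout; computes C = alpha*A*B + beta*C. */
#include <stddef.h>
#include <omp.h>

#define BS 64  /* assumed cache-block size; would normally be tuned */

void dgemm_blocked(size_t M, size_t N, size_t K,
                   double alpha, const double *A, const double *B,
                   double beta, double *C)
{
    /* Scale C by beta up front so the blocked updates can accumulate. */
    #pragma omp parallel for
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < N; j++)
            C[i * N + j] *= beta;

    /* Cache blocking: walk BS x BS tiles. Parallelizing over row blocks
     * of C keeps each thread's writes disjoint, so no race on C. */
    #pragma omp parallel for schedule(static)
    for (size_t ii = 0; ii < M; ii += BS)
        for (size_t kk = 0; kk < K; kk += BS)
            for (size_t jj = 0; jj < N; jj += BS) {
                size_t i_end = ii + BS < M ? ii + BS : M;
                size_t k_end = kk + BS < K ? kk + BS : K;
                size_t j_end = jj + BS < N ? jj + BS : N;
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double a = alpha * A[i * K + k];
                        /* SIMD vectorization over the contiguous j index. */
                        #pragma omp simd
                        for (size_t j = jj; j < j_end; j++)
                            C[i * N + j] += a * B[k * N + j];
                    }
            }
}
```

In the paper's setup, a routine like this would be judged first on correctness against the BLAS semantics and then on speed relative to the Fortran reference implementation.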
