
Compiler generated feedback for Large Language Models (2403.14714v1)

Published 18 Mar 2024 in cs.PL and cs.LG

Abstract: We introduce a novel paradigm in compiler optimization powered by LLMs with compiler feedback to optimize the code size of LLVM assembly. The model takes unoptimized LLVM IR as input and produces optimized IR, the best optimization passes, and instruction counts of both unoptimized and optimized IRs. We then compile the input with the generated optimization passes and evaluate whether the predicted instruction count is correct, the generated IR is compilable, and it corresponds to the compiled code. We provide this feedback back to the LLM and give it another chance to optimize the code. This approach adds an extra 0.53% improvement over -Oz to the original model. Even though adding more information with feedback seems intuitive, simple sampling techniques achieve much higher performance given 10 or more samples.

Compiler-Generated Feedback Enhances LLMs for LLVM IR Optimization

Introduction to Feedback-Directed Optimization

LLMs have increasingly been deployed in the field of software engineering, demonstrating proficiency in tasks like code generation, code translation, and even optimization at the source code level. However, extending these capabilities into compiler optimizations, specifically leveraging LLMs for navigating the LLVM Intermediate Representation (IR) optimization space, presents novel challenges and opportunities. In this context, we explore an innovative approach that uses compiler-generated feedback to enhance the optimization process directed by LLMs.

Optimizing LLVM IR with LLMs

The foundational idea of optimizing LLVM IR with LLMs revolves around directing these models to suggest optimization strategies that the compiler then executes. The pivotal work we discuss pioneered the use of LLMs to recommend LLVM optimization passes based on the unoptimized IR, predicting not only the suitable optimization passes but also estimating the optimized IR's instruction count. This methodology demonstrated an improvement in reducing code size beyond the default -Oz optimization level provided by LLVM, thus highlighting the potential of LLMs in compiler optimization domains.
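One of the quantities the model is asked to predict is the instruction count of the IR before and after optimization. As a rough illustration (not the paper's tooling, which gets this number from the compiler itself), a crude text-based counter over LLVM IR might look like this; the parsing heuristic is an assumption for demonstration only:

```python
def count_ir_instructions(ir_text: str) -> int:
    """Roughly count instructions in a textual LLVM IR module.

    Counts non-empty lines inside function bodies, skipping comments
    and basic-block labels. A crude stand-in for the compiler's own
    instruction-count statistic.
    """
    count = 0
    in_function = False
    for line in ir_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("define"):
            in_function = True
            continue
        if stripped == "}":
            in_function = False
            continue
        if not in_function or not stripped:
            continue
        # Skip comments (';') and labels ('entry:').
        if stripped.startswith(";") or stripped.endswith(":"):
            continue
        count += 1
    return count


example_ir = """
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  ret i32 %sum
}
"""
print(count_ir_instructions(example_ir))  # 2
```

A real pipeline would instead compile the IR with `opt` and read the count from compiler statistics, but the sketch shows what "instruction count" refers to in the model's output.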

Compiler-Generated Feedback Mechanism

The main contribution of the discussed work is the introduction of a compiler-generated feedback mechanism that aims to sharpen the model's optimization recommendations. This mechanism operates in a cyclical fashion, where the model initially predicts optimization strategies and outcomes, which are then compiled and evaluated for their accuracy and effectiveness. Based on this evaluation, the model is provided with feedback regarding the validity of the pass list, the precision of instruction count predictions, and the compiled versus predicted IR's fidelity. This feedback loop allows the model to refine its predictions in subsequent iterations.
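The cyclical process above can be sketched as a small control loop. The functions `query_model` and `compile_with_passes` are hypothetical stand-ins for the LLM and the compiler; only the control flow (predict, verify, re-prompt with feedback) mirrors the paper:

```python
def optimize_with_feedback(ir, query_model, compile_with_passes, max_rounds=2):
    """Ask the model for a pass list plus predictions, verify them with
    the compiler, and re-prompt with structured feedback if wrong.

    `query_model(prompt)` is assumed to return a dict with keys
    "passes", "predicted_count", and "ir"; `compile_with_passes`
    returns the compiled IR and its actual instruction count.
    """
    prompt = f"Optimize this LLVM IR for size:\n{ir}"
    best = None
    for _ in range(max_rounds):
        reply = query_model(prompt)
        compiled_ir, actual_count = compile_with_passes(ir, reply["passes"])
        if best is None or actual_count < best[1]:
            best = (reply["passes"], actual_count)
        if (actual_count == reply["predicted_count"]
                and compiled_ir == reply["ir"]):
            break  # predictions verified; no feedback needed
        # Feed the discrepancy back and let the model try again.
        prompt += (
            f"\nFeedback: predicted {reply['predicted_count']} instructions,"
            f" but compiling with {reply['passes']} gave {actual_count}."
            " Try again."
        )
    return best
```

The returned pair is the best pass list seen so far and its measured instruction count, so a failed second round can never make the result worse than the first.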

Feedback Variants

The paper explores three distinct forms of feedback:

  • Short Feedback: Incorporates metrics and error messages, focusing on compact, essential feedback.
  • Long Feedback: Extends Short Feedback by including the compiled IR, providing a detailed context for model refinement.
  • Fast Feedback: Prioritizes prompt generation speed by omitting metrics that require the generation of IR, offering a swift feedback cycle.
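The three variants differ only in which pieces of evaluation output are folded back into the prompt. A minimal sketch, with illustrative field names and wording (the paper's exact prompt templates differ), might assemble them like this:

```python
def build_feedback(kind, metrics, error_msgs, compiled_ir=None):
    """Assemble one of the three feedback prompts.

    "short" includes metrics and error messages; "long" additionally
    appends the compiled IR; "fast" keeps only checks that do not
    require generating or compiling IR, e.g. pass-list validity.
    """
    if kind == "fast":
        lines = [f"pass list valid: {metrics['pass_list_valid']}"]
    else:
        lines = [f"{k}: {v}" for k, v in metrics.items()] + error_msgs
        if kind == "long" and compiled_ir is not None:
            lines.append("compiled IR:\n" + compiled_ir)
    return "\n".join(lines)


metrics = {"pass_list_valid": True, "instruction_count_correct": False}
short = build_feedback("short", metrics, ["count mismatch: predicted 10, got 12"])
long_fb = build_feedback("long", metrics, [], compiled_ir="ret i32 0")
fast = build_feedback("fast", metrics, [])
```

The trade-off is visible in the sketch: Long Feedback carries the most context but the longest prompt, while Fast Feedback can be produced without waiting for the model to finish generating IR.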

Experimental Insights and Evaluations

The research presents a comprehensive evaluation framework that compares the performance of feedback-directed LLMs against baseline models and explores the efficacy of various feedback forms. Key findings from these experiments include:

  • The feedback mechanism, particularly the Fast Feedback variant, achieved a noteworthy improvement over the baseline optimization model, enhancing the performance gain to 3.4% over the -Oz optimization level from an initial 2.87%.
  • Sampling techniques, wherein multiple optimization strategies are generated and evaluated, demonstrated substantial potential, with the baseline model achieving near-parity with expert-designed optimization sequences given sufficient samples.
  • While the iterative feedback approach showed promise, it could not outperform the more straightforward sampling strategy, indicating an area for future exploration and potential improvement.
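The sampling baseline that feedback failed to beat is essentially best-of-N search: draw several candidate pass lists, compile each, and keep the smallest result. In the paper the candidates come from the LLM sampled at nonzero temperature; in this sketch a random sampler over a hypothetical pass pool stands in, and `evaluate` is a placeholder for compiling with a pass list:

```python
import random


def sample_best_pass_list(evaluate, candidate_passes, n_samples=10, seed=0):
    """Best-of-N sampling: draw n candidate pass lists, score each with
    `evaluate` (lower instruction count is better), keep the best."""
    rng = random.Random(seed)
    best_passes, best_count = None, float("inf")
    for _ in range(n_samples):
        passes = rng.sample(candidate_passes, k=3)
        count = evaluate(passes)
        if count < best_count:
            best_passes, best_count = passes, count
    return best_passes, best_count
```

Because every sample is independent, this baseline parallelizes trivially, which helps explain why it overtakes the sequential feedback loop once 10 or more samples are allowed.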

Implications and Future Directions

The introduction of compiler-generated feedback for LLM-driven LLVM IR optimization opens new avenues for research and practical application. This mechanism not only enhances the model's optimization capabilities but also provides insights into the model's decision-making process, allowing for more targeted improvements. Future work may include refining the feedback mechanism, exploring its applicability to other aspects of compiler optimization, and integrating more advanced sampling strategies to further leverage the stochastic nature of LLMs in generating diverse optimization strategies.

Conclusion

The incorporation of compiler-generated feedback into the LLM-driven optimization process represents a significant step forward in harnessing the power of machine learning for compiler optimizations. By enabling models to refine their predictions through iterative feedback, this approach has the potential to drive noteworthy advancements in the efficiency and effectiveness of compiler optimizations, leading to more performant software and opening new frontiers in the integration of machine learning and compiler technology.

Authors (4)
  1. Dejan Grubisic (6 papers)
  2. Chris Cummins (23 papers)
  3. Volker Seeker (6 papers)
  4. Hugh Leather (23 papers)
Citations (2)