Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
125 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Automatically Tuning the GCC Compiler to Optimize the Performance of Applications Running on Embedded Systems (1703.08228v2)

Published 23 Feb 2017 in cs.DC and cs.PF

Abstract: This paper introduces a novel method for automatically tuning the selection of compiler flags to optimize the performance of software intended to run on embedded hardware platforms. We begin by developing our approach on code compiled by the GNU C Compiler (GCC) for the ARM Cortex-M3 (CM3) processor; and we show how our method outperforms the industry standard -O3 optimization level across a diverse embedded benchmark suite. First we quantify the potential gains by using existing iterative compilation approaches that time-intensively search for optimal configurations for each benchmark. Then we adapt iterative compilation to output a single configuration that optimizes performance across the entire benchmark suite. Although this is a time-consuming process, our approach constructs an optimized variation of -O3, which we call -Ocm3, that realizes nearly two thirds of known available gains on the CM3 and significantly outperforms a more complex state-of-the-art predictive method in cross-validation experiments. Finally, we demonstrate our method on additional platforms by constructing two more optimization levels that find even more significant speed-ups on the ARM Cortex-A8 and 8-bit AVR processors.

Citations (5)

Summary

We haven't generated a summary for this paper yet.