Hardware Counted Profile-Guided Optimization (1411.6361v1)

Published 24 Nov 2014 in cs.PL

Abstract: Profile-Guided Optimization (PGO) is an effective way to improve the performance of a compiled program: the execution path data it provides helps the compiler generate better code and better cache-line packing. At the time of this writing, compilers only support instrumentation-based PGO. This has proved effective for optimizing programs, but few projects use it because of its complicated dual-compilation model and its high overhead. Our solution, sampling Hardware Performance Counters, overcomes these drawbacks. In this paper, we propose a PGO solution for GCC that samples Last Branch Record (LBR) events and uses debug symbols to recover the source locations of binary instructions. Because it relies on LBR sampling, the generated profiles are very accurate. This solution achieved an average of 83% of the gains obtained with instrumentation-based PGO, and 93% on the C++ benchmarks alone. The profiling overhead is only 1.06% on average, whereas instrumentation incurs a 16% overhead on average.
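
To make the idea of turning sampled branch records into a source-level profile concrete, here is a minimal Python sketch. It is not the paper's implementation or GCC's profile format: the line table, the addresses, the sample data, and the function names `addr_to_line` and `edge_profile` are all hypothetical, chosen only to show the general shape of mapping (source address, target address) pairs to source locations and counting how often each branch edge was observed.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical line table, as might be derived from debug symbols:
# (start_address, "file:line"), sorted by start address.
LINE_TABLE = [
    (0x1000, "main.c:10"),
    (0x1010, "main.c:11"),
    (0x1020, "main.c:14"),
    (0x1040, "main.c:20"),
]


def addr_to_line(addr, table=LINE_TABLE):
    """Return the source location whose address range contains addr, or None."""
    starts = [start for start, _ in table]
    i = bisect_right(starts, addr) - 1
    return table[i][1] if i >= 0 else None


def edge_profile(samples, table=LINE_TABLE):
    """Count branch edges between source locations.

    Each sample is a list of (from_address, to_address) pairs, standing in
    for the contents of one sampled branch record stack.
    """
    counts = Counter()
    for sample in samples:
        for src, dst in sample:
            edge = (addr_to_line(src, table), addr_to_line(dst, table))
            counts[edge] += 1
    return counts


# Made-up samples for illustration only.
SAMPLES = [
    [(0x1014, 0x1020), (0x1024, 0x1040)],
    [(0x1014, 0x1020), (0x1018, 0x1000)],
]

profile = edge_profile(SAMPLES)
for (src, dst), n in sorted(profile.items()):
    print(f"{src} -> {dst}: {n}")
```

In this toy model, edges seen more often would be treated as hotter paths; how a real compiler consumes such counts is outside the scope of the sketch.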

Citations (8)
