Improving Memory Dependence Prediction with Static Analysis (2403.08056v3)
Abstract: This paper explores the potential of communicating information gained by static analysis from compilers to Out-of-Order (OoO) machines, focusing on the memory dependence predictor (MDP). The MDP enables loads to issue before all in-flight store addresses are known, while keeping memory order violations to a minimum. We use LLVM to find loads with no dependencies and label them via their opcode. These labelled loads skip lookups into the MDP, improving prediction accuracy by reducing false dependencies. We communicate this information in a minimally intrusive way, i.e. without introducing additional hardware costs or instruction bandwidth, so the improvements come with no additional overhead in the CPU. We find that in select cases in Spec2017, a significant number of load instructions can skip interacting with the MDP, leading to a performance gain. These results point to greater possibilities for static analysis as a source of near-zero-cost performance gains in future CPU designs.
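The effect of labelled loads skipping the MDP can be illustrated with a toy model. The following Python sketch is not the paper's implementation: the `ToyMDP` class, its PC-modulo indexing, the table size, and the `no_dep` flag (standing in for the opcode label) are all hypothetical names and simplifications chosen for illustration. It assumes that one source of false dependencies is aliasing in a small PC-indexed table, where a load inherits a "wait" prediction trained by an unrelated load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    pc: int               # program counter of the load instruction
    no_dep: bool = False  # hypothetical stand-in for the compiler's opcode label


class ToyMDP:
    """Toy PC-indexed dependence predictor (illustrative assumption only).

    The table is deliberately small so that different load PCs alias
    to the same entry.
    """

    def __init__(self, entries: int = 4) -> None:
        self.entries = entries
        self.table = [False] * entries  # True = predict "wait for older stores"
        self.lookups = 0                # number of predictor lookups performed

    def _index(self, pc: int) -> int:
        return pc % self.entries

    def train_violation(self, pc: int) -> None:
        """Record that the load at `pc` caused a memory order violation."""
        self.table[self._index(pc)] = True

    def must_wait(self, load: Load) -> bool:
        """Return True if the load should wait for older stores to resolve."""
        if load.no_dep:
            # Labelled load: bypass the predictor entirely.
            return False
        self.lookups += 1
        return self.table[self._index(load.pc)]


if __name__ == "__main__":
    mdp = ToyMDP(entries=4)
    mdp.train_violation(0x10)  # some other load violated; trains entry 0

    aliasing_load = Load(pc=0x20)                # 0x20 also maps to entry 0
    labelled_load = Load(pc=0x20, no_dep=True)   # same PC, but labelled

    print("unlabelled must wait:", mdp.must_wait(aliasing_load))
    print("labelled must wait:  ", mdp.must_wait(labelled_load))
    print("predictor lookups:   ", mdp.lookups)
```

In this sketch the unlabelled load at `0x20` is told to wait only because it shares a table entry with the load at `0x10`, while the labelled load issues immediately and does not touch the predictor at all. A real MDP, such as a store-set predictor, is considerably more involved; the model is only meant to make the bypass idea concrete.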