Verifying Peephole Rewriting In SSA Compiler IRs (2407.03685v1)

Published 4 Jul 2024 in cs.PL and cs.LO

Abstract: There is an increasing need for domain-specific reasoning in modern compilers. This has fueled the use of tailored intermediate representations (IRs) based on static single assignment (SSA), like in the MLIR compiler framework. Interactive theorem provers (ITPs) provide strong guarantees for the end-to-end verification of compilers (e.g., CompCert). However, modern compilers and their IRs evolve at a rate that makes proof engineering alongside them prohibitively expensive. Nevertheless, well-scoped push-button automated verification tools such as the Alive peephole verifier for LLVM-IR gained recognition in domains where SMT solvers offer efficient (semi) decision procedures. In this paper, we aim to combine the convenience of automation with the versatility of ITPs for verifying peephole rewrites across domain-specific IRs. We formalize a core calculus for SSA-based IRs that is generic over the IR and covers so-called regions (nested scoping used by many domain-specific IRs in the MLIR ecosystem). Our mechanization in the Lean proof assistant provides a user-friendly frontend for translating MLIR syntax into our calculus. We provide scaffolding for defining and verifying peephole rewrites, offering tactics to eliminate the abstraction overhead of our SSA calculus. We prove correctness theorems about peephole rewriting, as well as two classical program transformations. To evaluate our framework, we consider three use cases from the MLIR ecosystem that cover different levels of abstractions: (1) bitvector rewrites from LLVM, (2) structured control flow, and (3) fully homomorphic encryption. We envision that our mechanization provides a foundation for formally verified rewrites on new domain-specific IRs.

Authors (5)

Siddharth Bhat
Alex Keizer
Chris Hughes
Tobias Grosser
Andrés Goens

Summary

Verifying Peephole Rewriting in SSA Compiler IRs

The paper "Verifying Peephole Rewriting in SSA Compiler IRs," by Siddharth Bhat, Alex Keizer, Chris Hughes, Tobias Grosser, and Andrés Goens, addresses the growing need for domain-specific reasoning in modern compilers and presents a framework for verifying peephole rewrites on intermediate representations (IRs) based on static single assignment (SSA).

Introduction to SSA and Peephole Rewrites

Static single assignment (SSA) form is a property of compiler IRs that underpins many modern optimization techniques. In SSA, each variable is assigned exactly once, which simplifies data-flow analysis. This clarity aids optimizations such as peephole rewrites, which replace short sequences of instructions with semantically equivalent but more efficient sequences. Push-button tools such as the Alive peephole verifier for LLVM IR rely on SMT solvers for automation, and they work well in domains where those solvers offer efficient (semi-)decision procedures. They do not readily extend to new domain-specific IRs, while end-to-end verification in an interactive theorem prover (ITP) is expensive to maintain as compilers and their IRs evolve. The paper therefore aims to combine the convenience of automation with the versatility of ITPs.
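To make the idea of a peephole rewrite on SSA code concrete, here is a minimal Python sketch. It is not the paper's Lean mechanization: the names `Instr`, `evaluate`, `rewrite_mul2_to_add`, and `equivalent_up_to` are hypothetical, and the tiny instruction set (`const`, `add`, `mul` on unsigned bitvectors) is an assumption made for illustration. The rewrite replaces a multiplication by the constant 2 with an addition, and the checker compares both programs exhaustively for small bit widths.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Instr:
    dest: str    # SSA name defined by this instruction (assigned exactly once)
    op: str      # "const", "add", or "mul" (illustrative instruction set)
    args: tuple  # operand names, or a single int literal for "const"


def evaluate(program, inputs, width):
    """Run a straight-line SSA program on `width`-bit unsigned values."""
    mask = (1 << width) - 1
    env = {name: value & mask for name, value in inputs.items()}
    for ins in program:
        if ins.op == "const":
            env[ins.dest] = ins.args[0] & mask
        elif ins.op == "add":
            env[ins.dest] = (env[ins.args[0]] + env[ins.args[1]]) & mask
        elif ins.op == "mul":
            env[ins.dest] = (env[ins.args[0]] * env[ins.args[1]]) & mask
        else:
            raise ValueError(f"unknown op: {ins.op}")
    return env


def rewrite_mul2_to_add(program):
    """Peephole: replace `d = mul x, c` with `d = add x, x` when `c = const 2`."""
    consts = {i.dest: i.args[0] for i in program if i.op == "const"}
    out = []
    for ins in program:
        if ins.op == "mul" and consts.get(ins.args[1]) == 2:
            out.append(Instr(ins.dest, "add", (ins.args[0], ins.args[0])))
        else:
            out.append(ins)
    return out


def equivalent_up_to(before, after, inputs, result, max_width):
    """Bounded check: compare `result` on all inputs for widths 1..max_width."""
    for width in range(1, max_width + 1):
        for values in product(range(1 << width), repeat=len(inputs)):
            env = dict(zip(inputs, values))
            lhs = evaluate(before, env, width)[result]
            rhs = evaluate(after, env, width)[result]
            if lhs != rhs:
                return False
    return True


program = [
    Instr("c2", "const", (2,)),
    Instr("y", "mul", ("x", "c2")),
    Instr("z", "add", ("y", "x")),
]
optimized = rewrite_mul2_to_add(program)
print(equivalent_up_to(program, optimized, ["x"], "z", max_width=8))
```

The exhaustive loop only covers the widths it enumerates, so it is a test rather than a proof. A proof in an ITP can instead quantify over the width itself, which is the kind of guarantee the paper targets.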

Contributions and Framework Overview

The paper's primary contribution is introducing a framework for verifying SSA-based peephole rewriting, which leverages the capabilities of the Lean proof assistant. The core calculus for SSA-based IRs is designed to be generic, covering various domain-specific needs in the MLIR ecosystem.

Key Contributions:

Formalization of a Core Calculus for SSA-based IRs:
- Introduces a core calculus for SSA-based IRs that is generic over the user-defined IR and covers regions, the nested scoping construct used by many domain-specific IRs in the MLIR ecosystem.
Mechanization in Lean:
- Provides a user-friendly frontend that translates MLIR syntax into the core calculus, together with scaffolding for defining and verifying peephole rewrites.
- Provides automation via tactics that minimize abstraction overhead, simplifying the mechanical verification process.
Verification of Correctness:
- Proves correctness theorems for peephole rewriting and for two classical program transformations, and applies the framework to domain-specific IRs ranging from bitvector arithmetic to fully homomorphic encryption.
Handling of Side Effects:
- Extends the pure optimization framework to address scenarios involving side effects, crucial for practical compilation tasks.
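The idea of a calculus that is generic over the IR and supports regions can be sketched in Python as an evaluator parameterized by a dialect table. This is only an illustration under stated assumptions, not the paper's Lean definitions: `Op`, `Region`, `run_region`, `for_op`, and the dialect dictionaries are hypothetical names, and regions here are closed (they see only their own parameters), which is a simplification chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    dest: str                                    # SSA name bound to the result
    name: str                                    # key into the dialect table
    args: list                                   # names of SSA operands
    regions: list = field(default_factory=list)  # nested Region objects


@dataclass
class Region:
    params: list  # names bound on entry to the region
    body: list    # straight-line list of Op
    result: str   # name of the value the region yields


def run_region(region, values, dialect):
    """Evaluate a region; the meaning of every op comes from `dialect`."""
    env = dict(zip(region.params, values))
    for op in region.body:
        operands = [env[a] for a in op.args]
        # Nested regions are handed to the op's semantics as callables.
        closures = [
            lambda *vs, r=r: run_region(r, list(vs), dialect)
            for r in op.regions
        ]
        env[op.dest] = dialect[op.name](operands, closures)
    return env[region.result]


def for_op(operands, regions):
    """A bounded loop: apply the single body region `count` times."""
    count, acc = operands
    (body,) = regions
    for _ in range(count):
        acc = body(acc)
    return acc


# Two hypothetical dialects that share the loop but differ in arithmetic.
arith = {"add": lambda ops, _: ops[0] + ops[1], "for": for_op}
arith_mod8 = {**arith, "add": lambda ops, _: (ops[0] + ops[1]) % 8}

double_body = Region(["acc"], [Op("next", "add", ["acc", "acc"])], "next")
main = Region(["n", "x"], [Op("r", "for", ["n", "x"], [double_body])], "r")

print(run_region(main, [3, 5], arith))       # doubles 5 three times
print(run_region(main, [3, 5], arith_mod8))  # same program, arithmetic mod 8
```

The evaluator itself never mentions `add` or `for`. Swapping the dialect table changes the semantics without touching the program structure, which mirrors the separation between a generic core and user-defined operations.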

Evaluations and Case Studies

To validate their framework, the authors consider three distinct use cases within the MLIR ecosystem:

Arithmetic Bitvector Rewrites:
- Models the LLVM arithmetic operations with a focus on generalization across arbitrary bit widths. This goes beyond tools that are limited to fixed widths, and the authors demonstrate proof automation on the Alive test suite.
Structured Control Flow:
- Parametrizes scf operations over existing IRs, enabling reasoning about complex control structures such as nested if conditions and bounded for loops. The clear separation of pure and region-based computations aids in proving canonical transformations over loops, enhancing optimization potential.
Fully Homomorphic Encryption (FHE):
- Adapts the framework to reason about the algebraic structures inherent in FHE compilers. The case study includes algebraically complex rewrites in the 'Poly' IR, whose values live in the quotient ring $(\mathbb{Z}/q\mathbb{Z})[X]/(X^{2^n} + 1)$, showcasing the framework's suitability for high-level mathematical abstractions.
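For readers unfamiliar with the ring behind the FHE case study, the following Python sketch implements arithmetic in $(\mathbb{Z}/q\mathbb{Z})[X]/(X^{2^n} + 1)$ on coefficient lists. The function names `reduce_poly`, `add`, and `mul`, and the parameters `q = 17`, `n = 2`, are assumptions chosen for illustration and do not come from the paper. The key fact is that $X^{2^n} = -1$ in this ring, so high-degree terms wrap around with a sign flip.

```python
def reduce_poly(coeffs, q, n):
    """Reduce coefficients (lowest degree first) mod q and mod X^(2^n) + 1."""
    size = 1 << n
    out = [0] * size
    for degree, c in enumerate(coeffs):
        # X^size = -1, so X^degree = (-1)^(degree // size) * X^(degree % size).
        sign = -1 if (degree // size) % 2 else 1
        out[degree % size] = (out[degree % size] + sign * c) % q
    return out


def add(a, b, q, n):
    """Add two reduced elements coefficient-wise."""
    return reduce_poly([x + y for x, y in zip(a, b)], q, n)


def mul(a, b, q, n):
    """Multiply two reduced elements: schoolbook product, then reduction."""
    product = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            product[i + j] += x * y
    return reduce_poly(product, q, n)


q, n = 17, 2                                    # illustrative parameters, 2^n = 4
modulus = reduce_poly([1, 0, 0, 0, 1], q, n)    # X^4 + 1 reduces to zero
a = reduce_poly([3, 5, 0, 16], q, n)
x1 = [0, 1, 0, 0]                               # the polynomial X
x3 = [0, 0, 0, 1]                               # the polynomial X^3

print(modulus)                 # the zero element of the ring
print(add(a, modulus, q, n))   # adding X^4 + 1 leaves `a` unchanged
print(mul(x1, x3, q, n))       # X * X^3 = X^4 = -1, i.e. q - 1 in degree 0
```

Identities of this kind, which hold because of the quotient structure rather than because of bit-level reasoning, are the sort of facts that are natural to state and prove in an ITP with a mathematics library.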

Implications and Future Work

The presented framework provides both theoretical and practical advancements. Theoretical implications include a deeper understanding of the mechanization of SSA-based peephole rewriting while preserving semantic equivalences. Practically, it offers a scalable, automated tool for compiler developers, facilitating the integration of formal verification in day-to-day compiler development workflows.

Future developments may consider extending the framework to deeper representations of side-effects and the incorporation of richer proof obligations to model non-trivial semantic properties. Additionally, interfacing with state-of-the-art SAT solvers could push forward the bounds of automated proof capabilities.

Conclusion

"Verifying Peephole Rewriting in SSA Compiler IRs" introduces a robust, verified approach to optimizing within SSA frameworks. By marrying the formal rigor of ITPs with practical automation, this work promises to streamline sophisticated optimizations in next-generation compilers, paving the way for more reliable and efficient domain-specific IRs.
