Accelerating Elliptic Curve Point Additions on Versal AI Engine for Multi-scalar Multiplication (2502.11660v1)
Abstract: Multi-scalar multiplication (MSM) is crucial in cryptographic applications and computationally intensive in zero-knowledge proofs. MSM involves accumulating the products of scalars and points on an elliptic curve over a 377-bit modulus, and the Pippenger algorithm converts MSM into a series of elliptic curve point additions (PADDs) with high parallelism. This study investigates accelerating MSM on the Versal ACAP platform, an emerging hardware that employs a spatial architecture integrating 400 AI Engines (AIEs) with programmable logic and a processing system. AIEs are SIMD-based VLIW processors capable of performing vector multiply-accumulate operations, making them well-suited for multiplication-heavy workloads in PADD. Unlike simpler multiplication tasks in previous studies, cryptographic computations also require complex operations such as carry propagation. These operations necessitate architecture-aware optimizations, including intra-core dedicated coding style to fully exploit VLIW capabilities and inter-core strategy for spatial task mapping. We propose various optimizations to accelerate PADDs, including (1) algorithmic optimizations for carry propagation employing a carry-save-like technique to exploit VLIW and SIMD capabilities and (2) a comparison of four distinct spatial mappings to enhance intra- and inter-task parallelism. Our approach achieves a computational efficiency that utilizes 50.2% of the theoretical memory bandwidth and provides 568 speedup over the integrated CPU on the AIE evaluation board.
- Ayumi Ohno (2 papers)
- Kotaro Shimamura (2 papers)
- Shinya Takamaeda-Yamazaki (13 papers)