Precise Issue-to-Execute Pipeline Staging for Fixed-Latency Instructions
Characterize the exact pipeline structure between the issue stage and the operand-read/execute stages for fixed-latency instructions in NVIDIA Ampere GPUs, including the number, function, and timing of the intermediate stages, so that a single model matches all observed experimental cases.
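To make the acceptance criterion concrete, the following is a minimal sketch of what "a single model matches all observed experimental cases" could mean operationally. Everything in it is an assumption made for illustration: the `Case` record, the `fixed_stage_model` family of candidates, and the cycle counts are hypothetical placeholders, not measurements or models from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Case:
    """One experiment (hypothetical placeholder, not data from the paper)."""
    name: str
    observed_cycles: int  # issue-to-execute delay measured in this case


# A candidate model maps an experiment to a predicted delay in cycles.
Model = Callable[[Case], int]


def fixed_stage_model(stages: int) -> Model:
    """Candidate: a constant number of cycles between issue and execute."""
    return lambda case: stages


def consistent_models(candidates: Dict[str, Model], cases: List[Case]) -> List[str]:
    """Names of the candidates that predict the observed delay in every case."""
    return [
        name
        for name, predict in candidates.items()
        if all(predict(case) == case.observed_cycles for case in cases)
    ]


# Hypothetical candidate family: 1 to 4 intermediate cycles.
candidates = {f"{n}_stages": fixed_stage_model(n) for n in range(1, 5)}

# Placeholder measurements chosen only to exercise the check.
agreeing = [Case("a", 2), Case("b", 2)]
conflicting = [Case("a", 2), Case("b", 3)]

print(consistent_models(candidates, agreeing))
print(consistent_models(candidates, conflicting))
```

With the placeholder values, the agreeing set leaves exactly one candidate standing, while the conflicting set leaves none. The second outcome is the situation this problem describes: no candidate in the family survives every experiment, so the family itself has to be revised.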
References
We performed a multitude of experiments to unveil the pipeline structure between issue and execute, and we could not find a model that perfectly fits all the experiments.
— Analyzing Modern NVIDIA GPU cores
(2503.20481 - Huerta et al., 26 Mar 2025) in Section 5.1.1 (Issue Scheduler: Warp Readiness)