Precise Issue-to-Execute Pipeline Staging for Fixed-Latency Instructions

Characterize the exact pipeline structure between the issue stage and the operand-read/execute stages for fixed-latency instructions in NVIDIA Ampere GPUs, including the number, function, and timing of the intermediate stages, so that a single model matches all observed experimental cases.

Background

To explain the observed behavior, the paper proposes a Control stage followed by an Allocate stage between issue and operand-read for fixed-latency instructions, but acknowledges that this model does not perfectly match every experimental case.

The inability to find a model that fits all experiments leaves the exact pipeline staging unresolved, motivating a definitive characterization.
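
As a rough illustration of what "a single model matches all observed experimental cases" would mean in practice, the sketch below encodes the proposed stage order (issue, Control, Allocate, operand-read) and checks a candidate staging against measured cases. The class `StagingModel`, the function `mismatches`, and the one-cycle-per-stage default are hypothetical choices made for this example; they are not values or names taken from the paper.

```python
from dataclasses import dataclass

# Stage order follows the proposal described above; the strings are labels only.
STAGES = ("issue", "control", "allocate", "operand_read")


@dataclass(frozen=True)
class StagingModel:
    # ASSUMPTION: cycles spent in each stage. One cycle per stage is a
    # placeholder for illustration, not a measured Ampere value.
    cycles: tuple = (1, 1, 1, 1)

    def entry_cycles(self, issue_cycle: int) -> dict:
        """Return the cycle at which an instruction enters each stage."""
        entry = {}
        t = issue_cycle
        for stage, stage_cycles in zip(STAGES, self.cycles):
            entry[stage] = t
            t += stage_cycles
        return entry


def mismatches(model: StagingModel, observations: list) -> list:
    """Return the observations the model fails to reproduce.

    Each observation is a pair (issue_cycle, observed_operand_read_cycle).
    An empty result means the candidate staging fits every supplied case.
    """
    return [
        (issue, observed)
        for issue, observed in observations
        if model.entry_cycles(issue)["operand_read"] != observed
    ]
```

Under this framing, the open question amounts to finding a staging (possibly with more stages or stall conditions than this sketch can express) for which `mismatches` is empty across all experiments.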

References

We performed a multitude of experiments to unveil the pipeline structure between issue and execute, and we could not find a model that perfectly fits all the experiments.

Analyzing Modern NVIDIA GPU cores (2503.20481 - Huerta et al., 26 Mar 2025) in Section 5.1.1 (Issue Scheduler: Warp Readiness)