Cross-CTA Warp Scheduling Behavior under CGGTY
Determine how the issue scheduler of NVIDIA Ampere GPUs selects among warps that belong to different CTAs, and establish whether the Compiler Guided Greedy Then Youngest (CGGTY) policy holds across CTAs or is modified by CTA-level constraints.
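To make the question concrete, the toy model below contrasts two ways a greedy-then-youngest selection could behave when the ready warps span more than one CTA. It is only a sketch under stated assumptions, not a description of the hardware: the `Warp` fields, the convention that a larger `age` value means a younger warp, and the `pick_cta_first` variant are all hypothetical, and the compiler-provided control bits that give CGGTY its name are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Warp:
    wid: int  # hypothetical warp identifier
    cta: int  # hypothetical CTA identifier
    age: int  # assumption: larger value = younger warp


def youngest(ready: List[Warp]) -> Optional[Warp]:
    """Return the youngest ready warp, or None if nothing is ready."""
    return max(ready, key=lambda w: w.age) if ready else None


def pick_global(last: Optional[Warp], ready: List[Warp]) -> Optional[Warp]:
    """Greedy then youngest, with CTA membership ignored."""
    if last in ready:  # greedy: keep issuing from the same warp
        return last
    return youngest(ready)


def pick_cta_first(last: Optional[Warp], ready: List[Warp]) -> Optional[Warp]:
    """Hypothetical CTA-constrained variant: prefer the last warp's CTA."""
    if last in ready:
        return last
    if last is not None:
        same_cta = [w for w in ready if w.cta == last.cta]
        if same_cta:
            return youngest(same_cta)
    return youngest(ready)


# Two CTAs with two warps each; warp 0 has just issued and is now stalled.
warps = [Warp(0, 0, 0), Warp(1, 0, 1), Warp(2, 1, 2), Warp(3, 1, 3)]
last = warps[0]
ready = [warps[1], warps[2], warps[3]]

print(pick_global(last, ready).wid)     # 3: youngest warp overall
print(pick_cta_first(last, ready).wid)  # 1: youngest warp in the same CTA
```

The two functions agree whenever all ready warps share a CTA, which matches the setting the policy has been confirmed in. They diverge only in the mixed-CTA case, so an experiment aimed at this problem would need to observe the issue order in exactly that situation.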
References
However, we have only confirmed this behavior for warps within the same CTA, as we have not yet devised a reliable methodology to analyze interactions among warps from different CTAs.
Source: Analyzing Modern NVIDIA GPU cores (arXiv:2503.20481, Huerta et al., 26 Mar 2025), Section 5.1.2 (Issue Scheduler: Scheduling Policy)