Cause of Constant Cache Latency Anomalies
Explain and validate the architectural cause of the following observation on NVIDIA Ampere GPUs: constant cache accesses exhibit significantly higher write-after-read (WAR) latency than global memory loads, yet slightly lower read-after-write (RAW) and write-after-write (WAW) latency. Reconcile these behaviors with how the cache hierarchy is used for fixed-latency accesses versus LDC-based accesses.
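To make the three dependence types in the statement concrete, here is a minimal Python sketch that classifies the hazard between an earlier and a later instruction from the registers each one reads and writes. The `Instr` type, the `hazards` helper, and the register and instruction names are hypothetical illustrations, not taken from the paper, and the sketch models only the dependence definitions, not any measured latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction model: a label plus register read/write sets."""
    name: str
    reads: frozenset
    writes: frozenset


def hazards(first: Instr, second: Instr) -> set:
    """Return the dependence types that `second` has on the earlier `first`."""
    found = set()
    if first.writes & second.reads:
        found.add("RAW")  # second reads a register that first writes
    if first.writes & second.writes:
        found.add("WAW")  # second overwrites a register that first writes
    if first.reads & second.writes:
        found.add("WAR")  # second overwrites a register that first still reads
    return found


# Illustrative load: reads its address register R2, writes its result to R4.
load = Instr("load", frozenset({"R2"}), frozenset({"R4"}))

use_result = Instr("use_result", frozenset({"R4"}), frozenset({"R6"}))
overwrite_result = Instr("overwrite_result", frozenset({"R8"}), frozenset({"R4"}))
overwrite_address = Instr("overwrite_address", frozenset({"R8"}), frozenset({"R2"}))

for later in (use_result, overwrite_result, overwrite_address):
    print(later.name, sorted(hazards(load, later)))
```

Under this reading, the WAR case for a load is the one where a later instruction overwrites the load's address register, so the WAR latency in question is how long that source register must remain untouched after the load issues.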
References
We could not confirm any hypothesis that explains this observation.
— Analyzing Modern NVIDIA GPU cores
(2503.20481 - Huerta et al., 26 Mar 2025) in Section 5.4 (Memory Pipeline)