Reproducibility of YOLOv8/YOLOv11 mAP in TensorRT under NMS mismatch
Determine whether the mismatch between the multi-class non-maximum suppression (NMS) used during evaluation and the single-class NMS used during inference explains why the reported mean average precision (mAP) of the Ultralytics YOLOv8 and YOLOv11 object detectors cannot be reproduced when the models run under NVIDIA TensorRT. In addition, establish a reproducible TensorRT evaluation configuration that matches the originally reported mAP for these models.
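To make the suspected mismatch concrete, the following is a minimal sketch of how the two NMS variants can return different detection sets from the same raw predictions. It assumes one reading of the terms: "multi-class" means every (box, class) pair above the confidence threshold is a candidate and suppression happens only within a class, while "single-class" means each box keeps only its top-scoring class and suppression ignores class labels. The function names, thresholds, and toy data are hypothetical and are not taken from the Ultralytics or TensorRT code.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Det = Tuple[Box, int, float]             # (box, class_id, score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(dets: List[Det], iou_thr: float, per_class: bool) -> List[Det]:
    """Greedy NMS; with per_class=True, boxes only suppress same-class boxes."""
    kept: List[Det] = []
    for det in sorted(dets, key=lambda d: d[2], reverse=True):
        box, cls, _ = det
        if all((per_class and cls != k_cls) or iou(box, k_box) <= iou_thr
               for k_box, k_cls, _ in kept):
            kept.append(det)
    return kept


def multi_class_nms(boxes, scores, conf_thr=0.25, iou_thr=0.5):
    """Assumed evaluation-style NMS: all (box, class) pairs, suppressed per class."""
    cands = [(b, c, s) for b, row in zip(boxes, scores)
             for c, s in enumerate(row) if s >= conf_thr]
    return greedy_nms(cands, iou_thr, per_class=True)


def single_class_nms(boxes, scores, conf_thr=0.25, iou_thr=0.5):
    """Assumed inference-style NMS: top class per box, suppression ignores class."""
    cands = []
    for b, row in zip(boxes, scores):
        c = max(range(len(row)), key=row.__getitem__)
        if row[c] >= conf_thr:
            cands.append((b, c, row[c]))
    return greedy_nms(cands, iou_thr, per_class=False)


# Toy data: two heavily overlapping boxes with ambiguous class scores.
boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0)]
scores = [[0.60, 0.50], [0.40, 0.55]]

multi = multi_class_nms(boxes, scores)
single = single_class_nms(boxes, scores)
print(len(multi), len(single))  # prints "2 1"
```

On this toy input the multi-class variant keeps one detection per class, while the single-class variant keeps only the highest-scoring box. Because mAP is computed per class over the ranked detections, a difference of this kind in the surviving detection set could plausibly shift the metric; whether it accounts for the gap observed under TensorRT is the open question.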
References
We are unable to reproduce YOLOv8 and YOLOv11's mAP results in TensorRT, likely because these models evaluate with multi-class NMS but only use single-class NMS in inference.
— RF-DETR: Neural Architecture Search for Real-Time Detection Transformers
(2511.09554 - Robinson et al., 12 Nov 2025) in Table 1 caption, Experiments (Standardizing Latency Evaluation)