
YOLO Evolution: A Comprehensive Benchmark and Architectural Review of YOLOv12, YOLO11, and Their Previous Versions

Published 31 Oct 2024 in cs.CV (arXiv:2411.00201v4)

Abstract: This study presents a comprehensive benchmark analysis of the YOLO (You Only Look Once) family of algorithms. It is the first experimental evaluation spanning YOLOv3 through the latest version, YOLOv12, on a range of object detection challenges, including varying object sizes, diverse aspect ratios, and small objects of a single class, ensuring assessment across datasets with distinct characteristics. For a robust evaluation, we employ a full set of metrics: Precision, Recall, Mean Average Precision (mAP), processing time, GFLOPs, and model size. Our analysis highlights the distinctive strengths and limitations of each YOLO version. For example, YOLOv9 achieves substantial accuracy but struggles with small-object detection and efficiency, whereas YOLOv10 excels in speed and efficiency but shows relatively lower accuracy, owing to architectural choices that hurt its performance on overlapping objects. The YOLO11 family consistently performs best, maintaining a remarkable balance of accuracy and efficiency. YOLOv12, however, delivered underwhelming results, its more complex architecture introducing computational overhead without commensurate performance gains. These results provide critical insights for industry and academia alike, guiding both the selection of the most suitable YOLO algorithm for a given application and future enhancements.

Explain it Like I'm 14

Overview

This paper compares many versions of a popular computer vision tool called YOLO (short for "You Only Look Once"). YOLO is used to find and label objects in pictures or videos, like spotting a stop sign on the road, a zebra in a photo, or a ship in satellite images. The paper especially focuses on testing the newest versions, YOLO11 and YOLOv12, against earlier versions (like YOLOv3, v5, v8, v9, and v10) to see which ones are the most accurate and the fastest.

What questions did the researchers ask?

The researchers wanted to answer simple, practical questions:

  • Which YOLO version is the most accurate at finding objects?
  • Which versions run the fastest and use the least computer power?
  • How do different YOLO versions handle hard situations, like very small objects (tiny ships), overlapping objects (animals in wildlife photos), or many classes (lots of types of traffic signs)?
  • Is YOLO11 better than older versions, and if so, in what ways?

How did they do the study?

To keep things fair, the team used the same training setup and rules for all models and tested them on three different, challenging datasets:

  • Traffic Signs: Many types and sizes of signs; tricky because small signs can look similar.
  • African Wildlife: Four animal classes (buffalo, elephant, rhino, zebra); images often have overlapping animals.
  • Ships and Vessels: One class (“ship”); ships are small and can be rotated, which is hard for detectors.

They used models from the Ultralytics YOLO library so the setup was consistent. For each model, they measured:

  • Accuracy: Precision (how many detections are correct), Recall (how many real objects are found), mAP50 (accuracy when a detection counts as correct at 50% overlap with the true box), and mAP50-95 (the same accuracy averaged over stricter and stricter overlap thresholds).
  • Speed: Preprocessing time (getting images ready), Inference time (the model’s “thinking” time), and Postprocessing time (cleaning up predictions). Think of this like a stopwatch for every step.
  • Computational load: GFLOPs (how much math the model needs) and model size in megabytes (like how big the app is on disk).
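To make the accuracy metrics above concrete, here is a minimal pure-Python sketch (not the paper's evaluation code) of how IoU-based matching turns boxes into precision and recall; real benchmarks also rank detections by confidence and integrate a precision-recall curve to get mAP:

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(preds, gts, iou_thresh=0.5):
    """Greedily match each predicted box to at most one ground-truth box.
    Precision = TP / predictions, Recall = TP / ground truths."""
    matched, tp = set(), 0
    for p in preds:
        for i, g in enumerate(gts):
            if i not in matched and iou(p, g) >= iou_thresh:
                matched.add(i)
                tp += 1
                break
    return tp / len(preds), tp / len(gts)

# Toy example: two predictions, two ground truths, only one good overlap.
preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
gts = [(1, 1, 11, 11), (80, 80, 90, 90)]
p, r = precision_recall(preds, gts)
print(p, r)  # prints: 0.5 0.5
```

The mAP50 metric uses exactly this 0.5 overlap threshold; mAP50-95 repeats the computation at thresholds from 0.5 to 0.95 and averages the results.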

They trained and fine-tuned each model in the same way, then compared the results.
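The per-stage "stopwatch" timing described above can be sketched with a simple pattern; the stage functions below are hypothetical placeholders standing in for a real pipeline, not the paper's measurement code:

```python
import time

def preprocess(img):    # placeholder: resizing/normalizing would go here
    return img

def infer(img):         # placeholder for the model's forward pass
    return [(0, 0, 10, 10)]

def postprocess(dets):  # placeholder: confidence filtering / NMS cleanup
    return dets

def timed_ms(fn, x):
    """Run one pipeline stage and return (result, elapsed milliseconds)."""
    t0 = time.perf_counter()
    out = fn(x)
    return out, (time.perf_counter() - t0) * 1000.0

img = "dummy image"
pre, t_pre = timed_ms(preprocess, img)
dets, t_inf = timed_ms(infer, pre)
final, t_post = timed_ms(postprocess, dets)
print(f"pre={t_pre:.3f} ms  infer={t_inf:.3f} ms  post={t_post:.3f} ms")
```

In practice, benchmarks average these timings over many images (after a few warm-up runs) so that one-off startup costs do not distort the numbers.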

Main findings and why they matter

Big picture:

  • YOLO11 models (especially YOLO11m) performed best overall, balancing accuracy, speed, and size.
  • YOLOv9 was very accurate but struggled with small objects and was less efficient.
  • YOLOv10 was extremely fast and efficient, but its accuracy dropped in trickier cases like overlapping objects, because of its architectural choices (it skips a common cleanup step called NMS, non-maximum suppression, by using a different training strategy).
  • Older models improved over time, but the newest versions generally do better across tasks.

Highlights:

  • YOLO11m was the “best balance” model. On average, it was both accurate and fast:
    • mAP50-95 scores: 0.795 (Traffic Signs), 0.81 (African Wildlife), 0.325 (Ships). The ships score is lower because small objects are hard, but 0.325 is still competitive for tiny targets.
    • Speed: about 2.4 milliseconds per image (very fast).
    • Size: about 38.8 MB (not too large).
    • GFLOPs: around 67.6 (a reasonable amount of computation).
  • YOLOv10 stood out for speed and efficiency. If you need real-time results with limited hardware, it’s excellent, but it can miss more objects when they overlap.
  • YOLOv9 was strong in accuracy, especially on simpler or larger objects, but used more compute and wasn’t as good at tiny objects.
  • Across all tests, YOLO11’s new building blocks (C3k2 and C2PSA) helped the model pay attention to the right parts of the image—like focusing on small or overlapping objects more effectively.
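The speed figures above translate directly into throughput: at the reported ~2.4 ms per image for YOLO11m, the conversion is simple arithmetic, as this small sketch shows:

```python
def fps_from_latency_ms(latency_ms):
    """Images processed per second at a given per-image latency."""
    return 1000.0 / latency_ms

# Reported YOLO11m inference latency from the study: ~2.4 ms per image.
print(round(fps_from_latency_ms(2.4)))  # prints: 417
```

So a ~2.4 ms latency corresponds to roughly 417 images per second, comfortably beyond the 30-60 frames per second that real-time video applications typically require.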

Why this matters:

  • Traffic signs: You need high accuracy and speed for self-driving and road safety. YOLO11m and YOLO11l did very well here.
  • Wildlife: Overlapping animals are tough. The YOLOv9 and YOLO11 families showed strong results, but large models can overfit on small datasets.
  • Ships: Tiny, rotated ships are tricky. YOLO11 still did well relative to the challenge, and its speed helps for scanning large satellite images quickly.

What’s the impact?

This study helps engineers, researchers, and companies pick the right YOLO version for their needs:

  • If you want the best all-around performer, choose YOLO11m: it’s fast, accurate, and not too heavy.
  • If your system is very limited (like a small device), YOLOv10’s speed is a big win.
  • If your task focuses on accuracy and you can afford more compute, consider YOLOv9 or larger YOLO11 variants.

For future work, the results suggest:

  • Keep improving detection of very small objects and overlapping objects.
  • Keep making models both faster and smarter, especially for real-time applications.
  • Use fair, consistent benchmarks (like in this paper) to compare new versions.

In short, YOLO has improved a lot over the years, and YOLO11 shows the best mix of accuracy and speed so far, making it a strong choice for many real-world tasks.
