QUICK REVIEW

[Paper Review] YOLO Evolution: A Comprehensive Benchmark and Architectural Review of YOLOv12, YOLO11, and Their Previous Versions

Nidhal Jegham, Chan Young Koh|arXiv (Cornell University)|Oct 31, 2024

Advanced Neural Network Applications | Cited by 46

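One-Sentence Summary

A comprehensive benchmark that compares Ultralytics YOLO models from v3 through YOLO11 on three datasets, reporting accuracy, speed, GFLOPs, and model size in detail to guide model selection.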

ABSTRACT

This study presents a comprehensive benchmark analysis of various YOLO (You Only Look Once) algorithms. It represents the first comprehensive experimental evaluation of YOLOv3 to the latest version, YOLOv12, on various object detection challenges. The challenges considered include varying object sizes, diverse aspect ratios, and small-sized objects of a single class, ensuring a comprehensive assessment across datasets with distinct challenges. To ensure a robust evaluation, we employ a comprehensive set of metrics, including Precision, Recall, Mean Average Precision (mAP), Processing Time, GFLOPs count, and Model Size. Our analysis highlights the distinctive strengths and limitations of each YOLO version. For example: YOLOv9 demonstrates substantial accuracy but struggles with detecting small objects and efficiency whereas YOLOv10 exhibits relatively lower accuracy due to architectural choices that affect its performance in overlapping object detection but excels in speed and efficiency. Additionally, the YOLO11 family consistently shows superior performance maintaining a remarkable balance of accuracy and efficiency. However, YOLOv12 delivered underwhelming results, with its complex architecture introducing computational overhead without significant performance gains. These results provide critical insights for both industry and academia, facilitating the selection of the most suitable YOLO algorithm for diverse applications and guiding future enhancements.

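Motivation and Objectives

Evaluate YOLO versions from v3 through YOLO11 on diverse datasets (traffic signs, African wildlife, ships).
Assess metrics beyond mAP, including precision, recall, preprocessing, inference, and postprocessing time, GFLOPs, and model size.
Analyze the architectural evolution to explain the differences in accuracy and efficiency between versions.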

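Proposed Method

Benchmark 23 models across 5 YOLO versions with consistent hyperparameters.
Evaluate on datasets with varying object sizes and aspect ratios.
Measure precision, recall, mAP50, mAP50-95, preprocessing, inference, and postprocessing time, GFLOPs, and model size.
Where available, compare the Ultralytics implementations with their original YOLO counterparts.
Discuss architectural changes (C2PSA, C3k2, anchor-free detection, NMS-free training) and their effect on performance.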
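
The accuracy metrics listed above follow standard object-detection definitions. The sketch below illustrates them under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, the true-positive, false-positive, and false-negative counts are already known, and mAP50-95 uses the common COCO-style thresholds from 0.50 to 0.95 in steps of 0.05. The function names are hypothetical, and this is not the paper's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Assumed COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def mean_ap_over_thresholds(ap_per_threshold):
    """mAP50-95 style score: the mean of AP values, one per IoU threshold."""
    return sum(ap_per_threshold) / len(ap_per_threshold)
```

Under these definitions, mAP50 counts a detection as correct when its IoU with a ground-truth box is at least 0.50, while mAP50-95 also rewards tighter localization, which is why it is always the lower of the two figures in the results.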

Figure 1: Evolution of YOLO Algorithms throughout the years.

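Experimental Results

Research Questions

RQ1: How do YOLO11 and its predecessors compare in accuracy, speed, and efficiency across different datasets?
RQ2: Which architectural changes across YOLO versions drive the observed performance differences?
RQ3: What trade-offs does each version make between accuracy (mAP) and efficiency (GFLOPs, model size)?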

Key Findings

Version	Precision	Recall	mAP50	mAP50-95	Preprocess Time (ms)	Inference Time (ms)	Postprocess Time (ms)	Total Time (ms)	Size (MB)	GFLOPs
YOLOv3u	0.75	0.849	0.874	0.781	0.7	8.5	0.4	9.6	207.86	282.4
YOLOv3u tiny	0.845	0.667	0.772	0.682	1.4	0.7	0.3	2.4	24.44	19
YOLOv5un	0.805	0.679	0.749	0.665	0.6	6.6	0.4	7.6	5.65	7.1
YOLOv5us	0.85	0.777	0.827	0.744	0.5	7.8	0.4	8.7	18.58	23.9
YOLOv5um	0.849	0.701	0.83	0.744	1.1	9.5	0.4	11	50.54	64.1
YOLOv5ul	0.831	0.836	0.886	0.799	0.6	9.7	0.4	10.7	106.85	134.9
YOLOv5ux	0.863	0.795	0.867	0.777	1.1	9.8	0.4	11.3	195.2	246.3
YOLOv8n	0.749	0.688	0.777	0.689	0.6	6.8	0.4	7.8	6.55	8.1
YOLOv8s	0.766	0.788	0.806	0.718	0.6	7.8	0.4	8.8	22.59	28.6
YOLOv8m	0.838	0.805	0.845	0.763	1.6	9.1	0.4	11.1	52.12	78.9
YOLOv8l	0.771	0.789	0.853	0.767	0.6	9.2	0.4	10.2	87.77	165
YOLOv8x	0.902	0.744	0.874	0.78	0.6	9.4	0.4	10.4	136.9	257.7
YOLOv9t	0.792	0.748	0.812	0.731	0.5	10	0.4	10.9	4.93	7.7
YOLOv9s	0.763	0.81	0.828	0.75	0.6	11.1	0.4	12.1	15.33	26.8
YOLOv9m	0.864	0.796	0.864	0.784	1	12.1	0.4	13.5	40.98	76.7
YOLOv9c	0.827	0.807	0.852	0.769	1.3	11.6	0.4	13.3	51.8	102.6
YOLOv9e	0.819	0.824	0.854	0.764	0.8	16.1	0.4	17.3	117.5	189.4
YOLOv10n	0.722	0.602	0.722	0.64	1	0.8	0.2	2	5.59	8.3
YOLOv10s	0.823	0.742	0.834	0.744	1.2	1.1	0.2	2.5	15.9	24.7
YOLOv10m	0.834	0.843	0.88	0.781	1.2	2.4	0.2	3.8	32.1	63.8
YOLOv10b	0.836	0.764	0.859	0.765	1	3.1	0.2	4.3	39.7	98.4
YOLOv10l	0.873	0.807	0.866	0.771	1.1	3.8	0.2	5.1	50	126.8
YOLOv10x	0.773	0.854	0.88	0.787	1	6.3	0.2	7.5	61.4	170.4
YOLO11n	0.768	0.695	0.757	0.668	1.2	0.6	0.4	2.2	5.35	6.4
YOLO11s	0.819	0.758	0.838	0.742	1.2	1	0.4	2.6	18.4	21.4
YOLO11m	0.898	0.826	0.893	0.795	1.2	2.4	0.4	4	38.8	67.9
YOLO11l	0.862	0.839	0.889	0.794	1.2	3	0.4	4.6	49	86.8
YOLO11x	0.819	0.816	0.885	0.784	0.9	6.1	0.4	7.4	109	194.8
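
The table can be read programmatically. As a minimal sketch using a handful of rows copied from it, the snippet below checks that each reported total time equals the sum of the preprocessing, inference, and postprocessing times, and then picks the fastest model above a chosen mAP50-95 floor. The tuple layout, helper names, and the 0.78 floor are illustrative assumptions, and times are in the same units as the table.

```python
# (name, mAP50-95, preprocess, inference, postprocess, total), copied from the table.
ROWS = [
    ("YOLOv3u",  0.781, 0.7, 8.5,  0.4, 9.6),
    ("YOLOv5ul", 0.799, 0.6, 9.7,  0.4, 10.7),
    ("YOLOv8x",  0.780, 0.6, 9.4,  0.4, 10.4),
    ("YOLOv9m",  0.784, 1.0, 12.1, 0.4, 13.5),
    ("YOLOv10n", 0.640, 1.0, 0.8,  0.2, 2.0),
    ("YOLOv10x", 0.787, 1.0, 6.3,  0.2, 7.5),
    ("YOLO11n",  0.668, 1.2, 0.6,  0.4, 2.2),
    ("YOLO11m",  0.795, 1.2, 2.4,  0.4, 4.0),
]


def totals_consistent(rows, tol=0.05):
    """True if every reported total matches pre + inference + post within tol."""
    return all(abs((pre + inf + post) - total) <= tol
               for _, _, pre, inf, post, total in rows)


def fastest_above(rows, min_map):
    """Name of the model with the lowest total time among those with mAP50-95 >= min_map."""
    candidates = [r for r in rows if r[1] >= min_map]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r[5])[0]


print(totals_consistent(ROWS))     # True for the rows above
print(fastest_above(ROWS, 0.78))   # YOLO11m
```

Filtering by an accuracy floor first and then minimizing time is one simple way to turn the table into a selection rule; a deployment with a hard latency budget would flip the two steps.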

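The YOLO11 family performs best across datasets in accuracy, speed, efficiency, and model size.
YOLO11m reaches an mAP50-95 of 0.795, 0.81, and 0.325 on the Traffic Signs, Africa Wildlife, and Ships datasets, respectively, with an average inference time of 2.4 ms and an average model size of 38.8 MB.
YOLOv9 delivers high accuracy but struggles with small-object detection and efficiency, whereas YOLOv10 excels in speed and efficiency but shows relatively lower accuracy because its architectural choices hurt the detection of overlapping objects.
The Ultralytics-supported models (YOLOv3u, YOLOv5un, YOLOv5us, YOLOv5ul, YOLOv8x, YOLOv9m/e, YOLOv10l/x, and the YOLO11 variants) each strike a different trade-off; because of the Ultralytics optimizations, a direct comparison with the original releases may be unfair.
By using identical hyperparameters and focusing on Ultralytics-supported models, the study provides a fair benchmark.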
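
One way to make the accuracy versus speed trade-off concrete is a Pareto front: a model stays on the front only if no other model is at least as fast and at least as accurate, and strictly better on one of the two. The sketch below applies this to the medium-sized model of each family, using total time and mAP50-95 values copied from the results table. Choosing these two axes and these five models is an assumption made for illustration, not an analysis taken from the paper.

```python
# (name, total time, mAP50-95) for the "m" variant of each family, from the table.
MEDIUM_MODELS = [
    ("YOLOv5um", 11.0, 0.744),
    ("YOLOv8m",  11.1, 0.763),
    ("YOLOv9m",  13.5, 0.784),
    ("YOLOv10m", 3.8,  0.781),
    ("YOLO11m",  4.0,  0.795),
]


def pareto_front(models):
    """Return the models not dominated on (lower time, higher mAP), sorted by time."""
    front = []
    for name, t, m in models:
        dominated = any(
            (t2 <= t and m2 >= m) and (t2 < t or m2 > m)
            for _, t2, m2 in models
        )
        if not dominated:
            front.append((name, t, m))
    return sorted(front, key=lambda row: row[1])


for name, t, m in pareto_front(MEDIUM_MODELS):
    print(f"{name}: total time {t}, mAP50-95 {m}")
# Prints YOLOv10m first, then YOLO11m.
```

On this subset, only YOLOv10m (fastest) and YOLO11m (most accurate at almost the same speed) survive, which matches the summary's description of YOLOv10 as the speed-oriented choice and YOLO11 as the balanced one.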

Figure 2: YOLOv3 architecture showcasing the residual blocks and the upsampling layers to enhance object detection efficiency through different scales [9].

This review was generated by AI and reviewed by human editors.