QUICK REVIEW

[Paper Review] YOLOv1 to YOLOv10: A comprehensive review of YOLO variants and their application in the agricultural domain

Mujadded Al Rabbani Alif, Muhammad Hussain|arXiv (Cornell University)|Jun 14, 2024
Plant Virus Research Studies | 49 citations
TL;DR

This paper surveys YOLO variants from v1 to v10 and examines their potential and applications in agriculture, including performance insights and future trends.

ABSTRACT

This survey investigates the transformative potential of various YOLO variants, from YOLOv1 to the state-of-the-art YOLOv10, in the context of agricultural advancements. The primary objective is to elucidate how these cutting-edge object detection models can re-energise and optimize diverse aspects of agriculture, ranging from crop monitoring to livestock management. It aims to achieve key objectives, including the identification of contemporary challenges in agriculture, a detailed assessment of YOLO's incremental advancements, and an exploration of its specific applications in agriculture. This is one of the first surveys to include the latest YOLOv10, offering a fresh perspective on its implications for precision farming and sustainable agricultural practices in the era of Artificial Intelligence and automation. Further, the survey undertakes a critical analysis of YOLO's performance, synthesizes existing research, and projects future trends. By scrutinizing the unique capabilities packed in YOLO variants and their real-world applications, this survey provides valuable insights into the evolving relationship between YOLO variants and agriculture. The findings contribute towards a nuanced understanding of the potential for precision farming and sustainable agricultural practices, marking a significant step forward in the integration of advanced object detection technologies within the agricultural sector.

Motivation & Objective

  • Trace the evolutionary advancements of YOLO variants from v1 to v10.
  • Assess how YOLO variants can address agricultural challenges such as crop monitoring and livestock management.
  • Synthesize existing research on YOLO in agriculture and identify gaps and future directions.
  • Analyze performance trends and practical implications for precision farming and sustainable agriculture.

Proposed method

  • Review the historical progression of YOLO architectures from v1 through v10.
  • Summarize architectural, training, and optimization changes across variants (e.g., v2/v3/v4/v5/v6/v7).
  • Highlight key performance metrics and real-time capabilities reported in the literature (e.g., AP, FPS, mAP) and how they relate to agricultural tasks.
  • Discuss the datasets and training techniques used to improve YOLO performance (e.g., COCO and ImageNet as datasets; anchor boxes and CIoU loss as techniques).
  • Analyze how YOLO variants have been positioned for agricultural applications such as crop monitoring, disease/pest detection, yield estimation, and livestock management.
  • Project future trends and potential impacts of advancing YOLO architectures on precision farming.
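The AP and AP50 figures that the review compares across variants all rest on intersection over union (IoU) between predicted and ground-truth boxes. The following is a minimal sketch, not the paper's or COCO's evaluation code: it computes AP for a single class on a single image at one IoU threshold (0.5 corresponds to AP50). The function names `iou` and `average_precision`, the `(x1, y1, x2, y2)` box format, and the greedy matching rule are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one class on one image (illustrative, simplified).

    detections: list of (score, box); ground_truth: list of boxes.
    """
    if not ground_truth:
        return 0.0
    # Visit detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = set()
    tp_flags = []
    for _, box in ranked:
        # Greedily pick the best still-unmatched ground-truth box.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        is_tp = best_iou >= iou_threshold
        if is_tp:
            matched.add(best_j)
        tp_flags.append(is_tp)

    # Precision and recall after each detection in ranked order.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / len(ground_truth))
    # Make precision non-increasing from right to left (interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap
```

Benchmark numbers such as those on COCO additionally average over classes and images, and the plain AP metric averages over several IoU thresholds, so this sketch only shows the core calculation.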
Figure 1: The General structure of a CNN, highlighting convolutional layers, pooling, and fully connected layers.

Experimental results

Research questions

  • RQ1: What are the architectural and methodological advancements across YOLO variants from v1 to v10?
  • RQ2: How do YOLO variants perform in agricultural contexts in terms of accuracy, speed, and robustness?
  • RQ3: What agricultural challenges can YOLO variants address (e.g., small objects, occlusions, real-time monitoring)?
  • RQ4: What are the data, training, and deployment considerations for applying YOLO models in agriculture?
  • RQ5: What future directions and trends are likely to shape YOLO-driven precision farming?

Key findings

  • YOLOv3 introduced multi-scale detection, binary cross-entropy for class prediction, and a deeper Darknet-53 backbone, improving AP over prior models while running at 20 FPS.
  • YOLOv4 integrated CSPDarknet53, SPP, PANet, and CIoU loss to boost localization and overall performance.
  • YOLOv5, implemented in PyTorch, uses CSPNet, SPP blocks, a PAN neck, and a CIoU-based loss, and achieves higher AP across input sizes. The comparison figures quoted here (AP50 of 60.6% at 416x416 and AP of 36.2% on COCO at 20 FPS) are for earlier variants; the paper's table lists results for multiple variants.
  • YOLOv6 introduces a CSPDarknet backbone, an FPN, and separate classification and box-regression heads, achieving an AP of around 52.5% and an AP50 of 70% on COCO test-dev 2017 at ~50 FPS on a T4 GPU.
  • YOLOv7 emphasizes efficiency with E-ELAN, scalable model sizing, and bag-of-freebies strategies, improving accuracy and speed over predecessors.
  • The survey positions YOLO variants as promising for real-time agriculture tasks such as crop monitoring, disease/pest detection, yield estimation, and livestock management, while noting challenges like small object detection, occlusions, and data limitations.
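The CIoU loss mentioned for YOLOv4 and YOLOv5 extends plain IoU with a center-distance penalty and an aspect-ratio penalty, which is why it helps localization even when boxes do not overlap. Below is a minimal pure-Python sketch of the standard CIoU formulation for one pair of boxes. It is not the code of any particular YOLO release; the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions for illustration.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """Complete-IoU loss for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU term.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Squared distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

For two identical boxes the loss is approximately 0. For two non-overlapping boxes the IoU term is 0, but the center-distance term still varies with how far apart the boxes are, so the loss continues to provide a training signal where a plain IoU loss would be flat.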
Figure 2: Single and multiple objects in an image: Classification, Localization, Segmentation.

This review was created by AI and reviewed by human editors.