QUICK REVIEW

[Paper Review] Comparing YOLOv11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in complex orchard environment

Ranjan Sapkota, Manoj Karkee | arXiv (Cornell University) | Oct 24, 2024

Horticultural and Viticultural Research | Cited by 7

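One-Sentence Summary

This paper compares YOLO11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in a commercial apple orchard, finding that YOLO11m-seg achieves the best mask and box mAP@50 while YOLOv8n-seg offers the fastest inference.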

ABSTRACT

This study conducted a comprehensive performance evaluation on YOLO11 (or YOLOv11) and YOLOv8, the latest in the "You Only Look Once" (YOLO) series, focusing on their instance segmentation capabilities for immature green apples in orchard environments. YOLO11n-seg achieved the highest mask precision across all categories with a notable score of 0.831, highlighting its effectiveness in fruit detection. YOLO11m-seg and YOLO11l-seg excelled in non-occluded and occluded fruitlet segmentation with scores of 0.851 and 0.829, respectively. Additionally, YOLOv11x-seg led in mask recall for all categories, achieving a score of 0.815, with YOLO11m-seg performing best for non-occluded immature green fruitlets at 0.858 and YOLOv8x-seg leading the occluded category with 0.800. In terms of mean average precision at a 50% intersection over union (mAP@50), YOLOv11m-seg consistently outperformed, registering the highest scores for both box and mask segmentation, at 0.876 and 0.860 for the "All" class and 0.908 and 0.909 for non-occluded immature fruitlets, respectively. YOLO11l-seg and YOLOv8l-seg shared the top box mAP@50 for occluded immature fruitlets at 0.847, while YOLO11m-seg achieved the highest mask mAP@50 of 0.810. Despite the advancements in YOLO11, YOLOv8n surpassed its counterparts in image processing speed, with an impressive inference speed of 3.3 milliseconds, compared to the fastest YOLO11 series model at 4.8 milliseconds, underscoring its suitability for real-time agricultural applications related to complex green fruit environments. (YOLOv11 segmentation)

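Research Motivation and Objectives

Evaluate the instance segmentation performance of YOLO11 and YOLOv8 on occluded and non-occluded immature green apples in a commercial apple orchard.
Compare precision, recall, F1, and mAP@50 across model configurations.
Assess model complexity (parameters, GFLOPs, layer count) and training/inference efficiency.
Provide recommendations for real-time agricultural deployment.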
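The metrics named in these objectives can be made concrete with a small sketch. It is not taken from the paper: the helper names `f1_score`, `mask_iou`, and `is_match_at_50` are hypothetical, and masks are represented as plain nested lists of 0/1 purely for illustration. The example feeds in the non-occluded mask precision (0.851) and recall (0.858) reported for YOLO11m-seg; the resulting F1 is computed here, not quoted from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mask_iou(pred, truth) -> float:
    """Intersection over union of two equally sized binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    return inter / union if union else 0.0


def is_match_at_50(pred, truth) -> bool:
    """A prediction counts as a match for mAP@50 when its IoU is at least 0.5."""
    return mask_iou(pred, truth) >= 0.5


# Illustrative inputs: values reported in the abstract for YOLO11m-seg (non-occluded class).
f1_nonoccluded = f1_score(0.851, 0.858)

# Toy 2x3 masks (hypothetical) that overlap in one column only.
toy_pred = [[1, 1, 0], [1, 1, 0]]
toy_truth = [[0, 1, 1], [0, 1, 1]]
toy_iou = mask_iou(toy_pred, toy_truth)
```

Because F1 is a harmonic mean, it always lies between precision and recall, so the computed value sits between 0.851 and 0.858. The toy masks share 2 of 6 covered pixels, so they would not count as a match at the 0.5 IoU threshold.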

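Proposed Method

Collect a dataset of 991 RGB images from a Scifresh apple orchard using a robotic imaging platform.
Manually annotate occluded and non-occluded fruitlets and format the dataset for YOLO11/YOLOv8.
Train and evaluate YOLO11-seg and YOLOv8-seg configurations under identical hyperparameters (640x640 input, batch size 8, IoU 0.7, 300 epochs, deterministic seed).
Compute metrics including MIoU, AP, mAP@50, mAR, and F1, along with GFLOPs, parameter counts, and convolutional layer counts.
Measure training time and image processing speed (preprocessing, inference, postprocessing).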
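The shared training setup listed above could be reproduced roughly as follows with the Ultralytics Python package. This is a sketch under stated assumptions, not the authors' script: the dataset file `green_fruitlets.yaml` is hypothetical, the seed value 0 is an assumption (the summary only says the seed was deterministic), and training all five model sizes (n, s, m, l, x) per family is assumed rather than confirmed by the summary.

```python
from dataclasses import dataclass, asdict

# Checkpoint names follow Ultralytics' published naming scheme.
# Assumption: all five sizes of each family are compared.
YOLO11_SEG = [f"yolo11{size}-seg.pt" for size in "nsmlx"]
YOLOV8_SEG = [f"yolov8{size}-seg.pt" for size in "nsmlx"]


@dataclass(frozen=True)
class TrainConfig:
    data: str = "green_fruitlets.yaml"  # hypothetical dataset definition file
    imgsz: int = 640                    # 640x640 input
    batch: int = 8                      # batch size 8
    iou: float = 0.7                    # IoU threshold 0.7
    epochs: int = 300                   # 300 epochs
    seed: int = 0                       # assumed value; the source gives no number
    deterministic: bool = True


def train_all(weights, cfg: TrainConfig = TrainConfig()):
    """Train each checkpoint with identical settings and return validation metrics."""
    from ultralytics import YOLO  # imported lazily so the config is usable without the package

    metrics = {}
    for name in weights:
        model = YOLO(name)
        model.train(**asdict(cfg))
        metrics[name] = model.val()
    return metrics


def total_latency_ms(speed: dict) -> float:
    """Sum the per-image stage timings (in ms) of a prediction result's `speed` dict."""
    return speed["preprocess"] + speed["inference"] + speed["postprocess"]
```

Calling `train_all(YOLO11_SEG + YOLOV8_SEG)` would run every configuration with the same hyperparameters, which is what makes the comparison fair. `total_latency_ms` expects the three stage keys that Ultralytics reports per image, matching the preprocessing, inference, and postprocessing split used in the study.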

Figure 1: Architecture diagram of YOLOv8 algorithm: YOLOv8 advances real-time object detection with its innovative backbone and anchor-free Ultralytics head, optimizing detection accuracy and speed across various tasks (primary image source: https://yolov8.org/what-is-yolov8/).

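Experimental Results

Research Questions

RQ1: Which YOLO11 and YOLOv8 configurations deliver higher accuracy for instance segmentation of occluded and non-occluded immature green fruits?
RQ2: How do model complexity and inference speed trade off against segmentation performance in complex orchard scenes?
RQ3: Which configuration is best suited to real-time agricultural applications?
RQ4: How do occluded versus non-occluded fruitlets affect each model's precision, recall, and mAP@50?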

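Key Findings

YOLO11n-seg achieved the highest overall mask precision (0.831).
YOLO11m-seg and YOLO11l-seg achieved the highest precision on non-occluded fruitlets (0.851) and occluded fruitlets (0.829), respectively.
YOLO11x-seg led in mask recall across all categories (0.815); YOLO11m-seg had the best recall on non-occluded fruitlets (0.858).
YOLOv8n-seg had the fastest overall inference in this study (3.3 ms).
YOLO11m-seg achieved the highest box mAP@50 for the All class (0.876) and for non-occluded fruitlets (0.908), as well as the highest mask mAP@50 for All (0.860) and non-occluded fruitlets (0.909).
For occluded fruitlets, YOLO11m-seg outperformed the other models in mask mAP@50 (0.810), while YOLO11l-seg and YOLOv8l-seg tied for the top box mAP@50 (0.847).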
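To put the reported inference times in real-time terms, the sketch below converts per-image latency to an upper bound on frames per second. The helper `fps_from_latency_ms` is hypothetical, and the figures cover inference only: preprocessing and postprocessing time, which the study measured separately, would lower the achievable rate.

```python
# Inference latencies reported in the abstract, in milliseconds per image.
REPORTED_INFERENCE_MS = {
    "YOLOv8n-seg": 3.3,
    "fastest YOLO11-seg model": 4.8,  # the abstract does not name this model
}


def fps_from_latency_ms(latency_ms: float) -> float:
    """Upper bound on throughput implied by a per-image latency."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


inference_fps = {name: fps_from_latency_ms(ms) for name, ms in REPORTED_INFERENCE_MS.items()}

# Ratio of the two latencies: how much longer the fastest YOLO11 model takes per image.
slowdown = REPORTED_INFERENCE_MS["fastest YOLO11-seg model"] / REPORTED_INFERENCE_MS["YOLOv8n-seg"]
```

By this estimate, 3.3 ms corresponds to roughly 300 images per second and 4.8 ms to roughly 208, so the fastest YOLO11 model takes about 1.45 times as long per image as YOLOv8n-seg.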

Figure 2: Simplified architecture diagram of YOLO11 algorithm: YOLO11 builds on YOLOv8’s architecture with refined detection and segmentation efficiency and accuracy over benchmark dataset.


This review was generated by AI and reviewed by human editors.