QUICK REVIEW

[Paper Review] Comparing YOLOv11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in complex orchard environment

Ranjan Sapkota, Manoj Karkee|arXiv (Cornell University)|Oct 24, 2024

Horticultural and Viticultural Research | Citations: 7

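One-sentence summary

This paper compares YOLOv11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in a commercial orchard, and reports that YOLO11m-seg leads overall in both mask and box mAP@50 while YOLOv8n-seg has the fastest inference speed.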

ABSTRACT

This study conducted a comprehensive performance evaluation on YOLO11 (or YOLOv11) and YOLOv8, the latest in the "You Only Look Once" (YOLO) series, focusing on their instance segmentation capabilities for immature green apples in orchard environments. YOLO11n-seg achieved the highest mask precision across all categories with a notable score of 0.831, highlighting its effectiveness in fruit detection. YOLO11m-seg and YOLO11l-seg excelled in non-occluded and occluded fruitlet segmentation with scores of 0.851 and 0.829, respectively. Additionally, YOLOv11x-seg led in mask recall for all categories, achieving a score of 0.815, with YOLO11m-seg performing best for non-occluded immature green fruitlets at 0.858 and YOLOv8x-seg leading the occluded category with 0.800. In terms of mean average precision at a 50% intersection over union (mAP@50), YOLOv11m-seg consistently outperformed, registering the highest scores for both box and mask segmentation, at 0.876 and 0.860 for the "All" class and 0.908 and 0.909 for non-occluded immature fruitlets, respectively. YOLO11l-seg and YOLOv8l-seg shared the top box mAP@50 for occluded immature fruitlets at 0.847, while YOLO11m-seg achieved the highest mask mAP@50 of 0.810. Despite the advancements in YOLO11, YOLOv8n surpassed its counterparts in image processing speed, with an impressive inference speed of 3.3 milliseconds, compared to the fastest YOLO11 series model at 4.8 milliseconds, underscoring its suitability for real-time agricultural applications related to complex green fruit environments. (YOLOv11 segmentation)

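Motivation and objectives

Evaluate the instance segmentation performance of YOLO11 and YOLOv8 on occluded and non-occluded immature green apples in a commercial orchard.
Compare precision, recall, F1, and mAP@50 across model configurations.
Assess model complexity (parameters, GFLOPs, number of layers) and training/inference efficiency.
Provide recommendations for real-time agricultural applications.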

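Proposed method

Collect 991 RGB images from a Scifresh orchard using a robotic imaging platform.
Manually annotate occluded and non-occluded fruits and format the dataset for YOLO11 and YOLOv8.
Train and evaluate the YOLO11-seg and YOLOv8-seg configurations with identical hyperparameters (640x640 input, batch size 8, IoU 0.7, 300 epochs, deterministic seed).
Compute MIoU, AP, mAP@50, mAR, and F1, together with GFLOPs, parameter counts, and convolutional layer counts.
Measure training time and per-image processing speed, covering preprocessing, inference, and postprocessing.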
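To make the accuracy metrics concrete, here is a minimal Python sketch of mask IoU and of precision, recall, and F1 at a 0.5 IoU threshold for one image. It is not the evaluation code used in the paper: the set-of-pixels mask representation, the greedy one-to-one matching, and the names `mask_iou`, `precision_recall_f1`, and `rect` are all assumptions made for illustration. mAP@50 additionally averages precision over the confidence-ranked recall curve and over classes, which this sketch does not do.

```python
from typing import List, Set, Tuple

Mask = Set[Tuple[int, int]]  # a mask as a set of (row, col) pixel coordinates


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def precision_recall_f1(preds: List[Mask], truths: List[Mask], iou_thr: float = 0.5):
    """Greedily match predictions (assumed sorted by confidence) to ground truth."""
    unmatched = list(range(len(truths)))
    tp = 0
    for p in preds:
        # Pick the best still-unmatched ground-truth mask for this prediction.
        best = max(unmatched, key=lambda i: mask_iou(p, truths[i]), default=None)
        if best is not None and mask_iou(p, truths[best]) >= iou_thr:
            tp += 1
            unmatched.remove(best)
    fp = len(preds) - tp   # predictions with no matching fruit
    fn = len(truths) - tp  # fruits that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def rect(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Toy helper: rectangular mask covering rows r0..r1-1 and cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Two ground-truth fruitlets; one good prediction and one false alarm.
truths = [rect(0, 0, 4, 4), rect(10, 10, 14, 14)]
preds = [rect(0, 0, 4, 3), rect(20, 20, 24, 24)]
print(precision_recall_f1(preds, truths))
```

In this toy case the first prediction overlaps its fruitlet with IoU 0.75 and counts as a true positive, while the second overlaps nothing, so one fruitlet is missed and one prediction is spurious.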

Figure 1: Architecture diagram of YOLOv8 algorithm: YOLOv8 advances real-time object detection with its innovative backbone and anchor-free Ultralytics head, optimizing detection accuracy and speed across various tasks (primary image source: https://yolov8.org/what-is-yolov8/).

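Experimental results

Research questions

RQ1: Which YOLO11 and YOLOv8 configurations give the best instance segmentation accuracy for occluded and non-occluded immature green fruits?
RQ2: How do model complexity and inference speed trade off against segmentation performance in complex orchard scenes?
RQ3: Which configuration is best suited to real-time agricultural use?
RQ4: How does occlusion affect precision, recall, and mAP@50 across models?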

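Key findings

YOLO11n-seg achieved the highest mask precision across all categories (0.831).
YOLO11m-seg and YOLO11l-seg achieved the highest precision for non-occluded fruitlets (0.851) and occluded fruitlets (0.829), respectively.
YOLO11x-seg led mask recall across all categories (0.815); YOLO11m-seg had the highest recall for non-occluded fruitlets (0.858).
YOLOv8n-seg had the fastest inference speed in the study (3.3 ms), against 4.8 ms for the fastest YOLO11 model.
YOLO11m-seg achieved the highest box mAP@50 for the All class (0.876) and for non-occluded fruitlets (0.908); its mask mAP@50 was 0.860 (All) and 0.909 (non-occluded).
For occluded fruitlets, YOLO11m-seg led mask mAP@50 (0.810), while the top box mAP@50 was a tie between YOLO11l-seg and YOLOv8l-seg (0.847).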
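To put the reported latencies in throughput terms, the short Python sketch below converts milliseconds per image into images per second. The 3.3 ms and 4.8 ms figures come from the abstract; treating them as the full per-image cost is an assumption, and the 30 fps camera budget and the names `fps` and `fits_budget` are hypothetical and used only for illustration.

```python
def fps(ms_per_image: float) -> float:
    """Images per second if each image costs ms_per_image milliseconds."""
    return 1000.0 / ms_per_image


def fits_budget(ms_per_image: float, camera_fps: float) -> bool:
    """True if one image is processed within one camera frame interval."""
    return ms_per_image <= 1000.0 / camera_fps


# Latencies reported in the abstract (milliseconds per image).
latency_ms = {"YOLOv8n-seg": 3.3, "fastest YOLO11 model": 4.8}

for name, ms in latency_ms.items():
    # camera_fps=30.0 is a hypothetical frame rate, not a value from the paper.
    print(f"{name}: {ms} ms -> {fps(ms):.0f} images/s, "
          f"fits 30 fps budget: {fits_budget(ms, 30.0)}")

# How much longer the fastest YOLO11 model takes per image than YOLOv8n-seg.
slowdown = latency_ms["fastest YOLO11 model"] / latency_ms["YOLOv8n-seg"]
print(f"relative per-image time: {slowdown:.2f}x")
```

Under these assumptions both models sit well inside a 30 fps frame interval, so the speed gap would matter mainly on slower hardware or at higher frame rates than the ones assumed here.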

Figure 2: Simplified architecture diagram of YOLO11 algorithm: YOLO11 builds on YOLOv8’s architecture with refined detection and segmentation efficiency and accuracy over benchmark dataset.

This review was generated by AI and checked by a human editor.