QUICK REVIEW

[Paper review] Comparing YOLOv8 and Mask R-CNN for instance segmentation in complex orchard environments

Ranjan Sapkota, Dawood Ahmed | arXiv (Cornell University) | Dec 13, 2023

Smart Agriculture and AI | References: 106 | Citations: 8

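One-line summary

The study compares YOLOv8 (one-stage) with Mask R-CNN (two-stage) on two orchard datasets and shows that YOLOv8 outperforms Mask R-CNN on both, in accuracy (precision and recall) as well as inference speed.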

ABSTRACT

Instance segmentation is an important image processing operation for agricultural automation, providing precise delineation of individual objects within images and enabling tasks such as selective harvesting and precision pruning. This study compares the one-stage YOLOv8 model with the two-stage Mask R-CNN model for instance segmentation under varying orchard conditions across two datasets. Dataset 1, collected in the dormant season, contains images of apple trees without foliage and was used to train multi-object segmentation models delineating branches and trunks. Dataset 2, collected in the early growing season, includes canopy images with green foliage and immature apples and was used to train single-object segmentation models delineating fruitlets. Results showed YOLOv8 outperformed Mask R-CNN with higher precision and near-perfect recall at a confidence threshold of 0.5. For Dataset 1, YOLOv8 achieved precision 0.90 and recall 0.95 compared to 0.81 and 0.81 for Mask R-CNN. For Dataset 2, YOLOv8 reached precision 0.93 and recall 0.97 compared to 0.85 and 0.88. Inference times were also lower for YOLOv8, at 10.9 ms and 7.8 ms, versus 15.6 ms and 12.8 ms for Mask R-CNN. These findings demonstrate superior accuracy and efficiency of YOLOv8 for real-time orchard automation tasks such as robotic harvesting and fruit thinning.

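Motivation and objectives

Motivate accurate instance segmentation for orchard automation tasks such as selective harvesting and precision pruning.
Evaluate the one-stage YOLOv8 and the two-stage Mask R-CNN under different orchard conditions.
Assess precision, recall, and inference time to judge suitability for real-time orchard robotics.
Explore how dataset characteristics (leafless dormant-season trees vs. growing-season canopies) affect model performance.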

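Methods

Compare the one-stage YOLOv8 and the two-stage Mask R-CNN on instance segmentation.
Train multi-object segmentation models for branches and trunks on Dataset 1 (dormant season, apple trees without foliage).
Train single-object segmentation models for fruitlets on Dataset 2 (early growing season, canopy images with green foliage and immature apples).
Evaluate precision and recall at a confidence threshold of 0.5.
Measure the inference time of both models in milliseconds.
Analyze performance differences between the two datasets to inform real-time orchard automation tasks.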
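The evaluation step above (precision and recall at a confidence threshold of 0.5) can be illustrated with a minimal sketch. It assumes each predicted mask has already been matched, or not, to a ground-truth instance; the review does not state the matching criterion, so the `matched` flag, the `Detection` class, and the toy data are all hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    confidence: float  # model confidence score in [0, 1]
    matched: bool      # assumed: True if matched to a ground-truth instance


def precision_recall(detections, num_ground_truth, conf_threshold=0.5):
    """Return (precision, recall) over detections at or above the threshold."""
    kept = [d for d in detections if d.confidence >= conf_threshold]
    true_positives = sum(1 for d in kept if d.matched)
    # Every kept detection is either a true positive or a false positive.
    precision = true_positives / len(kept) if kept else 0.0
    # Ground-truth instances with no kept match count as false negatives.
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall


# Toy data, invented for illustration only (not from the paper).
toy = [
    Detection(0.92, True),
    Detection(0.81, True),
    Detection(0.64, False),
    Detection(0.55, True),
    Detection(0.30, True),  # dropped by the 0.5 threshold
]

p, r = precision_recall(toy, num_ground_truth=4)
print(f"precision={p:.2f} recall={r:.2f}")
```

Lowering the threshold keeps more low-confidence masks, which tends to raise recall at the cost of precision, so reporting both metrics at one fixed threshold keeps the two models comparable.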

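Experimental results

Research questions

RQ1 On dormant-season orchard images, does YOLOv8 achieve higher precision and recall than Mask R-CNN for multi-object segmentation (branches and trunks)?
RQ2 For single-object fruitlet segmentation on canopy images with foliage and immature fruit, does YOLOv8 outperform Mask R-CNN?
RQ3 How do the inference times of YOLOv8 and Mask R-CNN differ in these orchard scenarios?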
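For RQ3, per-image inference time in milliseconds could be measured along the following lines. This is only a sketch under stated assumptions: the review does not describe the timing protocol, `predict` is a hypothetical stand-in for either model's inference call, and the warm-up count is an arbitrary choice. On a GPU, the device would also need to be synchronized before reading the clock.

```python
import time


def mean_inference_ms(predict, images, warmup=3):
    """Average wall-clock time per image, in milliseconds."""
    if not images:
        raise ValueError("images must not be empty")
    # Warm-up calls are excluded from timing (caches, lazy initialization).
    for image in images[:warmup]:
        predict(image)
    start = time.perf_counter()
    for image in images:
        predict(image)
    elapsed_s = time.perf_counter() - start
    return 1000.0 * elapsed_s / len(images)


# Hypothetical placeholder model: any callable that takes one image works.
def dummy_predict(image):
    return sum(image)


images = [[1, 2, 3]] * 10
print(f"{mean_inference_ms(dummy_predict, images):.4f} ms per image")
```

Timing both models with the same function, image set, and hardware keeps the comparison fair, since absolute numbers depend heavily on the device and input resolution.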

Key findings

On Dataset 1, YOLOv8 achieved precision 0.90 and recall 0.95, versus precision 0.81 and recall 0.81 for Mask R-CNN.
On Dataset 2, YOLOv8 achieved precision 0.93 and recall 0.97, versus precision 0.85 and recall 0.88 for Mask R-CNN.
Inference times were 10.9 ms and 7.8 ms for YOLOv8, shorter than the 15.6 ms and 12.8 ms for Mask R-CNN.
YOLOv8 proved more accurate and more efficient for real-time orchard automation tasks such as robotic harvesting and fruit thinning.
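To put the reported figures side by side, the sketch below derives an F1 score and a time ratio from them. These derived values are not reported in the review and are plain arithmetic on the quoted numbers. It also assumes the two inference times are listed in dataset order (first value for Dataset 1, second for Dataset 2), which the source does not state explicitly.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, inference time in ms) as quoted in the review.
# Assumption: times are paired with datasets in the order they are listed.
reported = {
    "Dataset 1": {"YOLOv8": (0.90, 0.95, 10.9), "Mask R-CNN": (0.81, 0.81, 15.6)},
    "Dataset 2": {"YOLOv8": (0.93, 0.97, 7.8), "Mask R-CNN": (0.85, 0.88, 12.8)},
}


def summarize(results):
    """Derive F1 per model and the Mask R-CNN / YOLOv8 time ratio per dataset."""
    summary = {}
    for dataset, models in results.items():
        y_p, y_r, y_ms = models["YOLOv8"]
        m_p, m_r, m_ms = models["Mask R-CNN"]
        summary[dataset] = {
            "f1_yolov8": round(f1_score(y_p, y_r), 3),
            "f1_mask_rcnn": round(f1_score(m_p, m_r), 3),
            "time_ratio": round(m_ms / y_ms, 2),
        }
    return summary


for dataset, row in summarize(reported).items():
    print(dataset, row)
```

Under these assumptions, the F1 scores come out at roughly 0.92 vs. 0.81 on Dataset 1 and 0.95 vs. 0.86 on Dataset 2, and Mask R-CNN takes roughly 1.4 and 1.6 times as long per image.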

This review was generated by AI and checked by a human editor.