QUICK REVIEW

[Paper Review] Plant identification in an open-world (LifeCLEF 2016)

Hervé Goëau, Pierre Bonnet|ArXiv.org|Sep 25, 2025

Smart Agriculture and AI|References: 9|Citations: 53

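One-line summary

The LifeCLEF 2016 plant identification task compared CNN-based systems with an emphasis on rejecting unknown classes, evaluating open-set recognition on more than 110,000 images covering 1000 Western European plant species.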

ABSTRACT

The LifeCLEF plant identification challenge aims at evaluating plant identification methods and systems at a very large scale, close to the conditions of a real-world biodiversity monitoring scenario. The 2016-th edition was actually conducted on a set of more than 110K images illustrating 1000 plant species living in West Europe, built through a large-scale participatory sensing platform initiated in 2011 and which now involves tens of thousands of contributors. The main novelty over the previous years is that the identification task was evaluated as an open-set recognition problem, i.e. a problem in which the recognition system has to be robust to unknown and never seen categories. Beyond the brute-force classification across the known classes of the training set, the big challenge was thus to automatically reject the false positive classification hits that are caused by the unknown classes. This overview presents more precisely the resources and assessments of the challenge, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.

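Motivation and objectives

Evaluate plant identification methods at large scale under open-set conditions close to real-world biodiversity monitoring.
Assess robustness to unknown, never-seen plant categories while still identifying known species.
Provide a benchmark dataset and metrics for studying open-set performance and unknown-class rejection.
Analyze how different CNN-based and hybrid approaches perform on a test set containing many distractors.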

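Proposed method

Build the training set from the PlantCLEF 2015 data, extended with its test images and their ground-truth labels.
Build the test set from Pl@ntNet queries that contain both known and unknown classes (open set).
Evaluate submissions with mAP-open, a mean Average Precision metric for the open-set setting, and with mAP-open-invasive, a variant targeting invasive species monitoring.
Allow up to four runs per group, covering CNN and non-CNN baselines, ensembles, and the use of metadata.
Evaluate unknown-class rejection strategies and report performance at different novelty levels.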
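
As a rough illustration of how an open-set mean Average Precision can be computed, the sketch below treats each known class as a query, ranks all test items by their score for that class, and averages the per-class AP. The function names, the data layout (one score dictionary per test item, `None` as the label of an unknown-class item), and the choice to skip classes without any test item are assumptions made for this example; they are not taken from the official LifeCLEF evaluation code.

```python
from typing import Dict, List, Optional


def average_precision(ranked_relevance: List[bool]) -> float:
    """AP of one ranked list; 0.0 if the list holds no relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def map_open(scores: List[Dict[str, float]],
             truth: List[Optional[str]],
             known_classes: List[str]) -> float:
    """Mean of per-class AP over known classes (hypothetical helper).

    Items whose truth is None are unknown-class distractors: they are
    never relevant, so ranking them high lowers the AP of that class.
    Classes with no test item are skipped (an assumption of this sketch).
    """
    aps = []
    for cls in known_classes:
        if cls not in truth:
            continue
        order = sorted(range(len(scores)),
                       key=lambda i: scores[i].get(cls, 0.0),
                       reverse=True)
        aps.append(average_precision([truth[i] == cls for i in order]))
    return sum(aps) / len(aps) if aps else 0.0
```

In this formulation a confident prediction on an unknown-class image pushes a distractor above true positives in the ranking, which is why lowering the scores of suspected unknowns can raise the metric.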

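Experimental results

Research questions

RQ1: How well do CNN-based plant identification systems work in an open-world setting with many unknown classes?
RQ2: How do unknown-class distractors affect open-set mAP?
RQ3: Do explicit unknown-class rejection strategies improve robustness, and under which novelty conditions?
RQ4: How much does performance degrade in streaming-like scenarios where the proportion of novelty increases?
RQ5: What are the relative contributions of architecture, ensembling, and metadata to open-set plant identification performance?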

Key findings

Run	Key-words	Rejection	mAP-open	mAP-open-invasive	mAP-closed
Bluefield Run4	VGGNet, combine outputs from a same observation	thresholds by class (train+validation)	0.742	0.717	0.827
SabanciU GebzeTU Run1	2x(VGGNet,GoogleNet) tuned with resp. 70k, 115k training images	GoogleNet 70k/70k Plant/ImageNet	0.738	0.704	0.806
SabanciU…Run3	SabanciUGebzeTU Run1	Manually removed 90 test images	0.737	0.703	0.807
Bluefield Run3	Bluefield Run 4	thresholds by class	0.736	0.718	0.82
SabanciU…Run2	SabanciUGebzeTU Run1	-	0.736	0.683	0.807
SabanciU…Run4	SabanciUGebzeTU Run1	-	0.735	0.695	0.802
CMP Run1	Bagging of 3xResNet-152	-	0.71	0.653	0.79
LIIR KUL Run3	CaffeNet, VGGNet16, 3xGoogleNet, adding 12k external plant images	threshold	0.703	0.674	0.761
LIIR KUL Run2	LIIR KUL Run 3	threshold	0.692	0.667	0.744
LIIR KUL Run1	LIIR KUL Run 3	threshold	0.669	0.652	0.708
UM Run4	VGGNet16	-	0.669	0.598	0.742
CMP Run2	ResNet-152	-	0.644	0.564	0.729
CMP Run3	ResNet-152 (2015training)	-	0.639	0.59	0.723
QUT Run3	1 "general" GoogleNet, 6 "organ" GoogleNets, observation combination	-	0.629	0.61	0.696
Floristic Run3	GoogleNet, metadata	-	0.627	0.533	0.693
UM Run1	VGGNet16	-	0.627	0.537	0.7
Floristic Run1	GoogleNet	-	0.619	0.541	0.694
Bluefield Run1	VGGNet	thresholds by class	0.611	0.6	0.692
Bluefield Run2	VGGNet	thresholds by class	0.611	0.6	0.693
Floristic Run2	GoogleNet	thresholds by class	0.611	0.538	0.681
QUT Run1	GoogleNet	-	0.601	0.563	0.672
UM Run3	VGGNet16 with dedicated and combined organ & species layers	-	0.589	0.509	0.652
QUT Run2	6 "organ" GoogleNets, observation combination	-	0.564	0.562	0.641
UM Run2	VGGNet16 from scratch (without ImageNet2012)	-	0.481	0.446	0.552
QUT Run4	QUT Run3	threshold	0.367	0.359	0.378
BMETMITRun4	AlexNet & BVWs & metadata	-	0.174	0.144	0.213
BMETMITRun3	AlexNet & BVWs & metadata	threshold by classifier	0.17	0.125	0.197
BMETMITRun1	AlexNet	-	0.169	0.125	0.196
BMETMITRun2	BVWs (fisher vectors)	-	0.066	0.128	0.101
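
Several runs in the table combine the outputs of images belonging to the same observation. The table does not say which combination rule each team used, so the sketch below simply assumes an arithmetic mean of the class scores; `pool_by_observation` and the data layout are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def pool_by_observation(image_scores: List[Dict[str, float]],
                        observation_ids: List[str]) -> List[Dict[str, float]]:
    """Replace each image's class scores by the mean over its observation.

    Averaging is an assumption of this sketch; max pooling or a product
    of probabilities would be equally plausible choices.
    """
    groups = defaultdict(list)
    for idx, obs in enumerate(observation_ids):
        groups[obs].append(idx)

    pooled: List[Dict[str, float]] = [dict() for _ in image_scores]
    for members in groups.values():
        # Collect every class scored by at least one image of the observation.
        classes = set()
        for i in members:
            classes.update(image_scores[i])
        mean = {c: sum(image_scores[i].get(c, 0.0) for i in members) / len(members)
                for c in classes}
        for i in members:
            pooled[i] = dict(mean)
    return pooled
```

Pooling lets a clear flower image support an ambiguous leaf image of the same plant, since both end up with the same combined scores.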

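CNN-based systems dominate the top results: the top 26 runs all use CNNs.
For invasive species monitoring, the best mAP-open-invasive score is 0.718 (Bluefield Run3), with gains from observation-level pooling as the main factor.
Open-set distractors degrade the performance of every system, but CNNs are relatively robust to unknown classes.
When novelty is high, mean Average Precision drops substantially (for example, below 0.45 when only 25% of queries belong to known classes).
Under moderate novelty, rejection strategies bring only limited additional benefit over the CNN-based baselines, which suggests room for adaptive open-set rejection methods.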
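
To make the idea of class-wise rejection thresholds concrete, here is a minimal sketch. It assumes the threshold of each class is a low quantile of the top-1 scores of correctly classified validation images, and that a test prediction is rejected when its top-1 score falls below the threshold of the predicted class. The fitting rule, the `quantile` and `default` parameters, and both function names are assumptions for illustration; the review does not describe how the participating teams actually set their thresholds.

```python
from typing import Dict, List, Optional


def fit_class_thresholds(val_scores: List[Dict[str, float]],
                         val_truth: List[str],
                         quantile: float = 0.1) -> Dict[str, float]:
    """Per-class threshold from correctly classified validation items."""
    correct: Dict[str, List[float]] = {}
    for scores, label in zip(val_scores, val_truth):
        top = max(scores, key=scores.get)
        if top == label:
            correct.setdefault(label, []).append(scores[top])

    thresholds = {}
    for cls, values in correct.items():
        values.sort()
        # Low quantile of the confident-and-correct scores for this class.
        thresholds[cls] = values[int(quantile * (len(values) - 1))]
    return thresholds


def predict_or_reject(scores: Dict[str, float],
                      thresholds: Dict[str, float],
                      default: float = 0.5) -> Optional[str]:
    """Return the top-1 class, or None when the item is rejected as unknown."""
    top = max(scores, key=scores.get)
    if scores[top] < thresholds.get(top, default):
        return None
    return top
```

A fixed threshold of this kind cannot adapt when the share of unknown queries changes, which is one possible reading of why the finding above points toward adaptive rejection methods.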


This review was generated by AI and checked by a human editor.