QUICK REVIEW

[Paper Digest] Plant identification in an open-world (LifeCLEF 2016)

Hervé Goëau, Pierre Bonnet|ArXiv.org|Sep 25, 2025
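Smart Agriculture and AI | References: 9 | Cited by: 53
One-sentence summary

The LifeCLEF 2016 plant identification task evaluates open-set recognition on more than 110K images covering 1000 West European plant species, compares CNN-based systems, and highlights the challenge of rejecting unknown classes.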

ABSTRACT

The LifeCLEF plant identification challenge aims at evaluating plant identification methods and systems at a very large scale, close to the conditions of a real-world biodiversity monitoring scenario. The 2016-th edition was actually conducted on a set of more than 110K images illustrating 1000 plant species living in West Europe, built through a large-scale participatory sensing platform initiated in 2011 and which now involves tens of thousands of contributors. The main novelty over the previous years is that the identification task was evaluated as an open-set recognition problem, i.e. a problem in which the recognition system has to be robust to unknown and never seen categories. Beyond the brute-force classification across the known classes of the training set, the big challenge was thus to automatically reject the false positive classification hits that are caused by the unknown classes. This overview presents more precisely the resources and assessments of the challenge, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.

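Research motivation and objectives

  • Evaluate large-scale plant identification methods under open-set conditions close to real-world biodiversity monitoring.
  • Assess robustness to unknown, never-seen plant categories while still identifying known species.
  • Provide a benchmark dataset and metrics for studying open-set performance and the rejection of unknown classes.
  • Analyze how different CNN-based and hybrid approaches perform on a test set rich in distractors.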

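Proposed method

  • Use the training set from PlantCLEF 2015, supplemented with ground-truth labels for the test images.
  • Build a test set from Pl@ntNet queries that contains both known and unknown classes (open set).
  • Evaluate submissions with mean average precision in the open-set setting (mAP-open) and a variant focused on invasive species (mAP-open-invasive).
  • Allow each group up to 4 runs, including CNN and non-CNN baselines, ensembles, and the use of metadata.
  • Evaluate rejection strategies for unknown classes and report performance at different levels of novelty.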
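To make the open-set metric concrete, the following is a minimal sketch of one plausible reading of mAP-open: each known class is treated as a query, the predictions for that class are ranked by score, and test images from unknown classes can only count as false positives. The tuple layout, the function names, and the choice to skip classes with no test image are assumptions made for illustration, not the official evaluation script.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical prediction layout: (image_id, predicted_class, score).
Prediction = Tuple[str, str, float]


def average_precision(ranked_hits: List[bool], n_relevant: int) -> float:
    """AP of a ranked hit/miss list, normalized by the number of relevant items."""
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_relevant


def map_open(predictions: List[Prediction],
             truth: Dict[str, Optional[str]],
             known_classes: List[str]) -> float:
    """Mean AP over known classes. A truth value of None marks an
    unknown-class image, which can only appear as a false positive."""
    aps = []
    for cls in known_classes:
        n_relevant = sum(1 for label in truth.values() if label == cls)
        if n_relevant == 0:
            continue  # assumption: classes absent from the test set are skipped
        ranked = sorted((p for p in predictions if p[1] == cls),
                        key=lambda p: p[2], reverse=True)
        flags = [truth.get(image_id) == cls for image_id, _, _ in ranked]
        aps.append(average_precision(flags, n_relevant))
    return sum(aps) / len(aps) if aps else 0.0


# Toy data: image "c" belongs to an unknown class but is predicted as "rose".
truth = {"a": "rose", "b": "rose", "c": None, "d": "oak"}
predictions = [("a", "rose", 0.9), ("c", "rose", 0.8),
               ("b", "rose", 0.7), ("d", "oak", 0.6)]
with_distractor = map_open(predictions, truth, ["rose", "oak"])
after_rejection = map_open([p for p in predictions if p[0] != "c"],
                           truth, ["rose", "oak"])
```

In this toy case the unknown-class image ranked second for "rose" lowers that class's AP, and dropping the prediction (a successful rejection) restores it, which is the effect the open-set evaluation is designed to measure.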

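Experimental results

Research questions

  • RQ1: How do CNN-based plant identification systems perform in an open-world setting with many unknown classes?
  • RQ2: How do unknown-class distractors affect mAP in open-set plant identification?
  • RQ3: Do explicit unknown-class rejection strategies improve robustness, and under which novelty conditions?
  • RQ4: In a stream-like scenario, how does performance degrade as the proportion of unknown queries increases?
  • RQ5: What are the relative contributions of architecture, ensembling, and metadata to open-set plant identification performance?

Key findings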

| Run | Keywords | Rejection strategy | mAP-open | mAP-open-invasive | mAP-closed |
|---|---|---|---|---|---|
| Bluefield Run4 | VGGNet, combine outputs from a same observation | thresholds by class (train+validation) | 0.742 | 0.717 | 0.827 |
| SabanciU GebzeTU Run1 | 2x(VGGNet,GoogleNet) tuned with resp. 70k, 115k training images | GoogleNet 70k/70k Plant/ImageNet | 0.738 | 0.704 | 0.806 |
| SabanciU…Run3 | SabanciUGebzeTU Run1 | Manually removed 90 test images | 0.737 | 0.703 | 0.807 |
| Bluefield Run3 | Bluefield Run 4 | thresholds by class | 0.736 | 0.718 | 0.82 |
| SabanciU…Run2 | SabanciUGebzeTU Run1 | - | 0.736 | 0.683 | 0.807 |
| SabanciU…Run4 | SabanciUGebzeTU Run1 | - | 0.735 | 0.695 | 0.802 |
| CMP Run1 | Bagging of 3xResNet-152 | - | 0.71 | 0.653 | 0.79 |
| LIIR KUL Run3 | CaffeNet, VGGNet16, 3xGoogleNet, adding 12k external plant images | threshold | 0.703 | 0.674 | 0.761 |
| LIIR KUL Run2 | LIIR KUL Run 3 | threshold | 0.692 | 0.667 | 0.744 |
| LIIR KUL Run1 | LIIR KUL Run 3 | threshold | 0.669 | 0.652 | 0.708 |
| UM Run4 | VGGNet16 | - | 0.669 | 0.598 | 0.742 |
| CMP Run2 | ResNet-152 | - | 0.644 | 0.564 | 0.729 |
| CMP Run3 | ResNet-152 (2015training) | - | 0.639 | 0.59 | 0.723 |
| QUT Run3 | 1 "general" GoogleNet, 6 "organ" GoogleNets, observation combination | - | 0.629 | 0.61 | 0.696 |
| Floristic Run3 | GoogleNet, metadata | - | 0.627 | 0.533 | 0.693 |
| UM Run1 | VGGNet16 | - | 0.627 | 0.537 | 0.7 |
| Floristic Run1 | GoogleNet | - | 0.619 | 0.541 | 0.694 |
| Bluefield Run1 | VGGNet | thresholds by class | 0.611 | 0.6 | 0.692 |
| Bluefield Run2 | VGGNet | thresholds by class | 0.611 | 0.6 | 0.693 |
| Floristic Run2 | GoogleNet | thresholds by class | 0.611 | 0.538 | 0.681 |
| QUT Run1 | GoogleNet | - | 0.601 | 0.563 | 0.672 |
| UM Run3 | VGGNet16 with dedicated and combined organ & species layers | - | 0.589 | 0.509 | 0.652 |
| QUT Run2 | 6 "organ" GoogleNets, observation combination | - | 0.564 | 0.562 | 0.641 |
| UM Run2 | VGGNet16 from scratch (without ImageNet2012) | - | 0.481 | 0.446 | 0.552 |
| QUT Run4 | QUT Run3 | threshold | 0.367 | 0.359 | 0.378 |
| BMETMIT Run4 | AlexNet & BVWs & metadata | - | 0.174 | 0.144 | 0.213 |
| BMETMIT Run3 | AlexNet & BVWs & metadata | threshold by classifier | 0.17 | 0.125 | 0.197 |
| BMETMIT Run1 | AlexNet | - | 0.169 | 0.125 | 0.196 |
| BMETMIT Run2 | BVWs (fisher vectors) | - | 0.066 | 0.128 | 0.101 |
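One way to read the results table is to compare each run's mAP-closed with its mAP-open, since the difference indicates how much the unknown-class distractors cost that run. The sketch below does this arithmetic for four runs whose scores are copied from the table; the helper name `open_set_gap` is hypothetical and the selection of runs is arbitrary.

```python
# (mAP-open, mAP-closed) pairs copied from the results table.
runs = {
    "Bluefield Run4": (0.742, 0.827),
    "CMP Run1": (0.71, 0.79),
    "UM Run4": (0.669, 0.742),
    "QUT Run4": (0.367, 0.378),
}


def open_set_gap(map_open: float, map_closed: float) -> tuple:
    """Return the absolute and relative mAP drop from closed-set to open-set."""
    absolute = map_closed - map_open
    relative = absolute / map_closed
    return absolute, relative


for name, (m_open, m_closed) in runs.items():
    absolute, relative = open_set_gap(m_open, m_closed)
    print(f"{name}: absolute drop {absolute:.3f}, relative drop {relative:.1%}")
```

For the first three runs the relative drop comes out at roughly 10%, while QUT Run4 shows a much smaller gap on top of much lower scores overall.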
  • CNN-based systems dominated the top results, with the top 26 runs using CNNs.
  • The best run reached mAP-open 0.742 (Bluefield Run4), and the best score on the invasive-species variant was mAP-open-invasive 0.718 (Bluefield Run3), with gains mainly from observation-level pooling.
  • Open-set distractors degrade performance across all systems; however, CNNs remain relatively robust to unknown classes.
  • When novelty is high, mean average precision drops significantly (e.g., below 0.45 when only 25% of queries are known).
  • Rejection strategies provided limited additional benefits over CNN baselines under moderate novelty, suggesting room for adaptive open-set rejection methods.
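Several runs in the table list "thresholds by class" as their rejection strategy. The digest does not say how those thresholds were chosen, so the following is only a minimal sketch under an assumed rule: each class threshold is a low quantile of the scores that correctly classified validation images of that class received, and a test prediction is rejected when its top score falls below the threshold of its predicted class. All names and the default quantile are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def fit_class_thresholds(val_scores: Dict[str, List[float]],
                         quantile: float = 0.1) -> Dict[str, float]:
    """Pick one threshold per class from validation scores.

    val_scores maps a class to the scores of its correctly classified
    validation images (an assumed input, not taken from the paper).
    """
    thresholds = {}
    for cls, scores in val_scores.items():
        if not scores:
            continue  # no validation evidence: leave the class without a threshold
        ordered = sorted(scores)
        index = int(quantile * (len(ordered) - 1))
        thresholds[cls] = ordered[index]
    return thresholds


def predict_or_reject(class_scores: Dict[str, float],
                      thresholds: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Return (class, score) for the top class, or None to reject as unknown."""
    best_cls = max(class_scores, key=class_scores.get)
    best_score = class_scores[best_cls]
    # Classes without a fitted threshold are never rejected (threshold 0.0).
    if best_score < thresholds.get(best_cls, 0.0):
        return None
    return best_cls, best_score


thresholds = fit_class_thresholds({"rose": [0.9, 0.8, 0.6, 0.95],
                                   "oak": [0.5, 0.7]})
rejected = predict_or_reject({"rose": 0.55, "oak": 0.30}, thresholds)
accepted = predict_or_reject({"rose": 0.20, "oak": 0.65}, thresholds)
```

A per-class threshold lets confidently separated species keep a strict cutoff while visually ambiguous species keep a looser one. A single global threshold is simpler but applies the same cutoff to both cases.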

This digest was generated by AI and reviewed by human editors.