QUICK REVIEW

[Paper Review] PlantSeg: A Large-Scale In-the-wild Dataset for Plant Disease Segmentation

Tianqi Wei, Zhi Chen|arXiv (Cornell University)|Sep 6, 2024

Smart Agriculture and AI | Cited by 8

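One-Sentence Summary

PlantSeg introduces a large-scale in-the-wild dataset with pixel-level disease segmentation masks covering 115 diseases across 34 plant types, so that plant disease segmentation can be benchmarked under realistic conditions. The paper benchmarks several baseline models, among which SegNeXt performs best on the dataset.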

ABSTRACT

Plant diseases pose significant threats to agriculture. It necessitates proper diagnosis and effective treatment to safeguard crop yields. To automate the diagnosis process, image segmentation is usually adopted for precisely identifying diseased regions, thereby advancing precision agriculture. Developing robust image segmentation models for plant diseases demands high-quality annotations across numerous images. However, existing plant disease datasets typically lack segmentation labels and are often confined to controlled laboratory settings, which do not adequately reflect the complexity of natural environments. Motivated by this fact, we established PlantSeg, a large-scale segmentation dataset for plant diseases. PlantSeg distinguishes itself from existing datasets in three key aspects. (1) Annotation type: Unlike the majority of existing datasets that only contain class labels or bounding boxes, each image in PlantSeg includes detailed and high-quality segmentation masks, associated with plant types and disease names. (2) Image source: Unlike typical datasets that contain images from laboratory settings, PlantSeg primarily comprises in-the-wild plant disease images. This choice enhances the practical applicability, as the trained models can be applied for integrated disease management. (3) Scale: PlantSeg is extensive, featuring 11,400 images with disease segmentation masks and an additional 8,000 healthy plant images categorized by plant type. Extensive technical experiments validate the high quality of PlantSeg's annotations. This dataset not only allows researchers to evaluate their image classification methods but also provides a critical foundation for developing and benchmarking advanced plant disease segmentation algorithms.

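Motivation and Objectives

Highlight that robust models require high-quality, in-the-wild plant disease segmentation data.
Describe the PlantSeg dataset, which contains 11,458 images covering 115 diseases across 34 plant types, each with detailed segmentation masks.
Establish baseline performance on PlantSeg with representative segmentation models, and analyze annotation quality and dataset characteristics.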

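Proposed Method

Build the dataset through keyword-driven collection of images from the web, covering 34 plant types and 115 diseases.
Clean the data rigorously through cross-annotator verification and review by expert plant pathologists.
Annotate detailed segmentation masks as polygons in LabelMe, including handling of overlapping lesions and distortions.
Curate metadata for each image: plant, disease, resolution, mask ratio, and train/test split.
Benchmark four semantic segmentation models (SAN, DeepLabv3, DeepLabv3+, SegNeXt) with several backbones.
Evaluate with mean Intersection over Union (mIoU) and mean accuracy (mAcc).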
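As a rough illustration of the two evaluation metrics named above, the sketch below computes mIoU and mAcc from a per-pixel confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `miou_and_macc`, the toy labels, and the choice to skip classes that never appear in the ground truth are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_macc(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (mIoU, mAcc) as fractions in [0, 1].

    Assumption: classes with no ground-truth pixels are skipped, so they
    do not drag the averages down. Other toolkits may handle this differently.
    """
    ious, accs = [], []
    for c in range(len(cm)):
        tp = cm[c][c]                        # correctly labelled pixels of class c
        gt = sum(cm[c])                      # pixels whose true label is c
        pred = sum(row[c] for row in cm)     # pixels predicted as c
        if gt == 0:
            continue
        ious.append(tp / (gt + pred - tp))   # intersection over union
        accs.append(tp / gt)                 # per-class pixel accuracy
    if not ious:
        raise ValueError("no class has ground-truth pixels")
    return sum(ious) / len(ious), sum(accs) / len(accs)


if __name__ == "__main__":
    # Toy flattened label maps with three classes (hypothetical data).
    truth = [0, 0, 1, 1, 1, 2]
    pred = [0, 1, 1, 1, 2, 2]
    miou, macc = miou_and_macc(confusion_matrix(truth, pred, 3))
    print(f"mIoU={miou:.4f}  mAcc={macc:.4f}")
```

In practice the label sequences would be the flattened ground-truth and predicted masks accumulated over the whole test split, and the fractions would be multiplied by 100 to match percentage scores.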

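Experimental Results

Research Questions

RQ1: Does a large-scale, in-the-wild plant disease segmentation dataset improve the generalization of segmentation models under field conditions?
RQ2: How do current state-of-the-art semantic segmentation methods perform on complex, real-world plant disease masks compared with laboratory datasets?
RQ3: What insights do segmentation mask ratio and image resolution offer for model design and evaluation in plant pathology?
RQ4: Does backbone size or model architecture significantly affect in-the-wild plant disease segmentation performance?
RQ5: Can the dataset serve as a comprehensive benchmark for developing plant disease segmentation methods?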

Key Findings

Method	Backbone	mIoU (%)	mAcc (%)
DeepLabv3	ResNet-50	17.24	37.95
DeepLabv3	ResNet-101	20.72	40.63
DeepLabv3+	ResNet-50	25.08	40.66
DeepLabv3+	ResNet-101	27.18	42.29
SAN	ViT-B/16	34.79	50.19
SAN	ViT-L/14	36.91	52.81
SegNeXt	MSCAN-L	44.52	59.95

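PlantSeg contains 11,458 images covering 115 disease categories across 34 plant types, plus 8,000 healthy plant images for plant type classification.
Annotation quality is verified through expert pathologist review and cross-annotator consistency checks.
SegNeXt with the MSCAN-L backbone achieves the best results, with the highest mIoU (44.52%) and mAcc (59.95%).
SAN with the ViT-L/14 backbone outperforms its ViT-B/16 variant, reaching 36.91% mIoU and 52.81% mAcc.
DeepLabv3 and DeepLabv3+ improve with the deeper ResNet-101 backbone but still trail SegNeXt on PlantSeg.
The baseline results indicate that larger backbones bring consistent gains on real lesion masks (roughly 2 to 3.5 mIoU points within each method), and that SegNeXt's multi-scale convolutional design leads the next-best model, SAN with ViT-L/14, by 7.61 mIoU points.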
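To make the comparison in the results table concrete, the following sketch recomputes the gain from the smaller to the larger backbone within each method and ranks the configurations by mIoU. The numbers are copied from the table above; the `RESULTS` dictionary and the helper names are hypothetical and exist only for this illustration.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (method, backbone)

# (mIoU %, mAcc %) per configuration, copied from the results table.
RESULTS: Dict[Key, Tuple[float, float]] = {
    ("DeepLabv3", "ResNet-50"): (17.24, 37.95),
    ("DeepLabv3", "ResNet-101"): (20.72, 40.63),
    ("DeepLabv3+", "ResNet-50"): (25.08, 40.66),
    ("DeepLabv3+", "ResNet-101"): (27.18, 42.29),
    ("SAN", "ViT-B/16"): (34.79, 50.19),
    ("SAN", "ViT-L/14"): (36.91, 52.81),
    ("SegNeXt", "MSCAN-L"): (44.52, 59.95),
}


def backbone_gain(method: str, small: str, large: str) -> Tuple[float, float]:
    """Return (mIoU gain, mAcc gain) in percentage points, rounded to 2 places."""
    small_miou, small_macc = RESULTS[(method, small)]
    large_miou, large_macc = RESULTS[(method, large)]
    return round(large_miou - small_miou, 2), round(large_macc - small_macc, 2)


def ranked_by_miou() -> List[Key]:
    """Return configurations sorted from highest to lowest mIoU."""
    return sorted(RESULTS, key=lambda k: RESULTS[k][0], reverse=True)


if __name__ == "__main__":
    pairs = [
        ("DeepLabv3", "ResNet-50", "ResNet-101"),
        ("DeepLabv3+", "ResNet-50", "ResNet-101"),
        ("SAN", "ViT-B/16", "ViT-L/14"),
    ]
    for method, small, large in pairs:
        d_miou, d_macc = backbone_gain(method, small, large)
        print(f"{method}: {small} -> {large}: +{d_miou} mIoU, +{d_macc} mAcc")
    best, runner_up = ranked_by_miou()[:2]
    gap = round(RESULTS[best][0] - RESULTS[runner_up][0], 2)
    print(f"{best} leads {runner_up} by {gap} mIoU points")
```

Because the table reports only one SegNeXt backbone, the gap between SegNeXt and SAN mixes architecture and backbone effects; the sketch reports that gap without attributing it to either.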

This review was generated by AI and reviewed by a human editor.