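[Paper Digest] PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification
PotatoGANs uses CycleGAN and Pix2Pix to generate diseased potato images from healthy ones, augmenting the data to improve potato disease detection. It pairs XAI methods with several CNNs to improve interpretability and uses Detectron2 for instance segmentation. Inception Score results indicate higher quality for the generated images, and the Detectron2 model reaches a Dice score of 0.8112.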
Numerous applications have resulted from the automation of agricultural disease segmentation using deep learning techniques. However, when applied to new conditions, these applications frequently overfit, resulting in lower segmentation performance. In potato farming, where diseases have a large influence on yields, identifying these diseases quickly and accurately is critical for the agricultural economy. Traditional data augmentation approaches, such as rotation, flip, and translation, have limitations and frequently fail to provide strong generalization results. To address these issues, our research employs a novel approach termed PotatoGANs. In this data augmentation approach, two types of Generative Adversarial Networks (GANs) are used to generate synthetic potato disease images from healthy potato images. This not only expands the dataset but also adds variety, which helps to enhance model generalization. Using the Inception Score (IS) as a measure, our experiments show the better quality and realism of the images created by PotatoGANs, emphasizing their capacity to resemble real disease images closely. The CycleGAN model outperforms the Pix2Pix GAN model in terms of image quality, achieving higher IS values of 1.2001 and 1.0900 for black scurf and common scab, respectively. This synthetic data can significantly improve the training of large neural networks. It also reduces data collection costs while enhancing data diversity and generalization capabilities. Our work improves interpretability by combining three gradient-based Explainable AI algorithms (GradCAM, GradCAM++, and ScoreCAM) with three distinct CNN architectures (DenseNet169, ResNet152V2, InceptionResNetV2) for potato disease classification.
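Motivation and Objectives
- Address overfitting and poor generalization of potato disease segmentation under new conditions.
- Develop a GAN-based data augmentation pipeline that synthesizes realistic diseased potato images from healthy potato images.
- Improve the interpretability of classification by combining GradCAM, GradCAM++, and ScoreCAM with several CNN architectures.
- Use Detectron2 to segment and localize potato diseases more accurately.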
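Proposed Method
- Use CycleGAN and Pix2Pix to translate healthy potato images into diseased counterparts for data augmentation.
- Build a dataset of generated images and manually annotated images, validated by the Bangladesh Agricultural Research Institute (BARI).
- Evaluate the quality of generated images with the Fréchet Inception Distance and the Inception Score.
- Combine GradCAM, GradCAM++, and ScoreCAM with DenseNet169, ResNet152V2, and InceptionResNetV2 for interpretable disease classification.
- Apply Detectron2 with a ResNeXt-101 backbone for instance segmentation, and report Dice and IoU metrics.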
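
The Dice and IoU metrics named in the last step both measure overlap between a predicted mask and a ground-truth mask. The following is a minimal sketch of the standard definitions for a single pair of binary masks. It is not the paper's evaluation code: the function name `dice_and_iou`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as equally sized 2D lists of 0/1."""
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as diseased in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    pred_area = sum(pred_flat)
    target_area = sum(target_flat)
    union = pred_area + target_area - intersection

    if union == 0:
        # Both masks empty: treated here as perfect agreement (an assumed convention).
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 2 x 3 masks, invented for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(round(dice, 4), iou)  # 0.6667 0.5
```

For a single mask pair the two metrics are tied by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU on the same prediction.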

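Experimental Results
Research Questions
- RQ1: Does GAN-based image generation improve the generalization of potato disease classification and localization?
- RQ2: How do XAI methods visualize and validate the decision process of CNN classifiers for potato diseases?
- RQ3: Does Detectron2-based instance segmentation improve disease localization accuracy on generated data?
- RQ4: Which CNN architectures benefit most from the combined XAI methods for potato disease classification?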
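
The visualizations behind RQ2 are class activation heatmaps. As a rough illustration of how plain Grad-CAM builds one, the sketch below weights each feature map by the spatial average of its gradient, sums the weighted maps, and applies a ReLU. It assumes the activations and gradients of the last convolutional layer have already been extracted from a classifier, and it covers Grad-CAM only; GradCAM++ and ScoreCAM compute the channel weights differently. The function name `grad_cam` and the nested-list inputs are hypothetical and not taken from the paper.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from K feature maps and the class-score gradients w.r.t. them.

    Both arguments are lists of K maps, each an H x W nested list of floats.
    """
    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]

    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all spatial positions.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * fmap[i][j]

    # ReLU keeps only the regions that push the class score up.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]

    # Scale to [0, 1] for display; an all-zero map is left unchanged.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Two toy 2 x 2 feature maps with invented values.
activations = [[[1.0, 0.0], [0.0, 0.0]],
               [[0.0, 0.0], [0.0, 2.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]],       # channel 0 supports the class
             [[-1.0, -1.0], [-1.0, -1.0]]]   # channel 1 opposes it
print(grad_cam(activations, gradients))
```

In this toy case only the top-left position stays active, because the second channel has a negative weight and its contribution is removed by the ReLU.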
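Key Findings
- CycleGAN outperforms Pix2Pix in image quality, based on Inception Scores for black scurf (IS = 1.2001) and common scab (IS = 1.0900).
- The generated synthetic data helps train large neural networks and reduces data collection costs.
- Detectron2 with a ResNeXt-101 backbone achieves a Dice score of 0.8112.
- Combining three gradient-based XAI methods (GradCAM, GradCAM++, ScoreCAM) with three CNNs (DenseNet169, ResNet152V2, InceptionResNetV2) gives deeper insight into the models' decisions.
- The expanded dataset, together with generation and segmentation, helps improve disease identification and localization in potato crops.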
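
To put the reported Inception Scores in context, the sketch below computes the standard definition, IS = exp(mean over images of KL(p(y|x) || p(y))), from class-probability vectors. It assumes those probabilities have already been produced by a classifier (Inception-v3 in the usual formulation) and does not reproduce the paper's evaluation pipeline. The function name `inception_score` and the toy inputs are illustrative assumptions.

```python
import math


def inception_score(probs):
    """Inception Score from a list of per-image class-probability vectors.

    Each vector is p(y|x) for one generated image and should sum to 1.
    """
    n = len(probs)
    num_classes = len(probs[0])

    # Marginal class distribution p(y): the average prediction over all images.
    marginal = [sum(p[c] for p in probs) / n for c in range(num_classes)]

    # Mean KL divergence between each p(y|x) and p(y); zero-probability terms add nothing.
    kl_total = 0.0
    for p in probs:
        kl_total += sum(
            pc * math.log(pc / marginal[c]) for c, pc in enumerate(p) if pc > 0
        )
    return math.exp(kl_total / n)


# Identical, uncertain predictions give the minimum score of 1.
print(inception_score([[0.5, 0.5], [0.5, 0.5]]))
# Confident predictions spread evenly over both classes approach the class count.
print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
```

By construction the score is at least 1 and at most the number of classes, so values are only comparable between models scored with the same classifier and the same image set.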

This digest was generated by AI and reviewed by a human editor.