QUICK REVIEW

[Paper Review] The Good, the Bad and the Ugly: Evaluating Convolutional Neural Networks for Prohibited Item Detection Using Real and Synthetically Composited X-ray Imagery

Neelanjan Bhowmik, Qian Wang | arXiv (Cornell University) | Sep 9, 2019

Advanced X-ray and CT Imaging | Cited by 13

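One-Sentence Summary

This study evaluates a Faster R-CNN detector with a ResNet-101 backbone for detecting firearms, firearm parts, and knives in X-ray security imagery, using both real and synthetically composited training data. It reaches 0.88 mAP with real data and 0.78 mAP with synthetic data, suggesting that synthetic imagery is a viable way to diversify training data despite the performance gap.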

ABSTRACT

Detecting prohibited items in X-ray security imagery is pivotal in maintaining border and transport security against a wide range of threat profiles. Convolutional Neural Networks (CNN) with the support of a significant volume of data have brought advancement in such automated prohibited object detection and classification. However, collating such large volumes of X-ray security imagery remains a significant challenge. This work opens up the possibility of using synthetically composed imagery, avoiding the need to collate such large volumes of hand-annotated real-world imagery. Here we investigate the difference in detection performance achieved using real and synthetic X-ray training imagery for CNN architecture detecting three exemplar prohibited items, {Firearm, Firearm Parts, Knives}, within cluttered and complex X-ray security baggage imagery. We achieve 0.88 of mean average precision (mAP) with a Faster R-CNN and ResNet-101 CNN architecture for this 3-class object detection using real X-ray imagery. While the performance is comparable with synthetically composited X-ray imagery (0.78 mAP), our extended evaluation demonstrates both challenge and promise of using synthetically composed images to diversify the X-ray security training imagery for automated detection algorithm training.

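Motivation and Objectives

Evaluate how convolutional neural networks perform at prohibited item detection when using synthetically composited X-ray imagery.
Compare detection accuracy between real and synthetic X-ray training data across three key threat classes (firearms, firearm parts, and knives).
Investigate whether synthetic imagery can reduce the dependence on large volumes of hand-annotated real-world X-ray data.
Assess the challenges and potential of using synthetic data to improve model diversity and robustness in X-ray security screening applications.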

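Proposed Method

Train a Faster R-CNN model with a ResNet-101 backbone on real X-ray security imagery containing cluttered baggage scenes.
Generate synthetically composited X-ray images by digitally inserting prohibited items (firearms, firearm parts, knives) into real baggage scans.
Apply data augmentation to increase the diversity of the synthetic samples and improve generalization potential.
Evaluate model performance on the three target classes using mean average precision (mAP) under the same evaluation protocol.
Apply transfer learning and fine-tuning to optimize performance on both the real and synthetic training sets.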
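
The compositing step in the list above can be illustrated with a minimal sketch. This review does not describe the paper's actual compositing pipeline, so the sketch rests on assumptions made purely for illustration: images are grayscale transmission maps with values in [0, 1] (1.0 means no attenuation), and an item is inserted by multiplying transmission values, which mirrors how attenuation accumulates along an X-ray path. The function name `composite_xray` and its parameters are hypothetical.

```python
def composite_xray(bag, item, top, left):
    """Insert `item` into `bag` by multiplying transmission values.

    Both images are lists of rows of floats in [0, 1], where 1.0 means the
    ray passed through unattenuated (an assumed representation). Returns the
    new image and the bounding box (x_min, y_min, x_max, y_max) of the item.
    """
    item_h, item_w = len(item), len(item[0])
    bag_h, bag_w = len(bag), len(bag[0])
    if top < 0 or left < 0 or top + item_h > bag_h or left + item_w > bag_w:
        raise ValueError("item does not fit inside the bag image")

    out = [row[:] for row in bag]  # copy so the input image is left untouched
    for r in range(item_h):
        for c in range(item_w):
            # Overlapping materials attenuate jointly, so transmissions multiply.
            out[top + r][left + c] = bag[top + r][left + c] * item[r][c]

    box = (left, top, left + item_w, top + item_h)
    return out, box


# Toy data: a uniform 4x4 "bag" and a 2x2 "item" patch (not real X-ray values).
bag = [[0.8] * 4 for _ in range(4)]
knife = [[0.5, 1.0], [0.5, 0.5]]
image, box = composite_xray(bag, knife, top=1, left=2)
```

Because the insertion position is chosen by the program, the bounding box is known without manual labeling, which is the practical appeal of synthetic compositing over hand-annotated real imagery.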

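Experimental Results

Research Questions

RQ1: How does the model's prohibited item detection performance differ between real and synthetically composited X-ray imagery?
RQ2: To what extent can synthetic X-ray data maintain detection accuracy comparable to real data in complex, cluttered baggage scenes?
RQ3: What are the key challenges in training CNNs with synthetic data for X-ray security screening applications?
RQ4: Can synthetic data effectively diversify training data and improve the generalization of automated prohibited item detection models?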

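Key Findings

When trained and evaluated on real X-ray imagery, the model achieves a mean average precision (mAP) of 0.88.
When trained on synthetically composited X-ray imagery, the model achieves 0.78 mAP, indicating a notable but acceptable performance gap of 0.10 mAP.
Synthetic imagery shows potential to diversify training data and reduce the dependence on large volumes of annotated real-world imagery.
The performance gap indicates that synthetic data needs further refinement to match the realism and feature fidelity of real X-ray scans.
The extended evaluation confirms both the challenges and the promise of synthetic data for increasing training data diversity in automated detection systems.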
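
To make the mAP figures above concrete, the following minimal sketch shows one common way to compute average precision (AP) for a single class from ranked detections and then average it across classes. It assumes detections have already been matched to ground truth (for example with an IoU threshold) and uses non-interpolated AP; the exact evaluation protocol of the paper is not stated in this review. The function names and the toy data are hypothetical and do not reproduce the paper's results.

```python
def average_precision(detections, num_ground_truth):
    """Non-interpolated AP for one class.

    `detections` is a list of (confidence, is_true_positive) pairs;
    `num_ground_truth` is the number of annotated objects of that class.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_pos = 0
    ap = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_pos += 1
            # Each true positive raises recall by 1 / num_ground_truth;
            # weight the precision at this rank by that recall step.
            ap += (true_pos / rank) / num_ground_truth
    return ap


def mean_average_precision(per_class):
    """Mean of per-class APs; `per_class` maps name -> (detections, n_gt)."""
    aps = {name: average_precision(d, n) for name, (d, n) in per_class.items()}
    return sum(aps.values()) / len(aps), aps


# Toy detections for the three classes named in the abstract (made-up values).
toy = {
    "Firearm": ([(0.9, True), (0.8, True), (0.3, False)], 2),
    "Firearm Parts": ([(0.9, True), (0.6, False), (0.5, True)], 2),
    "Knives": ([(0.7, False), (0.4, True)], 1),
}
map_score, per_class_ap = mean_average_precision(toy)
```

Reporting the per-class APs alongside the mean is useful here, since a single mAP value can hide a class (such as small firearm parts) on which synthetic training data performs noticeably worse.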

This review was generated by AI and reviewed by human editors.