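[Paper Summary] Compositional Convolutional Networks For Robust Object Classification under Occlusion.
This paper proposes a hybrid model that combines deep convolutional neural networks (DCNNs) with compositional object models to classify objects robustly under partial occlusion and mask attacks. DCNN features are used for the initial classification and for uncertainty detection, and a learned part-based compositional model is applied when occlusion is detected. The method keeps high accuracy on non-occluded images while substantially improving robustness to occlusion, and it does so without using occluded data during training.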
Deep convolutional neural networks (DCNNs) are powerful models that yield impressive results at object classification. However, recent work has shown that they do not generalize well to partially occluded objects and to mask attacks. In contrast to DCNNs, compositional models are robust to partial occlusion, however, they are not as discriminative as deep models. In this work, we combine DCNNs and compositional object models to retain the best of both approaches: a discriminative model that is robust to partial occlusion and mask attacks. Our model is learned in two steps. First, a standard DCNN is trained for image classification. Subsequently, we cluster the DCNN features into dictionaries. We show that the dictionary components resemble object part detectors and learn the spatial distribution of parts for each object class. We propose mixtures of compositional models to account for large changes in the spatial activation patterns (e.g. due to changes in the 3D pose of an object). At runtime, an image is first classified by the DCNN in a feedforward manner. The prediction uncertainty is used to detect partially occluded objects, which in turn are classified by the compositional model. Our experimental results demonstrate that combining compositional models and DCNNs resolves a fundamental problem of current deep learning approaches to computer vision: The combined model recognizes occluded objects, even when it has not been exposed to occluded objects during training, while at the same time maintaining high discriminative performance for non-occluded objects.
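Research Motivation and Goals
- Address the poor generalization of deep convolutional neural networks (DCNNs) to partially occluded objects and adversarial mask attacks.
- Combine the discriminative power of DCNNs with the occlusion robustness of compositional models.
- Classify occluded objects accurately even when no such samples were seen during training.
- Use mixtures of compositional models to capture the spatial variation in object part activations across different 3D poses.
- At inference time, use the DCNN's prediction uncertainty to detect occlusion and switch to the compositional model for robust classification.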
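Proposed Method
- First, train a standard DCNN for image classification; its feature maps feed the later steps.
- Cluster the features of the trained DCNN into dictionaries whose components resemble object part detectors.
- Learn the spatial distribution of parts for each object class from the clustered features.
- Use mixtures of compositional models to handle large changes in part activation patterns caused by changes in 3D pose.
- At inference time, the DCNN classifies the image in a feedforward pass; its prediction uncertainty is used to identify potential occlusion.
- Reclassify occluded objects with the compositional model, which relies on the part detectors and spatial priors.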
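The dictionary and spatial-distribution steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses spherical k-means as a stand-in clustering method (the summary does not say which clustering the paper uses), hard one-hot part assignments, and a single spatial model per class instead of a mixture. All function names and parameters (`learn_dictionary`, `spatial_distribution`, `smoothing`, and so on) are hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products act as cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def learn_dictionary(features, k, iters=20, seed=0):
    """Spherical k-means (an assumed stand-in) over (N, C) feature vectors.

    Returns a (k, C) array of unit-length dictionary components.
    """
    rng = np.random.default_rng(seed)
    x = l2_normalize(features)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(x @ centers.T, axis=1)  # closest component per vector
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = l2_normalize(members.mean(axis=0))
    return centers

def part_activation(feature_map, centers):
    """One-hot (H, W, k) map marking the closest dictionary component per position."""
    h, w, c = feature_map.shape
    sims = l2_normalize(feature_map.reshape(-1, c)) @ centers.T
    onehot = np.eye(len(centers))[np.argmax(sims, axis=1)]
    return onehot.reshape(h, w, -1)

def spatial_distribution(feature_maps, centers, smoothing=1e-3):
    """Per-position part probabilities (H, W, k) for the images of one class."""
    acts = np.stack([part_activation(f, centers) for f in feature_maps])
    probs = acts.mean(axis=0) + smoothing  # smoothing avoids log(0) later
    return probs / probs.sum(axis=-1, keepdims=True)

def class_log_likelihood(feature_map, centers, probs):
    """Score an image under one class's spatial part distribution."""
    act = part_activation(feature_map, centers)
    return float(np.sum(act * np.log(probs)))
```

In this sketch, classification with the compositional model would amount to computing `class_log_likelihood` for every class and taking the highest score. Handling 3D pose changes would require several `probs` arrays per class (a mixture), which is left out here for brevity.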
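Experimental Results
Research Questions
- RQ1: Can a hybrid model that combines a DCNN with compositional models improve robustness to partial occlusion without using occluded training samples?
- RQ2: How can the prediction uncertainty of the DCNN be used to detect occluded objects at inference time?
- RQ3: How well does a compositional model with learned part detectors and spatial priors generalize to unseen occlusion patterns?
- RQ4: Can mixtures of compositional models effectively capture the spatial variation in part activations caused by changes in 3D pose?
- RQ5: Does the combined model improve robustness on occluded images while maintaining high discriminative performance on non-occluded images?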
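Key Findings
- The combined model achieves high accuracy on non-occluded images, retaining the discriminative power of the underlying DCNN.
- The model classifies occluded objects successfully even though no occluded samples appear during training.
- Prediction uncertainty reliably detects occluded instances and triggers reclassification by the compositional model.
- The compositional model, with its learned part detectors and spatial priors, significantly improves robustness to mask attacks and partial occlusion.
- Mixtures of compositional models effectively capture the spatial variation in part activations across different 3D poses.
- The approach addresses a fundamental limitation of current deep learning models for computer vision: poor generalization under occlusion.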
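The uncertainty-based switch between the two models can be illustrated with a short sketch. It assumes softmax entropy as the uncertainty measure and a fixed threshold; the summary does not state which measure or threshold the paper uses, so both are assumptions, and `classify`, `compositional_fn`, and `entropy_threshold` are hypothetical names.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy of the class distribution (assumed uncertainty measure)."""
    return float(-np.sum(probs * np.log(probs + eps)))

def classify(dcnn_logits, compositional_fn, entropy_threshold=0.5):
    """Return (label, source).

    The DCNN prediction is kept when it is confident; otherwise the image is
    treated as possibly occluded and compositional_fn() is called to obtain
    per-class scores from the compositional model.
    """
    probs = softmax(np.asarray(dcnn_logits, dtype=float))
    if predictive_entropy(probs) <= entropy_threshold:
        return int(np.argmax(probs)), "dcnn"
    scores = np.asarray(compositional_fn(), dtype=float)
    return int(np.argmax(scores)), "compositional"

# A confident DCNN output is kept; a flat one falls back to the compositional scores.
confident = classify([10.0, 0.0, 0.0], lambda: [-5.0, -1.0, -9.0])
uncertain = classify([1.0, 1.0, 1.0], lambda: [-5.0, -1.0, -9.0])
```

Passing the compositional model as a callable means it only runs for images flagged as uncertain, which mirrors the two-stage inference described above. In practice the threshold would have to be tuned on held-out data.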
This summary was generated by AI and reviewed by human editors.