QUICK REVIEW

[Paper Review] Deformable Convolutional Networks

Jifeng Dai, Haozhi Qi | arXiv (Cornell University) | Mar 17, 2017

Advanced Neural Network Applications | References: 41 | Cited by: 434

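ONE-SENTENCE SUMMARY

The paper introduces deformable convolution and deformable RoI pooling to give CNNs dense, input-dependent spatial transformations, improving detection and segmentation performance at modest overhead.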

ABSTRACT

Convolutional neural networks (CNNs) are inherently limited to model geometric transformations due to the fixed geometric structures in its building modules. In this work, we introduce two new modules to enhance the transformation modeling capacity of CNNs, namely, deformable convolution and deformable RoI pooling. Both are based on the idea of augmenting the spatial sampling locations in the modules with additional offsets and learning the offsets from target tasks, without additional supervision. The new modules can readily replace their plain counterparts in existing CNNs and can be easily trained end-to-end by standard back-propagation, giving rise to deformable convolutional networks. Extensive experiments validate the effectiveness of our approach on sophisticated vision tasks of object detection and semantic segmentation. The code would be released.

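MOTIVATION AND OBJECTIVES

Explain how the fixed geometric structures in CNN building modules limit the modeling of geometric transformations.
Propose deformable convolution, which learns dense sampling offsets from data.
Propose deformable RoI pooling, which lets pooling regions adapt to object shape.
Show that the deformable modules can replace their plain counterparts and can be trained end-to-end.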

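PROPOSED METHOD

Deformable convolution augments the regular sampling grid with a learnable 2D offset for each sampling location.
The offsets are produced by a parallel convolutional layer and learned end-to-end, with gradients back-propagated through the bilinear interpolation.
Deformable RoI pooling adds a learnable offset to each RoI pooling bin and uses bilinear interpolation for fractional positions.
Deformable PS RoI pooling extends deformable RoI pooling with class-specific score maps and fully convolutional offset learning.
The experiments integrate the deformable modules into ResNet-101 and Aligned-Inception-ResNet backbones, covering both segmentation and detection pipelines.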
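
The sampling step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes a single-channel input, a 3x3 kernel, stride 1 and zero padding, and it takes the offsets as an argument rather than predicting them with a parallel convolutional layer. The function names bilinear_sample and deformable_conv2d are hypothetical.

```python
import numpy as np


def bilinear_sample(x, py, px):
    """Sample 2D array x at fractional (py, px); positions outside x count as 0."""
    h, w = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Bilinear weight: product of (1 - distance) along each axis.
                weight = (1.0 - abs(py - yy)) * (1.0 - abs(px - xx))
                value += weight * x[yy, xx]
    return value


def deformable_conv2d(x, weight, offsets):
    """Single-channel 3x3 deformable convolution (stride 1, zero padding 1).

    x:       (H, W) input feature map
    weight:  (3, 3) kernel
    offsets: (H, W, 9, 2) per-location (dy, dx) for each of the 9 kernel taps
             (assumed given here; in the paper they come from a conv layer)
    """
    h, w = x.shape
    # Regular 3x3 sampling grid, row-major, matching the offset ordering.
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for n, (dy, dx) in enumerate(grid):
                off_y, off_x = offsets[i, j, n]
                # Sample at the regular grid position plus its learned offset.
                acc += weight[dy + 1, dx + 1] * bilinear_sample(
                    x, i + dy + off_y, j + dx + off_x
                )
            out[i, j] = acc
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((5, 6))
kernel = rng.standard_normal((3, 3))
zero_offsets = np.zeros((5, 6, 9, 2))
y = deformable_conv2d(x, kernel, zero_offsets)  # equals a plain 3x3 convolution
```

With all offsets at zero the sketch reduces to an ordinary zero-padded 3x3 convolution (cross-correlation), which is why the module can stand in for its plain counterpart. Because the bilinear weights are piecewise linear in the offsets, gradients can flow back to whatever layer predicts them.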

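EXPERIMENTAL RESULTS

Research Questions

RQ1: Can learnable spatial offsets let CNNs model large or non-rigid geometric transformations without hand-crafted modules?
RQ2: Do deformable convolution and deformable RoI pooling improve semantic segmentation and object detection performance on standard benchmarks?
RQ3: Is end-to-end training through bilinear interpolation sufficient to learn meaningful offsets in dense prediction settings?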

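Key Findings

Deformable modules make the receptive field adaptive, so that it varies with object size and shape.
Adding deformable RoI pooling improves localization, especially for non-rigid objects.
Combining deformable convolution with deformable RoI pooling yields significant gains over plain CNNs on segmentation and detection benchmarks.
The offsets are learned end-to-end and are typically small; zero initialization keeps the disturbance to early training minimal.
Deformable ConvNets add only modest extra parameters and computation while delivering significant accuracy gains.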
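
The role of zero-initialized offsets can be illustrated with a small NumPy sketch of deformable RoI pooling. This is an assumption-laden sketch rather than the paper's implementation: each bin is averaged over a fixed number of bilinear samples at sub-bin centers, the normalized per-bin offsets are passed in instead of being predicted by a learned layer, and the scaling factor gamma=0.1 is used as an illustrative default. The names bilinear_sample and deformable_roi_pool are hypothetical.

```python
import numpy as np


def bilinear_sample(x, py, px):
    """Sample 2D array x at fractional (py, px); positions outside x count as 0."""
    h, w = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                value += (1.0 - abs(py - yy)) * (1.0 - abs(px - xx)) * x[yy, xx]
    return value


def deformable_roi_pool(x, roi, k, norm_offsets, gamma=0.1, samples=2):
    """Average-pool an RoI of x into k x k bins, shifting each bin by its offset.

    roi:          (top, left, height, width) in feature-map coordinates
    norm_offsets: (k, k, 2) normalized (dy, dx) per bin; each is scaled by
                  gamma and the RoI size (gamma is an assumed default here)
    samples:      bilinear samples per bin along each axis (an assumption)
    """
    top, left, h, w = roi
    bin_h, bin_w = h / k, w / k
    out = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            # Scale the normalized offset so it is relative to the RoI size.
            dy = gamma * norm_offsets[i, j, 0] * h
            dx = gamma * norm_offsets[i, j, 1] * w
            total = 0.0
            for a in range(samples):
                for b in range(samples):
                    py = top + (i + (a + 0.5) / samples) * bin_h + dy
                    px = left + (j + (b + 0.5) / samples) * bin_w + dx
                    total += bilinear_sample(x, py, px)
            out[i, j] = total / samples**2
    return out


# Feature map whose value equals its column index, so shifts are easy to read off.
feat = np.tile(np.arange(10.0), (10, 1))
roi = (2, 2, 4, 4)
pooled_plain = deformable_roi_pool(feat, roi, 2, np.zeros((2, 2, 2)))
shift_right = np.zeros((2, 2, 2))
shift_right[..., 1] = 1.0  # normalized dx = 1 for every bin
pooled_shifted = deformable_roi_pool(feat, roi, 2, shift_right)
```

With zero offsets every bin is pooled over its regular, undeformed region, so a network that starts from zero-initialized offsets initially behaves like one with plain RoI pooling. In the shifted case each bin moves right by gamma times the RoI width, and the pooled values on this ramp-shaped feature map rise by the same amount.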

This review was generated by AI and reviewed by a human editor.