QUICK REVIEW

[Paper Review] Pooling is neither necessary nor sufficient for appropriate deformation stability in CNNs

Avraham Ruderman, Neil C. Rabinowitz|arXiv (Cornell University)|Apr 12, 2018

Adversarial Robustness in Machine Learning | Cited by 25

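One-Sentence Summary

This paper challenges the long-standing assumption that pooling layers are necessary for deformation stability in convolutional neural networks (CNNs). It shows that deformation stability arises from filter smoothness learned during training rather than from the pooling operation; pooling instead introduces excessive invariance that the network must correct during training. Pooling is therefore neither necessary nor sufficient for optimal stability.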

ABSTRACT

Many of our core assumptions about how neural networks operate remain empirically untested. One common assumption is that convolutional neural networks need to be stable to small translations and deformations to solve image recognition tasks. For many years, this stability was baked into CNN architectures by incorporating interleaved pooling layers. Recently, however, interleaved pooling has largely been abandoned. This raises a number of questions: Are our intuitions about deformation stability right at all? Is it important? Is pooling necessary for deformation invariance? If not, how is deformation invariance achieved in its absence? In this work, we rigorously test these questions, and find that deformation stability in convolutional networks is more nuanced than it first appears: (1) Deformation invariance is not a binary property, but rather that different tasks require different degrees of deformation stability at different layers. (2) Deformation stability is not a fixed property of a network and is heavily adjusted over the course of training, largely through the smoothness of the convolutional filters. (3) Interleaved pooling layers are neither necessary nor sufficient for achieving the optimal form of deformation stability for natural image classification. (4) Pooling confers too much deformation stability for image classification at initialization, and during training, networks have to learn to counteract this inductive bias. Together, these findings provide new insights into the role of interleaved pooling and deformation invariance in CNNs, and demonstrate the importance of rigorous empirical testing of even our most basic assumptions about the working of neural networks.

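Research Motivation and Objectives

Empirically test whether pooling is necessary or sufficient for deformation stability in CNNs on image classification tasks.
Study how deformation stability evolves during training in networks with and without pooling layers.
Determine whether the inductive bias of pooling helps or hinders reaching the optimal deformation stability for image classification.
Investigate the role of filter smoothness in achieving deformation stability when pooling is absent.
Assess how the joint distribution of inputs and labels shapes the final pattern of deformation stability across layers.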

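Proposed Method

Designed a parameterized, tunable class of image deformations, including affine transformations and thin-plate splines, to probe network responses.
Trained CNNs with and without interleaved pooling layers on CIFAR-10 and ImageNet, and measured their sensitivity to deformation both at initialization and after full training.
Quantified deformation stability as the average change in response caused by deforming the input, measured at each layer and across architectures.
Smoothed convolution kernels with a Gaussian filter, measured filter smoothness, and correlated it with deformation stability.
Trained on random labels to separate the effect of task structure (P(Y|X)) from the effect of the data distribution (P(X)) on deformation stability patterns.
Compared layerwise deformation stability and filter smoothness across architectures and training regimes to identify patterns of convergence.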
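The measurement described above can be illustrated with a small sketch. This is not the authors' code: the cosine-distance metric, its normalization by the distance between different images, the circular-shift deformation (a simple stand-in for affine or thin-plate-spline warps), and all function names are assumptions chosen for illustration. The "layers" here are untrained toy maps on white-noise images, so the sketch only shows how a pooling step changes sensitivity before any training.

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity between two flattened arrays."""
    a, b = a.ravel(), b.ravel()
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def random_shift(image, rng, max_shift=2):
    """Hypothetical stand-in deformation: circular shift by up to max_shift pixels."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(image, shift=(int(dy), int(dx)), axis=(0, 1))

def avg_pool(image, k=4):
    """Non-overlapping k x k average pooling (image sides must be divisible by k)."""
    h, w = image.shape
    return image.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def deformation_sensitivity(layer_fn, images, deform_fn, rng):
    """Mean distance between representations of original and deformed images,
    divided by the median distance between representations of different images."""
    reps = [layer_fn(x) for x in images]
    reps_deformed = [layer_fn(deform_fn(x, rng)) for x in images]
    within = np.mean([cosine_distance(r, d) for r, d in zip(reps, reps_deformed)])
    # Baseline: compare each image with a different image (its neighbour in the list).
    between = np.median([cosine_distance(reps[i], reps[i - 1]) for i in range(len(reps))])
    return float(within / between)

rng = np.random.default_rng(0)
images = [rng.standard_normal((32, 32)) for _ in range(32)]  # toy white-noise "images"

s_pixels = deformation_sensitivity(lambda x: x, images, random_shift, rng)
s_pooled = deformation_sensitivity(avg_pool, images, random_shift, rng)
print(f"no pooling: {s_pixels:.2f}, 4x4 average pooling: {s_pooled:.2f}")
```

In this toy setting the pooled representation should come out less sensitive to the shift than the raw pixels, which mirrors the claim that pooling confers deformation stability by construction rather than through learning.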

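Experimental Results

Research Questions

RQ1: Is pooling necessary for deformation stability in CNNs on image classification tasks?
RQ2: Is pooling sufficient to achieve optimal deformation stability for image recognition?
RQ3: How does deformation stability evolve during training in networks with and without pooling layers?
RQ4: To what extent does filter smoothness promote deformation stability when pooling is absent?
RQ5: How do the input data distribution and the supervised task (label structure) jointly shape the learned pattern of deformation stability?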

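Key Findings

Networks without pooling are fairly sensitive to deformation at initialization, but learn deformation stability during training through smoother filters.
Interleaved pooling layers confer too much deformation stability at initialization, which has to be counteracted during training; this indicates that the inductive bias of pooling is too strong for image classification.
After training, networks with and without pooling converge to a similar layerwise pattern of deformation stability, regardless of architecture.
Filter smoothness is the main driver of deformation stability: the smoother the filters, the stronger the invariance to deformation.
When trained on random labels (no task structure), deformation stability patterns are governed mainly by the architecture's inductive bias; when a real task is present, task structure overrides architecture-specific biases.
The joint distribution of inputs and labels (P(X,Y)) has a decisive influence on the final pattern of deformation stability, indicating that the supervised task itself is a key determinant of stability.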
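The link between filter smoothness and deformation stability can be made concrete with a minimal sketch. Everything here is an illustrative assumption rather than the paper's procedure: smoothness is scored by a normalized total variation, smoothing is a separable Gaussian applied to a random 7x7 filter, the deformation is a one-pixel circular shift, and the helper names are hypothetical.

```python
import numpy as np

def gaussian_smooth(filt, sigma=1.5):
    """Smooth a square 2D filter separably with a row-normalised Gaussian matrix."""
    idx = np.arange(filt.shape[0])
    g = np.exp(-((idx[:, None] - idx[None, :]) ** 2) / (2.0 * sigma ** 2))
    g /= g.sum(axis=1, keepdims=True)
    return g @ filt @ g.T

def normalized_total_variation(filt):
    """Sum of absolute neighbour differences divided by the sum of absolute weights.
    Lower values mean a smoother filter."""
    tv = np.abs(np.diff(filt, axis=0)).sum() + np.abs(np.diff(filt, axis=1)).sum()
    return float(tv / np.abs(filt).sum())

def circular_conv2d(image, filt):
    """Circular 2D convolution via FFT (the filter is zero-padded to the image size)."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(filt, s=image.shape)))

def shift_sensitivity(filt, image, shift=1):
    """Cosine distance between the filter's responses to an image and to a shifted copy."""
    y = circular_conv2d(image, filt).ravel()
    y_shifted = circular_conv2d(np.roll(image, shift, axis=1), filt).ravel()
    return 1.0 - float(y @ y_shifted) / (np.linalg.norm(y) * np.linalg.norm(y_shifted))

rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))      # toy white-noise "image"
raw_filter = rng.standard_normal((7, 7))   # random filter, as at initialization
smooth_filter = gaussian_smooth(raw_filter)

for name, f in [("raw", raw_filter), ("smoothed", smooth_filter)]:
    print(f"{name}: TV = {normalized_total_variation(f):.2f}, "
          f"shift sensitivity = {shift_sensitivity(f, image):.2f}")
```

With no pooling anywhere in this pipeline, the smoothed filter should show both a lower total variation and a smaller response change under the shift, consistent with the finding that smoothness alone can supply deformation stability.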

This review was generated by AI and checked by a human editor.