QUICK REVIEW

[Paper Review] On the importance of single directions for generalization

Ari S. Morcos, David G. T. Barrett|arXiv (Cornell University)|Mar 19, 2018

Advanced Vision and Imaging|References 19|Cited by 195

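ONE-SENTENCE SUMMARY

Networks that memorize rely more heavily on single directions in activation space, and better generalization is associated with lower reliance on single directions. Batch normalization reduces this reliance, and class selectivity is a poor predictor of a unit's importance.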

ABSTRACT

Despite their ability to memorize large datasets, deep neural networks often achieve good generalization performance. However, the differences between the learned solutions of networks which generalize and those which do not remain unclear. Additionally, the tuning properties of single directions (defined as the activation of a single unit or some linear combination of units in response to some input) have been highlighted, but their importance has not been evaluated. Here, we connect these lines of inquiry to demonstrate that a network's reliance on single directions is a good predictor of its generalization performance, across networks trained on datasets with different fractions of corrupted labels, across ensembles of networks trained on datasets with unmodified labels, across different hyperparameters, and over the course of training. While dropout only regularizes this quantity up to a point, batch normalization implicitly discourages single direction reliance, in part by decreasing the class selectivity of individual units. Finally, we find that class selectivity is a poor predictor of task importance, suggesting not only that networks which generalize well minimize their dependence on individual units by reducing their selectivity, but also that individually selective units may not be necessary for strong network performance.

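MOTIVATION AND OBJECTIVES

Investigate whether a network's generalization performance is related to its reliance on single directions in activation space.
Examine how perturbing (ablating) single directions affects networks trained with different levels of label corruption and different architectures.
Assess how regularizers such as dropout and batch normalization affect reliance on single directions.
Assess whether the class selectivity of a single direction predicts its importance to the network's output.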

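PROPOSED METHOD

Define a single direction as the activation of a single unit, or of some linear combination of units, in response to an input.
Ablate directions in activation space by clamping the selected directions to zero, and measure the resulting performance drop across different subsets of directions.
Add Gaussian noise to units to test reliance on random directions, with the noise scaled by each unit's activation variance.
Use a neuroscience-inspired class selectivity index to quantify how selectively each unit responds to individual classes.
Compare networks trained on datasets with corrupted and uncorrupted labels across architectures (multilayer perceptrons on MNIST, convolutional networks on CIFAR-10, and a ResNet on ImageNet).
Analyze the effect of batch normalization and dropout on single-direction reliance and class selectivity.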
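
The ablation and noise perturbations listed above can be sketched on a toy network. This is a minimal NumPy illustration, not the authors' code: the two-layer ReLU network, its random weights, the synthetic inputs, and the helper names (`forward`, `cumulative_ablation_curve`, `noise_curve`) are all assumptions made for the example. Labels are set to the network's own unperturbed predictions, so accuracy starts at 1.0 and any drop comes from the perturbation alone. The noise variance is taken as a multiple of each unit's empirical activation variance, which is one possible reading of "scaled by activation variance".

```python
import numpy as np

def forward(x, w1, w2, mask=None, noise=None):
    """Toy two-layer ReLU network (hypothetical stand-in for a trained model)."""
    h = np.maximum(x @ w1, 0.0)       # hidden-unit activations
    if noise is not None:
        h = h + noise                 # perturb units with additive noise
    if mask is not None:
        h = h * mask                  # clamp ablated units to zero
    return h @ w2

def accuracy(logits, y):
    return float(np.mean(np.argmax(logits, axis=1) == y))

def cumulative_ablation_curve(x, y, w1, w2, rng):
    """Accuracy after clamping 0, 1, ..., n_hidden randomly ordered units to zero."""
    n_hidden = w1.shape[1]
    mask = np.ones(n_hidden)
    curve = [accuracy(forward(x, w1, w2, mask=mask), y)]
    for unit in rng.permutation(n_hidden):
        mask[unit] = 0.0
        curve.append(accuracy(forward(x, w1, w2, mask=mask), y))
    return np.array(curve)

def noise_curve(x, y, w1, w2, scales, rng):
    """Accuracy under Gaussian noise whose variance is scale * per-unit variance."""
    h = np.maximum(x @ w1, 0.0)
    unit_var = h.var(axis=0)          # empirical variance of each hidden unit
    curve = []
    for s in scales:
        noise = rng.normal(size=h.shape) * np.sqrt(s * unit_var)
        curve.append(accuracy(forward(x, w1, w2, noise=noise), y))
    return np.array(curve)

# Synthetic setup (assumption): random weights, labels = unperturbed predictions.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 10))
w1 = rng.normal(size=(10, 16))
w2 = rng.normal(size=(16, 3))
y = np.argmax(forward(x, w1, w2), axis=1)

ablation = cumulative_ablation_curve(x, y, w1, w2, rng)
noisy = noise_curve(x, y, w1, w2, scales=[0.0, 0.5, 2.0], rng=rng)
print("ablation curve:", ablation)
print("noise curve:", noisy)
```

A curve that falls quickly as more units are clamped suggests heavy reliance on single directions. Note that this toy only ablates axis-aligned directions (individual units); linear combinations of units would need a projection step instead of a mask.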

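EXPERIMENTAL RESULTS

Research questions

RQ1: Does memorization, as opposed to learning generalizable structure, increase a network's reliance on single directions in activation space?
RQ2: Without a validation set, can reliance on single directions serve as a proxy for generalization, for example in early stopping or hyperparameter selection?
RQ3: How do dropout and batch normalization affect reliance on single directions and the class selectivity of units?
RQ4: Is class selectivity a reliable predictor of a unit's importance to the network's output?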
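
For RQ2, one way such a proxy could be computed is to summarize each cumulative ablation curve by its normalized area: a larger area means accuracy survives more ablation, that is, less reliance on single directions. The sketch below assumes this summary statistic. The function name `ablation_auc` is hypothetical, and the two curves are made-up illustrative values, not results from the paper.

```python
import numpy as np

def ablation_auc(curve):
    """Normalized area under an accuracy-vs-units-ablated curve (trapezoid rule).

    `curve[i]` is the accuracy after ablating i equally spaced steps of units.
    """
    curve = np.asarray(curve, dtype=float)
    return float(np.mean((curve[:-1] + curve[1:]) / 2.0))

# Made-up curves for two hypothetical checkpoints, measured on training data.
robust = [1.0, 0.95, 0.9, 0.7, 0.1]     # accuracy degrades slowly
brittle = [1.0, 0.6, 0.3, 0.15, 0.1]    # accuracy collapses early

scores = {"robust": ablation_auc(robust), "brittle": ablation_auc(brittle)}
best = max(scores, key=scores.get)      # pick the checkpoint with the larger area
print(scores, "->", best)
```

Because the curves need only training data, a score like this could in principle rank checkpoints or hyperparameter settings when no held-out set exists; whether it tracks test accuracy in a given setting still has to be checked empirically.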

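Key findings

Networks that memorize are more sensitive to cumulative ablation of single directions: their performance degrades faster than that of networks that generalize.
Networks that generalize better rely less on single directions, and this relationship holds across architectures and for networks trained with or without label corruption.
Batch normalization reduces reliance on single directions and lowers the class selectivity of individual feature maps, while increasing their mutual information.
Dropout can delay memorization, but it does not fully prevent reliance on single directions beyond the dropout rate used during training.
The class selectivity of single directions does not predict their importance to the network's output well; highly selective units are not always more influential.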
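
To make the class selectivity finding concrete, the sketch below computes a selectivity index per unit. The summary above does not state the formula, so the form used here, (mu_max - mu_rest) / (mu_max + mu_rest), with mu_max the largest class-conditional mean activation and mu_rest the mean over the remaining classes, is an assumption that should be checked against the paper. The function name `class_selectivity`, the `eps` term, and the tiny example array are also illustrative.

```python
import numpy as np

def class_selectivity(acts, labels, n_classes, eps=1e-12):
    """Per-unit selectivity index in [0, 1] for non-negative activations.

    acts:   (n_samples, n_units) activations, e.g. post-ReLU
    labels: (n_samples,) integer class labels; every class must appear at least once
    """
    class_means = np.stack(
        [acts[labels == c].mean(axis=0) for c in range(n_classes)]
    )                                            # (n_classes, n_units)
    mu_max = class_means.max(axis=0)             # mean for the preferred class
    mu_rest = (class_means.sum(axis=0) - mu_max) / (n_classes - 1)
    return (mu_max - mu_rest) / (mu_max + mu_rest + eps)

# Toy example: unit 0 fires only for class 0, unit 1 fires equally for both classes.
acts = np.array([[2.0, 1.0],
                 [2.0, 1.0],
                 [0.0, 1.0],
                 [0.0, 1.0]])
labels = np.array([0, 0, 1, 1])
sel = class_selectivity(acts, labels, n_classes=2)
print(sel)
```

An index near 1 marks a unit that responds to a single class, and an index near 0 marks a unit that responds equally to all classes. Comparing these values with the accuracy drop caused by ablating each unit is one way to test whether selectivity predicts importance.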

This review was generated by AI and reviewed by a human editor.