QUICK REVIEW

[Paper Review] Exploring the Origins and Prevalence of Texture Bias in Convolutional Neural Networks.

Katherine L. Hermann, Simon Kornblith | arXiv (Cornell University) | Nov 20, 2019

Adversarial Robustness in Machine Learning | References: 28 | Cited by: 14

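One-Sentence Summary

This paper investigates why CNNs trained on ImageNet show a strong preference for classifying by texture, even though they can learn shape-based classification when trained on data in which shape and texture conflict. It finds that data augmentation, in particular less aggressive random cropping combined with simple, naturalistic augmentations, substantially reduces texture bias, yielding models that classify by shape a majority of the time and generalize better out of distribution.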

ABSTRACT

Recent work has indicated that, unlike humans, ImageNet-trained CNNs tend to classify images by texture rather than by shape. How pervasive is this bias, and where does it come from? We find that, when trained on datasets of images with conflicting shape and texture, CNNs learn to classify by shape at least as easily as by texture. What factors, then, produce the texture bias in CNNs trained on ImageNet? Different unsupervised training objectives and different architectures have small but significant and largely independent effects on the level of texture bias. However, all objectives and architectures still lead to models that make texture-based classification decisions a majority of the time, even if shape information is decodable from their hidden representations. The effect of data augmentation is much larger. By taking less aggressive random crops at training time and applying simple, naturalistic augmentation (color distortion, noise, and blur), we train models that classify ambiguous images by shape a majority of the time, and outperform baselines on out-of-distribution test sets. Our results indicate that apparent differences in the way humans and ImageNet-trained CNNs process images may arise not primarily from differences in their internal workings, but from differences in the data that they see.

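Motivation and Objectives

Investigate the root causes of the texture bias observed in ImageNet-trained CNNs, which contrasts with the greater reliance on shape in human visual perception.
Determine whether CNNs are inherently unable to learn shape-based classification, or whether the bias stems from the training data and the training procedure.
Assess the relative influence of training objectives, network architectures, and data augmentation strategies on texture bias.
Determine whether shape information is still retained in hidden representations even when a model makes texture-based predictions.
Reduce texture bias through effective data augmentation in order to improve generalization on out-of-distribution image recognition tasks.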

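Proposed Approach

Train CNNs on datasets in which shape and texture cues conflict, to assess their ability to learn shape-based classification.
Evaluate a range of unsupervised training objectives and network architectures, to measure their independent effects on texture bias.
Apply a set of data augmentation strategies, including less aggressive random crops, color distortion, noise, and blur.
Quantify texture bias by measuring the proportion of texture-based predictions on ambiguous images, in which shape and texture indicate different classes.
Evaluate model performance on out-of-distribution test sets, to measure gains in generalization.
Analyze hidden representations, to determine whether shape information remains decodable even when predictions are texture-based.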
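
The texture-bias measurement described in the list above can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function name `texture_bias`, the record layout, and the choice to ignore predictions that match neither cue are all assumptions made here for the example. The summary only states that bias is the proportion of texture-based predictions on ambiguous images.

```python
from typing import Iterable, Tuple


def texture_bias(records: Iterable[Tuple[str, str, str]]) -> float:
    """Return the fraction of texture decisions on shape-texture conflict images.

    Each record is (predicted_label, shape_label, texture_label).
    Assumption: predictions matching neither cue are excluded from the
    denominator, so the result is texture / (shape + texture).
    """
    shape_hits = 0
    texture_hits = 0
    for predicted, shape_label, texture_label in records:
        if shape_label == texture_label:
            continue  # both cues agree, so the image is not ambiguous
        if predicted == shape_label:
            shape_hits += 1
        elif predicted == texture_label:
            texture_hits += 1
    decided = shape_hits + texture_hits
    if decided == 0:
        raise ValueError("no prediction matched either the shape or the texture label")
    return texture_hits / decided


# Hypothetical predictions on four ambiguous images.
example = [
    ("cat", "cat", "elephant"),       # shape decision
    ("elephant", "cat", "elephant"),  # texture decision
    ("clock", "bottle", "clock"),     # texture decision
    ("dog", "car", "bear"),           # matches neither cue, ignored
]
bias = texture_bias(example)
```

Under this convention the corresponding shape bias is simply `1 - bias`, so a model that classifies ambiguous images by shape a majority of the time has a texture bias below 0.5.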

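Experimental Results

Research Questions

RQ1: To what extent can CNNs learn to classify by shape when trained on data in which shape and texture cues conflict?
RQ2: To what extent do different unsupervised training objectives and network architectures affect the level of texture bias in CNNs?
RQ3: How large is the effect of data augmentation on texture bias relative to the choice of network architecture or training objective?
RQ4: Is shape information still decodable from a model's hidden representations even when the model makes texture-based predictions?
RQ5: Does reducing texture bias through data augmentation improve generalization on out-of-distribution image recognition tasks?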

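Key Findings

CNNs trained on shape-texture conflict data learn to classify by shape at least as easily as by texture, indicating that they are not inherently incompatible with shape-based learning.
Different unsupervised objectives and architectures have small but statistically significant and largely independent effects on texture bias; none of them eliminates the bias, and every resulting model still makes texture-based decisions a majority of the time.
Aggressive random cropping during training strongly promotes texture bias, whereas less aggressive cropping reduces it.
Adding naturalistic augmentation (color distortion, noise, and blur) to less aggressive crops yields models that classify ambiguous images by shape a majority of the time.
Models trained with this augmentation outperform baselines on out-of-distribution test sets, indicating improved generalization.
Even when models make texture-based predictions, shape information remains decodable from their hidden representations, so the bias does not arise from shape features being discarded.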
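
To make the augmentation findings more concrete, the sketch below shows what "less aggressive" cropping plus noise and blur could look like on a small grayscale image stored as nested lists. It is an illustration under stated assumptions, not the paper's pipeline: the helper names, the minimum area fraction of 0.64, the noise level of 0.05, the square-aspect crop, and the 3x3 box blur (standing in for whatever blur the authors used) are all chosen here for the example, and color distortion is omitted because the toy image has a single channel.

```python
import math
import random
from typing import List, Optional, Tuple

Image = List[List[float]]  # grayscale pixels in [0, 1]


def sample_crop_box(height: int, width: int, min_area_frac: float,
                    rng: random.Random) -> Tuple[int, int, int, int]:
    """Sample (top, left, crop_h, crop_w) covering at least min_area_frac
    of the image. A larger min_area_frac means a less aggressive crop."""
    if not 0.0 < min_area_frac <= 1.0:
        raise ValueError("min_area_frac must be in (0, 1]")
    area_frac = rng.uniform(min_area_frac, 1.0)
    scale = math.sqrt(area_frac)  # same scale on both axes keeps the aspect ratio
    crop_h = min(height, math.ceil(height * scale))
    crop_w = min(width, math.ceil(width * scale))
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return top, left, crop_h, crop_w


def add_gaussian_noise(image: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise to every pixel and clip to [0, 1]."""
    return [[min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
            for row in image]


def box_blur(image: Image) -> Image:
    """Blur with a 3x3 mean filter, shrinking the window at the borders."""
    h, w = len(image), len(image[0])
    blurred = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def augment(image: Image, min_area_frac: float = 0.64, noise_sigma: float = 0.05,
            rng: Optional[random.Random] = None) -> Image:
    """Apply a mild random crop, then noise, then blur."""
    rng = rng or random.Random()
    top, left, crop_h, crop_w = sample_crop_box(
        len(image), len(image[0]), min_area_frac, rng)
    cropped = [row[left:left + crop_w] for row in image[top:top + crop_h]]
    return box_blur(add_gaussian_noise(cropped, noise_sigma, rng))


# Usage on a hypothetical 12x16 image of mid-gray pixels.
toy_image = [[0.5] * 16 for _ in range(12)]
augmented = augment(toy_image, rng=random.Random(0))
```

Lowering `min_area_frac` toward zero makes the crop more aggressive: the network then sees small patches in which local texture is often the only usable cue, which is one intuitive reading of why aggressive cropping promotes texture bias.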

This review was generated by AI and checked by a human editor.