QUICK REVIEW

[Paper Review] Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study

Samuel Ritter, David G. T. Barrett|arXiv (Cornell University)|Jun 26, 2017

Explainable Artificial Intelligence (XAI) | References: 31 | Cited by: 61

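One-Sentence Summary

This paper applies probes from cognitive psychology to deep neural networks to test for a shape bias in one-shot word learning. Inception and Matching Networks both exhibit a shape bias, its magnitude varies considerably across random seeds and training stages, and the bias propagates from the input features to downstream components.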

ABSTRACT

Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one shot learning models trained on ImageNet exhibit a similar bias to that observed in humans: they prefer to categorize objects according to shape rather than color. The magnitude of this shape bias varies greatly among architecturally identical, but differently seeded models, and even fluctuates within seeds throughout training, despite nearly equivalent classification performance. These results demonstrate the capability of tools from cognitive psychology for exposing hidden computational properties of DNNs, while concurrently providing us with a computational model for human word learning.

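Research Motivation and Goals

Advance the interpretability analysis of DNNs by importing methods and hypotheses from cognitive psychology.
Test whether state-of-the-art one-shot learning models exhibit a human-like shape bias.
Examine how the shape bias varies across random seeds and over the course of training while classification accuracy stays high.
Propose the shape bias as a possible computational account of human one-shot word learning.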

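Proposed Method

Adapt the shape-bias experiments from cognitive psychology into a probe dataset for DNNs (CogPsyc), made up of triplets of shape-match, color-match, and probe images.
Evaluate an Inception Baseline (IB) one-shot classifier that applies nearest-neighbor classification to pretrained Inception features.
Perform one-shot learning with Matching Networks (MN) trained on ImageNet, which use attention-based embedding and memory modules.
Compute the shape bias B_s as the proportion of probes assigned the label of the shape match, i.e., B_s = E(δ(ŷ − y_s)), where ŷ is the predicted label and y_s is the label of the shape match.
Evaluate the bias across multiple seeds, datasets (CogPsyc and real-world data), and training stages to analyze its emergence and variability.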
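The shape-bias measurement described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each image has already been turned into an embedding vector, and it assumes cosine distance for the nearest-neighbor step. The function names `cosine_distance` and `shape_bias` and the toy embeddings are hypothetical.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance between one vector `a` and each row of matrix `b`."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return 1.0 - b @ a


def shape_bias(probes, shape_matches, color_matches):
    """Fraction of probes whose nearest neighbor is the shape match.

    Each argument is an (n_triplets, feature_dim) array of embeddings.
    Assumption: nearest neighbor is chosen by cosine distance.
    """
    n_shape = 0
    for p, s, c in zip(probes, shape_matches, color_matches):
        support = np.stack([s, c])  # row 0 = shape match, row 1 = color match
        distances = cosine_distance(p, support)
        n_shape += int(np.argmin(distances) == 0)
    return n_shape / len(probes)


# Toy, made-up embeddings purely to show the call shape (not real data).
probes = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
shape_matches = np.array([[1.0, 0.1], [1.0, 0.0], [1.0, 0.9]])
color_matches = np.array([[0.0, 1.0], [0.0, 1.0], [-1.0, 0.0]])
print(shape_bias(probes, shape_matches, color_matches))
```

A value of 0.5 would mean no preference between shape and color, and values near 1 mean the classifier almost always picks the shape match.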

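Experimental Results

Research Questions

RQ1: Do state-of-the-art DNNs trained on ImageNet exhibit a human-like shape bias in one-shot word learning tasks?
RQ2: How does the shape bias vary with the initialization seed and over the course of training?
RQ3: Is the observed bias consistent across model architectures (Inception vs. Matching Networks) and input features?
RQ4: When models are chained (IB into MN), does the shape bias propagate between model components?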

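Key Findings

The Inception Baseline shows a shape bias of B_s = 0.68 on the CogPsyc data and B_s = 0.97 on the real-world data.
Matching Networks show a shape bias of B_s = 0.7 on the CogPsyc data and B_s = 1 on the real-world data.
The shape bias differs markedly across seeds (IB at the end of training: mean B_s = 0.628, SD 0.049; real-world data: mean 0.958, SD 0.037).
The IB model develops its shape bias early in training, before convergence.
MN inherits the IB bias from its input features, and the bias shows no significant change during MN training.
The bias fluctuates strongly during IB training but much less in MN, indicating that the bias propagates between modules.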
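The seed-to-seed variability reported above is a mean and standard deviation of B_s over independently seeded models. A minimal sketch of that summary follows. The helper name `summarize_bias` and the per-seed values are placeholders invented for illustration, not figures from the paper. The use of the sample standard deviation is also an assumption, since this review does not say which estimator was used.

```python
import statistics


def summarize_bias(bias_by_seed):
    """Return (mean, sample standard deviation) of shape bias across seeds.

    `bias_by_seed` maps a seed identifier to the B_s measured for that model.
    Assumption: sample (n - 1) standard deviation.
    """
    values = list(bias_by_seed.values())
    return statistics.mean(values), statistics.stdev(values)


# Placeholder values only; substitute the B_s measured for each seeded model.
example = {0: 0.60, 1: 0.65, 2: 0.70}
mean_bias, sd_bias = summarize_bias(example)
print(f"mean B_s = {mean_bias:.3f}, SD = {sd_bias:.3f}")
```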

This review was generated by AI and reviewed by a human editor.