QUICK REVIEW

[Paper Review] Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study

Samuel Ritter, David G. T. Barrett | arXiv (Cornell University) | June 26, 2017

Explainable Artificial Intelligence (XAI) | References: 31 | Citations: 61

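One-Line Summary

This paper applies probing methods from cognitive psychology to deep neural networks to test for a shape bias in one-shot word learning. Both Inception and Matching Networks show a shape bias, its strength varies widely across seeds and during training, and the bias transfers from the input features to downstream components.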

ABSTRACT

Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one shot learning models trained on ImageNet exhibit a similar bias to that observed in humans: they prefer to categorize objects according to shape rather than color. The magnitude of this shape bias varies greatly among architecturally identical, but differently seeded models, and even fluctuates within seeds throughout training, despite nearly equivalent classification performance. These results demonstrate the capability of tools from cognitive psychology for exposing hidden computational properties of DNNs, while concurrently providing us with a computational model for human word learning.

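Motivation and Objectives

Bring methods and hypotheses from cognitive psychology to DNNs to encourage interpretable analysis.
Evaluate whether state-of-the-art one-shot learning models show a human-like shape bias.
Measure how much the shape bias varies across seeds and during training while classification accuracy stays high.
Suggest that shape bias may act as a computational mechanism that explains human one-shot word learning.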

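Proposed Method

Adapt the shape bias experiment to DNNs by building a probe dataset (CogPsyc) of image triplets: a shape-match image, a color-match image, and a probe image.
Evaluate the Inception Baseline (IB), a one-shot classifier that runs nearest-neighbor classification on pretrained Inception features.
Perform one-shot learning with Matching Networks (MN), which are trained on ImageNet and have an attention-based embedding and memory module.
Compute the shape bias B_s as the proportion of probes that are labeled according to the shape match, i.e. B_s = E(δ(ŷ − y_s)), where ŷ is the label the model assigns to the probe, y_s is the label of the shape-match image, and δ is 1 when its argument is 0 and 0 otherwise.
Evaluate the bias across multiple seeds, datasets (CogPsyc and real-world data), and training stages to analyze how it emerges and how much it varies.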
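The shape bias measurement described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each image has already been turned into a feature vector, that the nearest neighbor is chosen by cosine similarity, and that ties count as a color choice. The names `cosine_similarity`, `picks_shape_match`, `shape_bias`, and the toy vectors are all hypothetical.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]
# One probe triplet: (probe, shape_match, color_match) feature vectors.
Triplet = Tuple[Vector, Vector, Vector]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def picks_shape_match(probe: Vector, shape_match: Vector, color_match: Vector) -> bool:
    """True if the probe is strictly closer to the shape match than to the color match.

    Assumption: a tie is counted as a color choice.
    """
    return cosine_similarity(probe, shape_match) > cosine_similarity(probe, color_match)


def shape_bias(triplets: Sequence[Triplet]) -> float:
    """Fraction of probes assigned the shape match's label, i.e. an estimate of B_s."""
    if not triplets:
        raise ValueError("need at least one triplet")
    hits = sum(1 for probe, shape, color in triplets if picks_shape_match(probe, shape, color))
    return hits / len(triplets)


# Toy 2-D "features", made up for illustration only (not data from the paper).
toy_triplets = [
    ((1.0, 0.0), (0.9, 0.1), (0.0, 1.0)),   # shape match is closer
    ((1.0, 0.0), (0.0, 1.0), (1.0, 0.1)),   # color match is closer
    ((0.0, 1.0), (0.1, 1.0), (1.0, 0.0)),   # shape match is closer
    ((1.0, 1.0), (1.0, 0.9), (1.0, -1.0)),  # shape match is closer
]
print(shape_bias(toy_triplets))  # 0.75
```

In a real evaluation the vectors would come from the model under test (for example, pretrained Inception features for IB), and the triplets from the probe dataset.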

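Experimental Results

Research Questions

RQ1: Do state-of-the-art DNNs trained on ImageNet show a human-like shape bias in a one-shot word learning task?
RQ2: How does the shape bias vary with the initialization seed and over the course of training?
RQ3: Is the observed bias consistent across model architectures (Inception vs. Matching Networks) and input features?
RQ4: Does the shape bias transfer between model components when IB features are fed into MN?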

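Key Results

The Inception Baseline shows a shape bias of B_s = 0.68 on the CogPsyc data and B_s = 0.97 on the real-world data.
Matching Networks show a shape bias of B_s = 0.7 on the CogPsyc data and B_s = 1 on the real-world data.
The shape bias varies considerably across seeds (IB at the end of training: mean B_s = 0.628, standard deviation 0.049; real-world data: mean 0.958, standard deviation 0.037).
For IB models, the bias emerges early in training, before convergence.
MN inherits the IB bias from its input features and keeps it throughout training (no significant change).
The bias fluctuates strongly within IB during training but not within MN, which indicates that the bias transfers between modules.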
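The seed-level figures above are a mean and a standard deviation over per-seed B_s values. A minimal sketch of that summary follows. The per-seed values are hypothetical placeholders, not numbers from the paper, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since this review does not say which estimator was used.

```python
import statistics
from typing import Sequence, Tuple


def summarize_bias_across_seeds(bias_per_seed: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of per-seed shape bias values."""
    if len(bias_per_seed) < 2:
        raise ValueError("need at least two seeds to estimate a standard deviation")
    return statistics.mean(bias_per_seed), statistics.stdev(bias_per_seed)


# Hypothetical B_s values for four seeds, for illustration only.
hypothetical_bias_per_seed = [0.60, 0.65, 0.70, 0.55]
mean_bias, std_bias = summarize_bias_across_seeds(hypothetical_bias_per_seed)
print(f"mean B_s = {mean_bias:.3f}, std = {std_bias:.3f}")
```

Reporting the spread next to the mean matters here because the models being compared reach nearly the same classification accuracy, so accuracy alone would hide the difference between seeds.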


This review was generated by AI and checked by a human editor.