QUICK REVIEW

[Paper Review] What shapes feature representations? Exploring datasets, architectures, and training

Katherine L. Hermann, Andrew K. Lampinen|arXiv (Cornell University)|2020. 06. 22.

Domain Adaptation and Few-Shot Learning | References: 42 | Citations: 54

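One-Line Summary

This paper investigates how neural networks form feature representations by controlling the usefulness and correlation of features in synthetic data. It reports enhancement and suppression of features, reliance on easy, readily decodable features, and patterns of representational similarity across models.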

ABSTRACT

In naturalistic learning problems, a model's input contains a wide range of features, some useful for the task at hand, and others not. Of the useful features, which ones does the model use? Of the task-irrelevant features, which ones does the model represent? Answers to these questions are important for understanding the basis of models' decisions, as well as for building models that learn versatile, adaptable representations useful beyond the original training task. We study these questions using synthetic datasets in which the task-relevance of input features can be controlled directly. We find that when two features redundantly predict the labels, the model preferentially represents one, and its preference reflects what was most linearly decodable from the untrained model. Over training, task-relevant features are enhanced, and task-irrelevant features are partially suppressed. Interestingly, in some cases, an easier, weakly predictive feature can suppress a more strongly predictive, but more difficult one. Additionally, models trained to recognize both easy and hard features learn representations most similar to models that use only the easy feature. Further, easy features lead to more consistent representations across model runs than do hard features. Finally, models have greater representational similarity to an untrained model than to models trained on a different task. Our results highlight the complex processes that determine which features a model represents.

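Research Motivation and Goals

Determine how training changes the decodability of target and non-target features on controlled synthetic datasets.
Test whether models enhance or suppress features during training when multiple features predict the label.
Examine how correlations between features affect which feature a model represents and how strongly correlated non-target features are suppressed.
Assess how feature difficulty and learnability affect feature selection and representational stability.
Evaluate how representations differ across architectures, tasks, and training status (trained vs. untrained models).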

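Proposed Method

Build synthetic vision datasets in which target and non-target features (shape, texture, color) can be controlled, and train AlexNet and ResNet-50 to classify the target feature.
Train linear decoders that map layer activations to feature labels, and measure decodability before and after training.
Use the decoding analysis to assess enhancement versus suppression of target and non-target features across layers.
Build a correlated-feature dataset (Trifeature Correlated) and binary easy/hard feature datasets to study redundancy and trade-offs between features.
Apply Representational Similarity Analysis (RSA) to compare representations across models, tasks, architectures, and training regimes.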
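The linear decoding step described above can be illustrated with a small, self-contained sketch. It is not the authors' pipeline. The "layer activations" here are hypothetical toy vectors in which dimensions 0-2 carry shape and dimensions 3-5 carry texture with an adjustable strength, and the decoder is a plain softmax regression trained with SGD. The names `make_activations`, `train_linear_decoder`, and `decodability` are made up for this example.

```python
import math
import random

def make_activations(n_items, texture_strength, rng):
    """Toy stand-in for layer activations (an assumption, not real network output).
    Dims 0-2 encode shape; dims 3-5 encode texture scaled by texture_strength."""
    acts, shapes, textures = [], [], []
    for i in range(n_items):
        shape, texture = i % 3, (i // 3) % 3  # fully crossed, uncorrelated labels
        x = [rng.gauss(0.0, 0.1) for _ in range(6)]
        x[shape] += 1.0
        x[3 + texture] += texture_strength
        acts.append(x)
        shapes.append(shape)
        textures.append(texture)
    return acts, shapes, textures

def logits_for(w, b, x):
    return [sum(wi * xi for wi, xi in zip(w[c], x)) + b[c] for c in range(len(w))]

def train_linear_decoder(xs, ys, n_classes, epochs=30, lr=0.5):
    """Softmax regression trained with plain SGD; the map from x to logits is linear."""
    dim = len(xs[0])
    w = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            logits = logits_for(w, b, x)
            m = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(l - m) for l in logits]
            z = sum(exps)
            for c in range(n_classes):
                grad = exps[c] / z - (1.0 if c == y else 0.0)
                for j in range(dim):
                    w[c][j] -= lr * grad * x[j]
                b[c] -= lr * grad
    return w, b

def decodability(acts, labels, rng, n_classes=3):
    """Held-out accuracy of a linear decoder trained on two thirds of the items."""
    idx = list(range(len(acts)))
    rng.shuffle(idx)
    split = len(idx) * 2 // 3
    train, test = idx[:split], idx[split:]
    w, b = train_linear_decoder([acts[i] for i in train],
                                [labels[i] for i in train], n_classes)
    correct = 0
    for i in test:
        logits = logits_for(w, b, acts[i])
        correct += int(logits.index(max(logits)) == labels[i])
    return correct / len(test)

rng = random.Random(0)
# texture_strength=0.0 mimics a feature that is absent from the representation.
acts, shapes, textures = make_activations(180, texture_strength=0.0, rng=rng)
shape_acc = decodability(acts, shapes, rng)
texture_acc = decodability(acts, textures, rng)
print(f"shape decodability:   {shape_acc:.2f}")
print(f"texture decodability: {texture_acc:.2f}")
```

In this toy setting, decodability is just held-out decoder accuracy, so a feature that is strongly present in the activations should decode well above chance (1/3 here), while an absent one should stay near chance. Raising `texture_strength` toward 1.0 is one way to explore intermediate cases such as partial suppression.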

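Experimental Results

Research Questions

RQ1: How does training enhance target features and suppress non-target features across layers and architectures?
RQ2: When multiple features redundantly predict the label, which feature does the model preferentially represent, and why?
RQ3: Do models prefer an easy-to-learn feature over a more predictive but harder-to-learn one, and how does this affect their representations?
RQ4: How do correlations between features affect the decodability and suppression of correlated non-target features?
RQ5: How similar are representations among models trained on the same task, models trained on different tasks, and untrained models?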

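Key Results

After training, the decodability of target features increases, while non-target features are suppressed but not completely eliminated.
When two features redundantly predict the label, the model prefers one over the other, and the preference matches the ranking of decodability in the untrained model (color > shape > texture).
An easier, weakly predictive feature can suppress a more strongly predictive but harder-to-learn feature (lazy learning).
Easy features yield more consistent representations across runs, and multi-task models resemble models trained only on the easy feature.
Representational similarity is dominated by the easier feature. Models trained on the same task are more similar to each other than to models trained on a different task, and a trained model is sometimes more similar to an untrained model than to models trained on a different task.
Untrained representations capture a substantial amount of task-relevant structure and can predict feature decodability and whether a feature is likely to be used.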
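The representational similarity comparisons summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's exact procedure. Each hypothetical "model" (`toy_model`) is just a noisy one-hot encoding of the feature it was "trained" on, dissimilarity is 1 minus the Pearson correlation between item representations, and two dissimilarity matrices are compared by the Pearson correlation of their upper triangles (rank correlation is another common choice).

```python
import math
import random

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def rdm_upper(reps):
    """Upper triangle of the representational dissimilarity matrix (1 - r)."""
    n = len(reps)
    return [1.0 - pearson(reps[i], reps[j])
            for i in range(n) for j in range(i + 1, n)]

def rsa_similarity(reps_a, reps_b):
    """Similarity of two models' representational geometry over the same items."""
    return pearson(rdm_upper(reps_a), rdm_upper(reps_b))

def toy_model(labels, seed, dim=6, noise=0.1):
    """Hypothetical stand-in for a trained model: a noisy one-hot code per label."""
    rng = random.Random(seed)
    reps = []
    for label in labels:
        x = [rng.gauss(0.0, noise) for _ in range(dim)]
        x[label] += 1.0
        reps.append(x)
    return reps

# Nine items from fully crossing three shapes with three textures.
items = [(s, t) for s in range(3) for t in range(3)]
shape_labels = [s for s, _ in items]
texture_labels = [t for _, t in items]

shape_run1 = toy_model(shape_labels, seed=1)     # two "runs" on the shape task
shape_run2 = toy_model(shape_labels, seed=2)
texture_run = toy_model(texture_labels, seed=3)  # one "run" on the texture task

same_task = rsa_similarity(shape_run1, shape_run2)
cross_task = rsa_similarity(shape_run1, texture_run)
print(f"same-task similarity:  {same_task:.2f}")
print(f"cross-task similarity: {cross_task:.2f}")
```

Because RSA compares dissimilarity structure over items rather than raw activations, it does not require the two models to share dimensionality or unit alignment, which is why it suits comparisons across architectures and training runs.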


This review was generated by AI and reviewed by a human editor.