QUICK REVIEW

[Paper Review] Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)

Mariya Toneva, Leila Wehbe | arXiv (Cornell University) | May 28, 2019

Topic Modeling | References: 45 | Citations: 142

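One-line summary

This paper proposes a method for interpreting NLP models (ELMo, USE, BERT, Transformer-XL) by aligning their neural word representations with human brain activity, and shows that a modification guided by brain alignment can improve syntactic understanding. It analyzes how context length, layer depth, and attention affect brain predictivity across the models, and demonstrates that the brain-guided change transfers to NLP tasks.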

ABSTRACT

Neural networks models for NLP are typically implemented without the explicit encoding of language rules and yet they are able to break one performance record after another. This has generated a lot of research interest in interpreting the representations learned by these networks. We propose here a novel interpretation approach that relies on the only processing system we have that does understand language: the human brain. We use brain imaging recordings of subjects reading complex natural text to interpret word and sequence embeddings from 4 recent NLP models - ELMo, USE, BERT and Transformer-XL. We study how their representations differ across layer depth, context length, and attention type. Our results reveal differences in the context-related representations across these models. Further, in the transformer models, we find an interaction between layer depth and context length, and between layer depth and attention type. We finally hypothesize that altering BERT to better align with brain recordings would enable it to also better understand language. Probing the altered BERT using syntactic NLP tasks reveals that the model with increased brain-alignment outperforms the original model. Cognitive neuroscientists have already begun using NLP networks to study the brain, and this work closes the loop to allow the interaction between NLP and cognitive neuroscience to be a true cross-pollination.

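Motivation and objectives

Motivates using human brain activity recorded during naturalistic reading to interpret neural NLP representations.
Develops a data-driven method that aligns network representations with fMRI/MEG data to assess what the models encode.
Compares word representations and context-length representations across four models (ELMo, USE, BERT, T-XL) in terms of their alignment with brain activity.
Characterizes how context length, layer depth, and attention type affect how well each model predicts brain activity.
Demonstrates that a modification to BERT motivated by brain alignment can transfer to improved performance on syntactic tasks.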

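Proposed method

Extract intermediate-layer representations x_{l,k} (layer l, context length k) from the four NLP models (ELMo, BERT, USE, T-XL) for the same text and word windows.
Fit a ridge-regularized linear encoding model that predicts MEG/fMRI activity from x_{l,k}, and evaluate its prediction accuracy.
Use 4-fold cross-validation with held-out test data, and score brain predictivity through a classification task over sets of words.
Following prior work, divide the brain's language network into two groups of regions (group 1, group 2) to interpret where the representations align.
Examine the effect of context length by comparing single-word embeddings with multi-word (e.g., 10-word) representations and by analyzing layer-wise effects.
Alter BERT's attention pattern (uniform attention in a single layer), then measure the change in brain predictivity and check whether it transfers to syntactic NLP tasks.
Evaluate the modified BERT on the Marvin & Linzen syntactic tasks to test syntactic understanding without fine-tuning.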
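
The encoding-model step in the list above can be illustrated with a small sketch. It is not the authors' code: it uses synthetic data in place of model features and brain recordings, a single output channel, contiguous folds, and held-out Pearson correlation as a simpler stand-in for the word-set classification task. All function names (`ridge_fit`, `cross_validated_score`, `demo`) and the regularization value are assumptions made for illustration. It is written without dependencies; in practice a numerical library would normally be used.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_fit(X, y, lam):
    """Closed-form ridge weights: w = (X^T X + lam * I)^{-1} X^T y."""
    d = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    Xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(XtX, Xty)

def predict(X, w):
    """Linear prediction for each row of X."""
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def cross_validated_score(X, y, lam=1.0, n_folds=4):
    """Mean held-out Pearson r over contiguous folds (assumed split scheme)."""
    n = len(X)
    fold = n // n_folds
    scores = []
    for k in range(n_folds):
        lo = k * fold
        hi = (k + 1) * fold if k < n_folds - 1 else n
        w = ridge_fit(X[:lo] + X[hi:], y[:lo] + y[hi:], lam)
        scores.append(pearson(predict(X[lo:hi], w), y[lo:hi]))
    return sum(scores) / n_folds

def demo(seed=0):
    """Synthetic stand-in: 80 'words', 5 features, one noisy linear response."""
    rng = random.Random(seed)
    true_w = [1.0, -2.0, 0.5, 0.0, 3.0]
    X = [[rng.gauss(0, 1) for _ in true_w] for _ in range(80)]
    y = [sum(x * w for x, w in zip(row, true_w)) + rng.gauss(0, 0.1)
         for row in X]
    return cross_validated_score(X, y, lam=1.0, n_folds=4)

if __name__ == "__main__":
    print(f"mean held-out correlation: {demo():.3f}")
```

In this framing, a representation counts as more brain-aligned when the same procedure yields a higher held-out score for it than for another representation.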

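Experimental results

Research questions

RQ1: How do the intermediate representations of ELMo, BERT, USE, and Transformer-XL align with brain activity during naturalistic reading?
RQ2: How do layer depth, context length, and attention type affect brain predictivity across these models?
RQ3: Can a brain-alignment-motivated change to BERT improve syntactic understanding on probing tasks without any additional training?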

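Key findings

Middle transformer layers predict brain activity better than the other layers.
Unlike the other models, Transformer-XL does not lose performance as the context grows longer.
Uniform attention in shallow BERT layers improves brain predictivity for contexts of up to 25 words, whereas deep layers are harmed by this change.
Modifying BERT to remove the pretrained attention in shallow layers improves its alignment with brain data and yields better performance on syntactic probing tasks.
The long-range representations of ELMo, BERT, and T-XL predict activity in both group 1 and group 2 brain regions, whereas the long-range information in USE predicts activity in fewer group 1 regions.
Across models, middle layers best integrate context beyond 15 words; BERT's layer 1 combines token embeddings differently, which affects how context is retained.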
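
To make the uniform-attention manipulation concrete, here is a minimal single-head sketch. It is a toy illustration and not BERT's implementation: there are no learned projections, multiple heads, or masking, and the function names (`attention_weights`, `attend`) and the `uniform` flag are hypothetical. With `uniform=True`, the learned softmax weights are replaced by equal weights, so each output becomes the plain average of the value vectors.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys, uniform=False):
    """Return one row of attention weights per query.

    uniform=False: scaled dot-product attention.
    uniform=True: every key receives the same weight, 1 / number of keys.
    """
    n_keys = len(keys)
    if uniform:
        return [[1.0 / n_keys] * n_keys for _ in queries]
    scale = math.sqrt(len(keys[0]))
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys])
        for q in queries
    ]

def attend(queries, keys, values, uniform=False):
    """Weighted sum of value vectors under the chosen attention weights."""
    weights = attention_weights(queries, keys, uniform)
    dim = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim)]
        for row in weights
    ]

if __name__ == "__main__":
    q = [[1.0, 0.0], [0.0, 1.0]]
    k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print("dot-product attention:", attend(q, k, v))
    print("uniform attention:    ", attend(q, k, v, uniform=True))
```

Under uniform attention every token receives the same mixture of its context, which is one way to read the finding that this helps in shallow layers but hurts in deep ones.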


This review was generated by AI and reviewed by a human editor.