QUICK REVIEW

[Paper Review] Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration

Cédric Colas, Tristan Karch | arXiv (Cornell University) | 2020-02-21

Child and Animal Learning Development | References: 77 | Citations: 56

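One-Line Summary

This paper introduces IMAGINE, an intrinsically motivated RL architecture that uses language to imagine out-of-distribution goals and guide exploration. This is made possible by modular, object-centered representations and by social language feedback in the Playground environment.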

ABSTRACT

Developmental machine learning studies how artificial agents can model the way children learn open-ended repertoires of skills. Such agents need to create and represent goals, select which ones to pursue and learn to achieve them. Recent approaches have considered goal spaces that were either fixed and hand-defined or learned using generative models of states. This limited agents to sample goals within the distribution of known effects. We argue that the ability to imagine out-of-distribution goals is key to enable creative discoveries and open-ended learning. Children do so by leveraging the compositionality of language as a tool to imagine descriptions of outcomes they never experienced before, targeting them as goals during play. We introduce IMAGINE, an intrinsically motivated deep reinforcement learning architecture that models this ability. Such imaginative agents, like children, benefit from the guidance of a social peer who provides language descriptions. To take advantage of goal imagination, agents must be able to leverage these descriptions to interpret their imagined out-of-distribution goals. This generalization is made possible by modularity: a decomposition between learned goal-achievement reward function and policy relying on deep sets, gated attention and object-centered representations. We introduce the Playground environment and study how this form of goal imagination improves generalization and exploration over agents lacking this capacity. In addition, we identify the properties of goal imagination that enable these results and study the impacts of modularity and social interactions.

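Motivation and Objectives

Imagine goals so that autonomous agents are motivated to learn open-ended skill repertoires without external rewards.
Enable the generation of out-of-distribution goals through compositional language, promoting creative exploration.
Study how social language guidance and a modular architecture support goal interpretation and policy learning.
Provide a controlled environment (Playground) for analyzing generalization across predicates, attributes, and object categories.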

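Proposed Method

Introduce the IMAGINE architecture, equipped with a language encoder that maps natural language goals to embeddings.
Develop two internal models: a goal-achievement reward function and a goal-conditioned policy.
Use a goal generator that mixes known and imagined goals; imagination composes new goals based on a construction grammar.
Use an object-centered modular architecture, such as Deep Sets with gated attention, to obtain permutation-invariant representations.
Translate descriptions into learning signals through hindsight replay and a shared language encoder.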
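
The goal generator described above can be illustrated with a small sketch. This is a simplified, assumed heuristic and not necessarily the authors' exact procedure: two known goals that differ in exactly one word are taken as evidence that those two words are interchangeable, and new goals are composed by substituting one for the other. The function names (`find_equivalent_words`, `imagine_goals`, `sample_goal`), the `p_imagined` mixing parameter, and the toy goal strings are all hypothetical.

```python
import random
from itertools import combinations


def find_equivalent_words(known_goals):
    """Collect word pairs that look interchangeable (assumed heuristic):
    two goals of equal length that differ in exactly one position."""
    pairs = set()
    tokenized = [goal.split() for goal in known_goals]
    for a, b in combinations(tokenized, 2):
        if len(a) != len(b):
            continue
        diff = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diff) == 1:
            i = diff[0]
            pairs.add(frozenset((a[i], b[i])))
    return pairs


def imagine_goals(known_goals):
    """Compose new goals by swapping one word for an equivalent word.
    Only goals that are not already known are returned, sorted."""
    known = set(known_goals)
    pairs = find_equivalent_words(known_goals)
    imagined = set()
    for goal in known_goals:
        words = goal.split()
        for i, word in enumerate(words):
            for pair in pairs:
                if word in pair:
                    (other,) = pair - {word}
                    candidate = " ".join(words[:i] + [other] + words[i + 1:])
                    if candidate not in known:
                        imagined.add(candidate)
    return sorted(imagined)


def sample_goal(known_goals, imagined_goals, p_imagined, rng):
    """Mix known and imagined goals: pick an imagined goal with
    probability p_imagined (hypothetical parameter), else a known one."""
    if imagined_goals and rng.random() < p_imagined:
        return rng.choice(imagined_goals)
    return rng.choice(known_goals)


if __name__ == "__main__":
    # Toy goal descriptions, invented for illustration only.
    known = ["grasp red cat", "grasp blue cat", "grow red cat", "grasp red plant"]
    imagined = imagine_goals(known)
    print(imagined)
    print(sample_goal(known, imagined, p_imagined=0.5, rng=random.Random(0)))
```

On this toy input the sketch proposes "grasp blue plant", "grow blue cat", and "grow red plant", none of which appear among the known goals. Whether such goals are achievable is not checked here; in this sketch that question is left to the learned reward function.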

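Experimental Results

Research Questions

RQ1: How does language-based goal imagination affect generalization to new states and to goals described in language?
RQ2: How do imagined goals affect exploration of the environment, particularly object interactions?
RQ3: How do a modular architecture and social language feedback affect the ability to learn from imagined goals?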

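Key Findings

Goal imagination substantially improves generalization to test-set goals that were not seen during training.
Agents show behavioral adaptation, adjusting their behavior in response to imagined goals, for example by watering objects.
Imagination promotes exploration, measured as an increase in goal-directed interactions (I2C).
Modularity (object-centered Deep Sets with gated attention) is critical for exploiting imagined goals and achieves better generalization than flat architectures.
Descriptive social feedback from a partner enables effective goal imagination even under relaxed feedback conditions.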
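
The modularity finding above is easier to picture with a minimal sketch of an object-centered, permutation-invariant reward function. Everything here is an assumption made for illustration: the weights are hand-picked rather than learned, the names (`gated_attention`, `object_score`, `reward_probability`) are hypothetical, and max pooling is used as one simple permutation-invariant aggregator rather than as the paper's confirmed choice.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_attention(goal_embedding, gate_weights):
    """Map a goal embedding to one gate value in (0, 1) per object feature."""
    return [
        sigmoid(sum(w * g for w, g in zip(row, goal_embedding)))
        for row in gate_weights
    ]


def object_score(obj_features, gate, scorer_weights):
    """Gate the object's features elementwise, then apply a scorer that is
    shared across all objects (a single linear layer in this sketch)."""
    gated = [f * a for f, a in zip(obj_features, gate)]
    return sigmoid(sum(w * x for w, x in zip(scorer_weights, gated)))


def reward_probability(objects, goal_embedding, gate_weights, scorer_weights):
    """Score every object with the same module and aggregate with max,
    so the result does not depend on the order of the objects."""
    gate = gated_attention(goal_embedding, gate_weights)
    return max(object_score(o, gate, scorer_weights) for o in objects)


if __name__ == "__main__":
    # Hand-picked toy values: 2-d goal embedding, 3 features per object.
    goal = [1.0, 0.0]
    gate_w = [[4.0, 0.0], [-4.0, 0.0], [0.0, 4.0]]
    scorer_w = [2.0, 1.0, -1.0]
    objects = [[0.9, 0.1, 0.3], [0.2, 0.8, 0.5], [0.0, 0.0, 1.0]]

    p = reward_probability(objects, goal, gate_w, scorer_w)
    p_shuffled = reward_probability(list(reversed(objects)), goal, gate_w, scorer_w)
    print(p, p_shuffled)
```

Because each object passes through the same gated scorer and the per-object scores are pooled with an order-independent operation, shuffling the objects leaves the output unchanged. A flat architecture that concatenates all object features into one vector does not have this property by construction.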


This review was generated by AI and reviewed by a human editor.