QUICK REVIEW

[Paper Review] Passive learning of active causal strategies in agents and language models

Andrew K. Lampinen, Stephanie C. Y. Chan | arXiv (Cornell University) | 2023-05-25

Topic Modeling | Citations: 10

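One-Line Summary

This paper shows that agents and language models can learn generalizable strategies for causal experimentation and intervention from purely passive data, and that, particularly when explanations are provided, these strategies extend to novel causal structures and high-dimensional environments.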

ABSTRACT

What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, we show that purely passive learning can in fact allow an agent to learn generalizable strategies for determining and using causal structures, as long as the agent can intervene at test time. We formally illustrate that learning a strategy of first experimenting, then seeking goals, can allow generalization from passive learning in principle. We then show empirically that agents trained via imitation on expert data can indeed generalize at test time to infer and use causal links which are never present in the training data; these agents can also generalize experimentation strategies to novel variable sets never observed in training. We then show that strategies for causal intervention and exploitation can be generalized from passive data even in a more complex environment with high-dimensional observations, with the support of natural language explanations. Explanations can even allow passive learners to generalize out-of-distribution from perfectly-confounded training data. Finally, we show that language models, trained only on passive next-word prediction, can generalize causal intervention strategies from a few-shot prompt containing examples of experimentation, together with explanations and reasoning. These results highlight the surprising power of passive learning of active causal strategies, and may help to understand the behaviors and capabilities of language models.

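Motivation and Objectives

Show that passive, offline data can support learning generalizable active causal strategies.
Show that imitating expert data enables extrapolation to causal structures never seen in training.
Examine how explanations help passive learners learn and generalize, including out of distribution.
Evaluate whether language models trained only on passive next-word prediction can acquire causal intervention strategies from a few-shot prompt.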

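Proposed Method

Use transformer-based agents with memory to imitate expert demonstrations in simple causal DAG environments and in odd-one-out environments.
Train with behavioral cloning on expert data in which exploration first identifies the causal structure and exploitation then uses that structure to reach a goal.
Evaluate generalization to DAGs with causal links never seen in training and to high-dimensional, pixel-based observations.
Include natural language explanations as an auxiliary loss to promote learning and generalization.
Test the 70B-parameter Chinchilla language model with few-shot prompts that include explanations to evaluate generalization of causal strategies.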
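The experiment-then-exploit expert described above can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the paper's environment: the noise-free linear DAG, the `held_out` edge set, and the names `make_dag`, `intervene`, and `expert_episode` are all hypothetical. The expert intervenes on each non-goal variable once, records the effect on the goal variable, and then repeats the most useful intervention. The resulting (action, observation) pairs are the kind of passive data a behavioral-cloning agent would imitate.

```python
import random

def make_dag(n, rng, held_out=frozenset()):
    """Sample a random linear DAG as {(parent, child): weight}, with parent < child.

    Edges listed in `held_out` are never sampled, so they can be reserved
    for test-time structures (an assumption made for this illustration).
    """
    edges = {}
    for parent in range(n):
        for child in range(parent + 1, n):
            if (parent, child) in held_out:
                continue
            if rng.random() < 0.5:
                edges[(parent, child)] = rng.choice([-1.0, 1.0])
    return edges

def intervene(edges, n, var, value):
    """Fix `var` to `value` and propagate through the DAG in index order."""
    values = [0.0] * n
    for node in range(n):
        if node == var:
            values[node] = value  # the intervention overrides the node's parents
        else:
            values[node] = sum(w * values[p] for (p, c), w in edges.items() if c == node)
    return values

def expert_episode(edges, n, goal):
    """Experiment on every non-goal variable, then exploit the best one."""
    trajectory, effects = [], {}
    for var in range(n):
        if var == goal:
            continue
        obs = intervene(edges, n, var, 1.0)
        trajectory.append(((var, 1.0), obs))
        effects[var] = obs[goal]
    # Exploit: pick the variable with the largest absolute effect on the goal,
    # and choose the sign of the intervention that pushes the goal upward.
    best = max(effects, key=lambda v: abs(effects[v]))
    value = 1.0 if effects[best] >= 0 else -1.0
    trajectory.append(((best, value), intervene(edges, n, best, value)))
    return trajectory
```

Calling `expert_episode(make_dag(5, random.Random(0)), 5, goal=4)` returns four exploratory steps followed by one exploitation step. The experiments are what let the final action adapt to a structure that was never seen before, which is the intuition behind generalizing from passive data when intervention is possible at test time.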

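Experimental Results

Research Questions

RQ1: If an agent can intervene at test time, can it learn generalizable strategies for discovering and using causal structure from passive data?
RQ2: To what extent does passive imitation generalize to unseen causal links and variable sets?
RQ3: Do natural language explanations improve passive learners' generalization, including out of distribution?
RQ4: Can a pretrained language model generalize causal intervention strategies from a few-shot prompt that contains explanations?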

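Key Results

Agents trained by imitating expert data can infer and use causal links that never appeared in the training data.
Agents generalize their experimentation strategies to novel variable sets and to held-out test structures.
Explanations allow passive learners to generalize from confounded data and support learning in high-dimensional environments.
Language models can generalize causal intervention strategies from a few-shot prompt when it includes explanations and reasoning traces.
Passively trained agents and LMs benefit from explanations, which improve generalization beyond purely observational learning.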
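One way to picture how explanations enter training as an auxiliary loss is the sketch below. It is a hedged illustration rather than the paper's implementation: the function names, the averaging over explanation tokens, and the `expl_weight` value are assumptions introduced here. The agent's objective is the usual behavioral-cloning cross-entropy on the expert action, plus a weighted cross-entropy for predicting the tokens of the explanation.

```python
import math

def cross_entropy(probs, target):
    """Negative log-likelihood of `target` under a categorical distribution."""
    return -math.log(probs[target])

def imitation_loss(action_probs, expert_action,
                   expl_token_probs, expl_tokens, expl_weight=0.1):
    """Behavioral-cloning loss plus an auxiliary explanation-prediction loss.

    `expl_weight` is a hypothetical hyperparameter; the review does not
    state how the two terms are balanced.
    """
    action_loss = cross_entropy(action_probs, expert_action)
    if expl_tokens:
        # Mean token-level loss over the explanation sequence.
        expl_loss = sum(
            cross_entropy(p, t) for p, t in zip(expl_token_probs, expl_tokens)
        ) / len(expl_tokens)
    else:
        expl_loss = 0.0  # no explanation available for this step
    return action_loss + expl_weight * expl_loss
```

Because the explanation term only adds a training signal, the agent still acts from its action head alone at test time. The explanation shapes which features the shared representation attends to, which is one plausible reading of why it helps with confounded training data.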


This review was generated by AI and reviewed by a human editor.