QUICK REVIEW

[Paper Review] On the Opportunities and Risks of Foundation Models

Rishi Bommasani, Drew A. Hudson | arXiv (Cornell University) | August 16, 2021

Domain Adaptation and Few-Shot Learning | References: 1,095 | Citations: 2,137

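One-Line Summary

A comprehensive review of foundation models, which are trained on broad data and can be adapted to a wide range of tasks. It focuses on their capabilities, emergence, homogenization, ecosystem, and societal impact, and adds considerations on governance and research directions.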

ABSTRACT

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

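Research Motivation and Objectives

Defines foundation models and presents the rationale for studying them as a paradigm shift in AI.
Analyzes their capabilities across language, vision, robotics, reasoning, and interaction.
Examines the technical, data, systems, safety, evaluation, and theoretical aspects of foundation models.
Discusses applications in healthcare, law, and education, along with the broader societal impact.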

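Proposed Method

Characterizes foundation models by integrating interdisciplinary perspectives from computer science, social science, economics, and ethics.
Describes the ecosystem from data creation to deployment in order to reason about downstream effects.
Discusses emergent behavior and the implications of model homogenization for research and practice.
Offers normative guidance on governance, ethics, and collaboration between academia and industry.
Highlights gaps in understanding and proposes directions for future research and infrastructure.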

Experimental Results

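Research Questions

RQ1: What capabilities do foundation models exhibit across modalities (language, vision, robotics, reasoning, interaction)?
RQ2: How do emergence and homogenization affect the capabilities, risks, and societal impact of foundation models?
RQ3: What is the ecosystem surrounding foundation models, from data creation to deployment, and where should governance intervene?
RQ4: What norms and institutional arrangements (often requiring interdisciplinary collaboration) are needed to develop and deploy foundation models responsibly?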

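Key Findings

Foundation models exhibit emergent properties: at scale they enable in-context learning, even though they were not explicitly trained for it.
Strong homogenization of models and methodology enables broad transfer learning but gives rise to shared failure modes and biases.
The emergent capabilities and scale of these models raise social, ethical, and environmental concerns that call for careful governance.
The overall societal impact depends not only on the training stage but on the entire ecosystem: data creation, curation, training, adaptation, and deployment.
Academia and industry should collaborate, with academia contributing diverse disciplinary perspectives and long-term public-interest considerations.
As models and training data become increasingly proprietary and costly to reproduce, reproducibility and openness face major obstacles.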

This review was generated by AI and reviewed by a human editor.