QUICK REVIEW

[Paper Review] The Foundation Model Transparency Index v1.1: May 2024

Rishi Bommasani, Kevin Klyman|arXiv (Cornell University)|2024. 07. 17.

Ethics and Social Impacts of AI | Citations: 7

One-Line Summary

FMTI v1.1 scores 14 foundation model developers on 100 transparency indicators; the mean score rose 21 points over v1.0, driven by developer-submitted reports and newly disclosed information.

ABSTRACT

Foundation models are increasingly consequential yet extremely opaque. To characterize the status quo, the Foundation Model Transparency Index was launched in October 2023 to measure the transparency of leading foundation model developers. The October 2023 Index (v1.0) assessed 10 major foundation model developers (e.g. OpenAI, Google) on 100 transparency indicators (e.g. does the developer disclose the wages it pays for data labor?). At the time, developers publicly disclosed very limited information with the average score being 37 out of 100. To understand how the status quo has changed, we conduct a follow-up study (v1.1) after 6 months: we score 14 developers against the same 100 indicators. While in v1.0 we searched for publicly available information, in v1.1 developers submit reports on the 100 transparency indicators, potentially including information that was not previously public. We find that developers now score 58 out of 100 on average, a 21 point improvement over v1.0. Much of this increase is driven by developers disclosing information during the v1.1 process: on average, developers disclosed information related to 16.6 indicators that was not previously public. We observe regions of sustained (i.e. across v1.0 and v1.1) and systemic (i.e. across most or all developers) opacity such as on copyright status, data access, data labor, and downstream impact. We publish transparency reports for each developer that consolidate information disclosures: these reports are based on the information disclosed to us via developers. Our findings demonstrate that transparency can be improved in this nascent ecosystem, the Foundation Model Transparency Index likely contributes to these improvements, and policymakers should consider interventions in areas where transparency has not improved.

Motivation and Objectives

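Measure the transparency of leading foundation model developers using a fixed set of indicators spanning the full supply chain (upstream, model, downstream).
Compare v1.1 results against v1.0 to track progress over six months.
Assess how developer-submitted disclosures affect scores and identify areas of persistent opacity.
Publish developer transparency reports to support reproducibility and further research.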

Proposed Method

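Retain the 100 indicators from FMTI v1.0, organized into three domains: upstream, model, and downstream.
Request a transparency report from each of the 14 developers for its flagship model.
Have two researchers independently score each (indicator, developer) pair, reaching roughly 85% agreement, followed by iterative rebuttal rounds and final verification by the developers.
Publish a transparency report for each developer that consolidates the disclosed information.
Analyze the scores by domain and subdomain and compare them with v1.0.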
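
The independent double-scoring step can be illustrated with a short sketch. It assumes each indicator is scored as binary (1 = satisfied, 0 = not), which is consistent with totals out of 100 but is not spelled out in this review. The names `agreement_rate`, `disagreements`, `scorer_a`, and `scorer_b` are hypothetical, and the toy data is invented for illustration, not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical types: a pair is (indicator_id, developer); a score is 0 or 1.
Pair = Tuple[str, str]
Scores = Dict[Pair, int]


def agreement_rate(first: Scores, second: Scores) -> float:
    """Fraction of (indicator, developer) pairs on which two scorers agree."""
    if first.keys() != second.keys():
        raise ValueError("both scorers must score the same pairs")
    if not first:
        raise ValueError("no pairs to compare")
    matches = sum(1 for pair in first if first[pair] == second[pair])
    return matches / len(first)


def disagreements(first: Scores, second: Scores) -> List[Pair]:
    """Pairs that need reconciliation before a final score is assigned."""
    return sorted(pair for pair in first if first[pair] != second[pair])


# Toy data, invented for illustration only (not from the paper).
scorer_a: Scores = {
    ("ind_1", "dev_x"): 1, ("ind_2", "dev_x"): 0,
    ("ind_1", "dev_y"): 1, ("ind_2", "dev_y"): 1,
}
scorer_b: Scores = {
    ("ind_1", "dev_x"): 1, ("ind_2", "dev_x"): 1,
    ("ind_1", "dev_y"): 1, ("ind_2", "dev_y"): 1,
}

rate = agreement_rate(scorer_a, scorer_b)
to_reconcile = disagreements(scorer_a, scorer_b)
```

In this toy case the scorers agree on three of four pairs, and the one remaining pair would go to reconciliation. With 14 developers and 100 indicators there are 1,400 pairs, so roughly 85% agreement would leave on the order of 200 pairs to resolve.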

Figure 1 : Scores by Domain. The overall scores disaggregated into the three domains: upstream, model, and downstream.

Experimental Results

Research Questions

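RQ1: How transparent are foundation model developers in v1.1 across the upstream, model, and downstream domains?
RQ2: How much has transparency improved since v1.0, and which domains improved the most?
RQ3: Which indicators remain opaque for most developers, and how do open versus closed release strategies relate to transparency?
RQ4: Do developer-submitted transparency reports change the picture of transparency compared with relying on publicly sourced information?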

Key Results

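The mean overall score rose to 58/100 in v1.1, up from 37/100 in v1.0.
On average, developers disclosed information on 16.6 indicators that had not previously been public.
Upstream remains the most opaque domain (46%), compared with 61% for model and 65% for downstream.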
The highest-scoring subdomains include user interface, capabilities, and model basics.
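Open developers outscore closed developers overall, with a median gap of 5.5 points, driven mainly by upstream transparency.
Compute, data labor, and risks are the subdomains with notable improvement, while data access and other data-related indicators remain weak.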
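Eight of the 14 developers evaluated improved relative to v1.0 (v1.0 covered only 10 developers, so not all 14 have a baseline), and some gained substantially (e.g., AI21 Labs rose by about 50 points).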
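Information newly disclosed during the v1.1 process contributed meaningfully to the score increases, suggesting that greater transparency is achievable and that reporting can drive progress.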
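
As a sanity check on the figures above, the domain-level percentages can be combined into an implied overall score. The sketch below assumes the 100 indicators split into 32 upstream, 33 model, and 35 downstream; this split is an assumption not stated in this review, and the names `DOMAIN_SIZES`, `DOMAIN_RATES`, and `implied_overall` are hypothetical.

```python
from typing import Dict

# ASSUMPTION: number of indicators per domain (not stated in this review).
DOMAIN_SIZES: Dict[str, int] = {"upstream": 32, "model": 33, "downstream": 35}

# Mean share of indicators satisfied per domain, as reported in the key results.
DOMAIN_RATES: Dict[str, float] = {"upstream": 0.46, "model": 0.61, "downstream": 0.65}


def implied_overall(sizes: Dict[str, int], rates: Dict[str, float]) -> float:
    """Expected number of satisfied indicators, summed across domains."""
    return sum(sizes[domain] * rates[domain] for domain in sizes)


overall = implied_overall(DOMAIN_SIZES, DOMAIN_RATES)

# Improvement over v1.0, using the two reported mean scores.
improvement = 58 - 37
```

Under the assumed split, the implied overall score is about 57.6, which rounds to the reported mean of 58, and the difference between the two reported means matches the stated 21-point improvement.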

Figure 2 : Scores by Major Dimensions of Transparency. The fraction of achieved indicators in each of the 13 major dimensions of transparency. Major dimensions of transparency are large subdomains within the 23 subdomains.

This review was generated by AI and reviewed by a human editor.