QUICK REVIEW

[Paper Review] The Foundation Model Transparency Index v1.1: May 2024

Rishi Bommasani, Kevin Klyman|arXiv (Cornell University)|Jul 17, 2024

Ethics and Social Impacts of AI | Cited by 7

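One-Line Summary

FMTI v1.1 evaluates 14 foundation model developers on 100 transparency indicators and shows an average improvement of 21 points over v1.0, as developers submitted reports and disclosed new information.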

ABSTRACT

Foundation models are increasingly consequential yet extremely opaque. To characterize the status quo, the Foundation Model Transparency Index was launched in October 2023 to measure the transparency of leading foundation model developers. The October 2023 Index (v1.0) assessed 10 major foundation model developers (e.g. OpenAI, Google) on 100 transparency indicators (e.g. does the developer disclose the wages it pays for data labor?). At the time, developers publicly disclosed very limited information with the average score being 37 out of 100. To understand how the status quo has changed, we conduct a follow-up study (v1.1) after 6 months: we score 14 developers against the same 100 indicators. While in v1.0 we searched for publicly available information, in v1.1 developers submit reports on the 100 transparency indicators, potentially including information that was not previously public. We find that developers now score 58 out of 100 on average, a 21 point improvement over v1.0. Much of this increase is driven by developers disclosing information during the v1.1 process: on average, developers disclosed information related to 16.6 indicators that was not previously public. We observe regions of sustained (i.e. across v1.0 and v1.1) and systemic (i.e. across most or all developers) opacity such as on copyright status, data access, data labor, and downstream impact. We publish transparency reports for each developer that consolidate information disclosures: these reports are based on the information disclosed to us via developers. Our findings demonstrate that transparency can be improved in this nascent ecosystem, the Foundation Model Transparency Index likely contributes to these improvements, and policymakers should consider interventions in areas where transparency has not improved.

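Motivation and Objectives

Measure the transparency of major foundation model developers using a fixed set of indicators spanning the full supply chain (upstream, model, downstream).
Track progress over six months by comparing the v1.1 results with v1.0.
Assess how developer-submitted disclosures affect scores, and identify areas of persistent opacity.
Publish the developers' transparency reports to enable reproducibility and further research.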

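Proposed Method

Retain the 100 indicators of FMTI v1.0 across three domains (upstream, model, downstream).
Request transparency reports on flagship models from the developers (14 companies).
Two researchers independently score each (indicator, developer) pair, reaching roughly 85% agreement; developers then rebut the scores and perform a final validation.
Publish a transparency report for each developer that consolidates its disclosures.
Analyze the scores to characterize performance by domain and subdomain, and compare with v1.0.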
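
The scoring and aggregation steps above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes each indicator is scored as 0 or 1, and the function names, indicator IDs (`u1`, `m1`, ...), and the developer `DevX` are hypothetical placeholders with made-up values.

```python
from collections import defaultdict


def percent_agreement(scores_a, scores_b):
    """Fraction of (indicator, developer) pairs on which two raters agree."""
    if scores_a.keys() != scores_b.keys():
        raise ValueError("both raters must score the same pairs")
    if not scores_a:
        raise ValueError("no pairs to compare")
    matches = sum(1 for key in scores_a if scores_a[key] == scores_b[key])
    return matches / len(scores_a)


def domain_percentages(final_scores, indicator_domain):
    """Percentage of satisfied indicators per domain for one developer."""
    earned = defaultdict(int)
    total = defaultdict(int)
    for indicator, score in final_scores.items():
        domain = indicator_domain[indicator]
        earned[domain] += score  # assumed binary: 1 if satisfied, else 0
        total[domain] += 1
    return {domain: 100.0 * earned[domain] / total[domain] for domain in total}


# Hypothetical toy data: four indicators scored for one made-up developer.
rater_a = {("u1", "DevX"): 1, ("u2", "DevX"): 0, ("m1", "DevX"): 1, ("d1", "DevX"): 1}
rater_b = {("u1", "DevX"): 1, ("u2", "DevX"): 1, ("m1", "DevX"): 1, ("d1", "DevX"): 1}
agreement = percent_agreement(rater_a, rater_b)

indicator_domain = {"u1": "upstream", "u2": "upstream", "m1": "model", "d1": "downstream"}
final_scores = {"u1": 1, "u2": 0, "m1": 1, "d1": 1}
by_domain = domain_percentages(final_scores, indicator_domain)
```

In this toy case the raters disagree on one of four pairs, so `agreement` is 0.75. How disagreements are actually resolved before the developer rebuttal is not described in this review, so the sketch simply takes a final score per indicator as given.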

Figure 1 : Scores by Domain. The overall scores disaggregated into the three domains: upstream, model, and downstream.

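Experimental Results

Research Questions

RQ1: How transparent are foundation model developers in v1.1 across the upstream, model, and downstream domains?
RQ2: How much has transparency improved since v1.0, and which domains improved the most?
RQ3: Which indicators remain persistently opaque across most developers, and how do open versus closed release strategies relate to transparency?
RQ4: Does providing developer-submitted transparency reports change the interpretation of transparency compared with relying on public information?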

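Key Findings

The overall average rose to 58/100 in v1.1, up from 37/100 in v1.0.
On average, developers disclosed previously non-public information related to 16.6 indicators.
The upstream domain is the most opaque at 46%, compared with 65% for downstream and 61% for model.
The highest-scoring subdomains include user interface, capabilities, and model basics (downstream).
Open developers outperform closed developers overall, with a median difference of 5.5 points, driven mainly by upstream transparency.
Compute, data labor, and risks are among the subdomains showing notable improvement, while data access and data-related indicators remain weak.
Of the 14 developers evaluated, 8 improved relative to v1.0, and some, such as AI21 Labs, showed large gains (e.g., an increase of about 50 points).
The new information disclosed in v1.1 contributed to meaningful score increases, indicating that transparency is achievable and can be advanced through reporting.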
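
The headline comparisons above are simple arithmetic, sketched here for concreteness. The two averages (37 and 58) are the figures reported in this review. The per-developer scores, the developer names, and the `median_gap` helper are hypothetical, used only to show how an open-versus-closed median gap could be computed; they are not the paper's data.

```python
from statistics import median

# Averages reported in the review (out of 100).
V1_0_MEAN = 37
V1_1_MEAN = 58
improvement = V1_1_MEAN - V1_0_MEAN  # 21 points


def median_gap(scores, open_developers):
    """Median score of open developers minus median score of closed developers."""
    open_scores = [s for dev, s in scores.items() if dev in open_developers]
    closed_scores = [s for dev, s in scores.items() if dev not in open_developers]
    if not open_scores or not closed_scores:
        raise ValueError("need at least one open and one closed developer")
    return median(open_scores) - median(closed_scores)


# Hypothetical toy scores for made-up developers (NOT the paper's results).
toy_scores = {"OpenDevA": 70, "OpenDevB": 60, "ClosedDevC": 58, "ClosedDevD": 50}
toy_open = {"OpenDevA", "OpenDevB"}
gap = median_gap(toy_scores, toy_open)
```

With these toy values the open median is 65 and the closed median is 54, giving a gap of 11. The 5.5-point gap reported above would come from applying the same calculation to the actual per-developer scores.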

Figure 2 : Scores by Major Dimensions of Transparency. The fraction of achieved indicators in each of the 13 major dimensions of transparency. Major dimensions of transparency are large subdomains within the 23 subdomains.


This review was generated by AI and checked by a human editor.