QUICK REVIEW

[Paper Review] The Foundation Model Transparency Index v1.1: May 2024

Rishi Bommasani, Kevin Klyman|arXiv (Cornell University)|Jul 17, 2024

Ethics and Social Impacts of AI | Cited by 7

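One-Sentence Summary

FMTI v1.1 scores 14 foundation model developers on 100 transparency indicators and finds an average improvement of 21 points over v1.0, driven largely by developers submitting reports and disclosing information that was not previously public.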

ABSTRACT

Foundation models are increasingly consequential yet extremely opaque. To characterize the status quo, the Foundation Model Transparency Index was launched in October 2023 to measure the transparency of leading foundation model developers. The October 2023 Index (v1.0) assessed 10 major foundation model developers (e.g. OpenAI, Google) on 100 transparency indicators (e.g. does the developer disclose the wages it pays for data labor?). At the time, developers publicly disclosed very limited information with the average score being 37 out of 100. To understand how the status quo has changed, we conduct a follow-up study (v1.1) after 6 months: we score 14 developers against the same 100 indicators. While in v1.0 we searched for publicly available information, in v1.1 developers submit reports on the 100 transparency indicators, potentially including information that was not previously public. We find that developers now score 58 out of 100 on average, a 21 point improvement over v1.0. Much of this increase is driven by developers disclosing information during the v1.1 process: on average, developers disclosed information related to 16.6 indicators that was not previously public. We observe regions of sustained (i.e. across v1.0 and v1.1) and systemic (i.e. across most or all developers) opacity such as on copyright status, data access, data labor, and downstream impact. We publish transparency reports for each developer that consolidate information disclosures: these reports are based on the information disclosed to us via developers. Our findings demonstrate that transparency can be improved in this nascent ecosystem, the Foundation Model Transparency Index likely contributes to these improvements, and policymakers should consider interventions in areas where transparency has not improved.

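Motivation and Objectives

Measure the transparency of leading foundation model developers across the full supply chain (upstream, model, downstream) using a fixed set of indicators.
Compare the v1.1 results with v1.0 to track progress over six months.
Assess how developer-submitted disclosures affect scores, and identify areas where transparency remains persistently low.
Publish a transparency report for each developer to support reproducibility and further research.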

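Proposed Method

Retain the 100 indicators of FMTI v1.0 across the three domains (upstream, model, downstream).
Solicit transparency reports from developers for their flagship models (14 developers).
Have two researchers independently score each (indicator, developer) pair, reaching roughly 85% agreement, followed by iterative rounds of contestation and final validation by the developer.
Publish transparency reports that consolidate each developer's disclosures.
Analyze the scores to characterize performance by domain and subdomain, and compare against v1.0.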
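
The summary gives only the headline agreement figure for the double-scoring step, not how it was computed. As a minimal sketch, assuming binary scores (1 if an indicator is satisfied, 0 otherwise) and plain percent agreement, the check could look like the following. The function names, indicator labels, and developer names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# A scored cell is keyed by (indicator, developer); the value is 0 or 1.
Key = Tuple[str, str]


def percent_agreement(rater_a: Dict[Key, int], rater_b: Dict[Key, int]) -> float:
    """Fraction of shared (indicator, developer) pairs with identical scores."""
    shared = rater_a.keys() & rater_b.keys()
    if not shared:
        raise ValueError("no overlapping (indicator, developer) pairs")
    matches = sum(1 for key in shared if rater_a[key] == rater_b[key])
    return matches / len(shared)


def disagreements(rater_a: Dict[Key, int], rater_b: Dict[Key, int]) -> List[Key]:
    """Pairs the two raters scored differently, i.e. candidates for review."""
    shared = rater_a.keys() & rater_b.keys()
    return sorted(key for key in shared if rater_a[key] != rater_b[key])


# Toy data (hypothetical): two indicators scored for two developers.
scores_a = {
    ("data_labor_wages", "DevA"): 0,
    ("model_size", "DevA"): 1,
    ("data_labor_wages", "DevB"): 1,
    ("model_size", "DevB"): 1,
}
scores_b = {
    ("data_labor_wages", "DevA"): 0,
    ("model_size", "DevA"): 1,
    ("data_labor_wages", "DevB"): 0,
    ("model_size", "DevB"): 1,
}

agreement = percent_agreement(scores_a, scores_b)
to_review = disagreements(scores_a, scores_b)
print(f"agreement: {agreement:.0%}, pairs to reconcile: {to_review}")
```

Percent agreement is the simplest choice here; a chance-corrected statistic such as Cohen's kappa would be a reasonable alternative, but nothing in the summary indicates which measure the authors used.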

Figure 1: Scores by Domain. The overall scores disaggregated into the three domains: upstream, model, and downstream.

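Experimental Results

Research Questions

RQ1: In v1.1, how transparent are foundation model developers in the upstream, model, and downstream domains?
RQ2: How much has transparency improved since v1.0, and which areas improved the most?
RQ3: Which indicators remain persistently opaque across most developers, and how do open versus closed release strategies relate to transparency?
RQ4: Does the availability of developer-submitted transparency reports change the picture of transparency compared with relying on public information alone?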

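Key Findings

The overall average score rose from 37/100 in v1.0 to 58/100 in v1.1.
On average, developers disclosed information related to 16.6 indicators that was not previously public.
The upstream domain remains the least transparent (46%), compared with 65% for downstream and 61% for model.
The highest-scoring subdomains include user interface (downstream), capabilities, and model basics.
Open developers outperform closed developers overall, with a median gap of 5.5 points, driven mainly by upstream transparency.
Compute, data labor, and risks are among the subdomains that improved notably, while data access and other data-related indicators remain weak.
Eight of the fourteen developers assessed improved relative to v1.0, and some made large gains (for example, AI21 Labs improved by about 50 points).
New information disclosed during the v1.1 process produced meaningful score gains, suggesting that transparency is achievable and that reporting can drive progress.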
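
The figures above are aggregates over binary indicator scores. As a rough sketch of that arithmetic, assuming each indicator is scored 0 or 1 and belongs to exactly one domain, the per-domain percentages and an open-versus-closed median gap could be computed as below. All indicator names, developer names, and values are hypothetical toy data, not the paper's results.

```python
from statistics import median
from typing import Dict


def domain_percentages(scores: Dict[str, int], domain_of: Dict[str, str]) -> Dict[str, float]:
    """Percent of satisfied indicators in each domain for one developer."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for indicator, value in scores.items():
        domain = domain_of[indicator]
        totals[domain] = totals.get(domain, 0) + 1
        hits[domain] = hits.get(domain, 0) + value
    return {domain: 100 * hits[domain] / totals[domain] for domain in totals}


def median_gap(overall: Dict[str, int], is_open: Dict[str, bool]) -> float:
    """Median overall score of open developers minus that of closed developers."""
    open_scores = [s for dev, s in overall.items() if is_open[dev]]
    closed_scores = [s for dev, s in overall.items() if not is_open[dev]]
    return median(open_scores) - median(closed_scores)


# Toy indicator-to-domain mapping (hypothetical; the real index has 100 indicators).
domain_of = {
    "u1": "upstream", "u2": "upstream",
    "m1": "model", "m2": "model",
    "d1": "downstream", "d2": "downstream",
}

# Toy binary scores for a single hypothetical developer.
dev_scores = {"u1": 1, "u2": 0, "m1": 1, "m2": 1, "d1": 1, "d2": 0}
by_domain = domain_percentages(dev_scores, domain_of)
total = sum(dev_scores.values())

# Toy overall scores and release strategies for four hypothetical developers.
overall = {"OpenDev1": 4, "OpenDev2": 5, "ClosedDev1": 3, "ClosedDev2": 4}
is_open = {"OpenDev1": True, "OpenDev2": True, "ClosedDev1": False, "ClosedDev2": False}
gap = median_gap(overall, is_open)

print(by_domain, total, gap)
```

With 100 binary indicators, a developer's overall score is simply the count of satisfied indicators, which is why the reported averages (37 and 58) read directly as scores out of 100 and their difference as a 21-point gain.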

Figure 2: Scores by Major Dimensions of Transparency. The fraction of achieved indicators in each of the 13 major dimensions of transparency. Major dimensions of transparency are large subdomains within the 23 subdomains.

This review was generated by AI and reviewed by human editors.