QUICK REVIEW

[Paper Review] Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

Sarah Shoker, Andrew W. Reddie | arXiv (Cornell University) | Aug 1, 2023

Scientific Computing and Data Management | Citations: 13

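One-Line Summary

The workshop identifies practical confidence-building measures for mitigating security risks from foundation models, emphasizing multistakeholder involvement and adaptable, non-binding actions.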

ABSTRACT

Foundation models could eventually introduce several pathways for undermining state security: accidents, inadvertent escalation, unintentional conflict, the proliferation of weapons, and the interference with human diplomacy are just a few on a long list. The Confidence-Building Measures for Artificial Intelligence workshop hosted by the Geopolitics Team at OpenAI and the Berkeley Risk and Security Lab at the University of California brought together a multistakeholder group to think through the tools and strategies to mitigate the potential risks introduced by foundation models to international security. Originating in the Cold War, confidence-building measures (CBMs) are actions that reduce hostility, prevent conflict escalation, and improve trust between parties. The flexibility of CBMs make them a key instrument for navigating the rapid changes in the foundation model landscape. Participants identified the following CBMs that directly apply to foundation models and which are further explained in this conference proceedings: 1. crisis hotlines 2. incident sharing 3. model, transparency, and system cards 4. content provenance and watermarks 5. collaborative red teaming and table-top exercises and 6. dataset and evaluation sharing. Because most foundation model developers are non-government entities, many CBMs will need to involve a wider stakeholder community. These measures can be implemented either by AI labs or by relevant government actors.

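Motivation and Objectives

Motivate the need for confidence-building measures (CBMs) to prevent misperception and escalation in the foundation model era.
Identify a set of actionable CBMs for AI systems that can be applied across stakeholders, including labs, governments, and civil society.
Explain how CBMs can help manage rapid AI innovation alongside formal regulation.
Highlight the political and technical constraints that affect the success and adoption of CBMs.
Propose pathways for integrating CBMs into existing AI governance frameworks.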

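Proposed Approach

Identify the CBMs that apply to foundation models: crisis hotlines, incident sharing, model/system cards, content provenance and watermarks, collaborative red teaming, table-top exercises, and dataset/evaluation sharing.
Organize the CBMs into four categories: communication and coordination, observation and verification, cooperation and integration, and transparency.
Discuss the role of non-government actors and multistakeholder involvement in implementing CBMs.
Present examples and caveats drawn from historical and contemporary international security contexts.
Assess limitations, including verification challenges, incentive alignment, and the need for an adaptable, build-as-you-go approach that keeps pace with changing AI capabilities, along with ongoing red teaming and governance alignment.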

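Results

Research Questions

RQ1: Which CBMs are most applicable to mitigating the international security risks arising from foundation models?
RQ2: Given that many AI developers are non-government actors and that multistakeholder participation is required, how can CBMs be implemented?
RQ3: What political and technical constraints affect the feasibility and effectiveness of CBMs?
RQ4: How can CBMs complement existing international regulatory discussions and frameworks?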

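Key Findings

The CBMs identified as applicable to foundation models include crisis hotlines, incident sharing, model/transparency/system cards, content provenance and watermarks, collaborative red teaming, table-top exercises, and dataset/evaluation sharing.
The CBMs fall into four categories (communication/coordination, observation/verification, cooperation/integration, and transparency) and are designed to reduce misperception and escalation.
Many CBMs are voluntary and can be implemented by AI labs or government actors; because most developers are non-government entities, implementation will often involve a wider set of stakeholders.
CBMs face political and technical constraints, including the difficulty of verification, incentive alignment, and the need for a flexible "build-as-you-go" approach that keeps pace with evolving AI capabilities.
The proposed CBMs complement formal regulatory efforts rather than replace them, and may serve as a bridge in low-trust international environments.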


This review was generated by AI and checked by a human editor.