QUICK REVIEW

[Paper Review] An Economic Solution to Copyright Challenges of Generative AI

Jiachen T. Wang, Zhun Deng|arXiv (Cornell University)|Apr 22, 2024

Law, AI, and Intellectual Property|Citations: 5

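One-Sentence Summary

This paper proposes a Shapley-value-based royalty mechanism that fairly compensates copyright owners for the data used to train generative AI, without modifying model inference. It demonstrates that the proposed method attributes contributions accurately and yields an interpretable revenue distribution.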

ABSTRACT

Generative artificial intelligence (AI) systems are trained on large data corpora to generate new pieces of text, images, videos, and other media. There is growing concern that such systems may infringe on the copyright interests of training data contributors. To address the copyright challenges of generative AI, we propose a framework that compensates copyright owners proportionally to their contributions to the creation of AI-generated content. The metric for contributions is quantitatively determined by leveraging the probabilistic nature of modern generative AI models and using techniques from cooperative game theory in economics. This framework enables a platform where AI developers benefit from access to high-quality training data, thus improving model performance. Meanwhile, copyright owners receive fair compensation, driving the continued provision of relevant data for generative model training. Experiments demonstrate that our framework successfully identifies the most relevant data sources used in artwork generation, ensuring a fair and interpretable distribution of revenues among copyright owners.

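Motivation and Objectives

Motivate the need for a fair, data-driven copyright solution for generative AI that preserves model performance.
Propose an economic framework that compensates copyright holders in proportion to the contribution of their training data.
Develop the Shapley-value-based royalty share (SRS) methodology for distributing revenue among data owners.
Demonstrate the framework with experiments on art and logo datasets to show that the allocations are interpretable and fair.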

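Proposed Method

Define data utility as the log-likelihood of the generated content under a model trained on a subset of the data.
Allocate the total utility among copyright owners using the Shapley value (SRS).
When a Shapley value is negative, treat it as 0 in both the numerator and the denominator of the share formula.
Examine computational methods for estimating Shapley values (Monte Carlo, permutation sampling, fine-tuning approximation).
As a practical adjustment to revenue sharing, introduce a data/developer split beta_data and a permission-structure framework that models developer involvement.
Provide an empirical evaluation on WikiArt and FlickrLogo-27 to illustrate copyright attribution and royalty distribution.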
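
The steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the owner names, the log-likelihood table TOY_LOGLIK, and the helper names are hypothetical, and in practice each utility value would come from a model trained (or fine-tuned) on that subset of data. The sketch also assumes that beta_data is the fraction of revenue routed to data owners, and that nothing is paid out when no Shapley value is positive.

```python
from itertools import combinations
from math import factorial

# Hypothetical log-likelihoods of one generated image under models trained
# on each subset of three owners' data (values are made up for illustration).
OWNERS = ["A", "B", "C"]
TOY_LOGLIK = {
    frozenset(""): -10.0,
    frozenset("A"): -6.0, frozenset("B"): -9.0, frozenset("C"): -11.0,
    frozenset("AB"): -5.0, frozenset("AC"): -7.0, frozenset("BC"): -10.0,
    frozenset("ABC"): -6.0,
}

def toy_utility(coalition):
    """Utility of a coalition = log-likelihood of the generated content."""
    return TOY_LOGLIK[frozenset(coalition)]

def shapley_values(players, utility):
    """Exact Shapley values by enumerating every coalition (small n only)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (utility(s | {p}) - utility(s))
        phi[p] = total
    return phi

def royalty_shares(phi):
    """Clip negative Shapley values to 0, then normalize into shares."""
    clipped = {p: max(v, 0.0) for p, v in phi.items()}
    denom = sum(clipped.values())
    if denom == 0:
        # Assumption: no royalty is paid if no owner has a positive value.
        return {p: 0.0 for p in phi}
    return {p: v / denom for p, v in clipped.items()}

def split_revenue(revenue, shares, beta_data):
    """Assumption: beta_data of the revenue goes to data owners, rest to developer."""
    payouts = {p: revenue * beta_data * s for p, s in shares.items()}
    developer_cut = revenue - sum(payouts.values())
    return payouts, developer_cut

phi = shapley_values(OWNERS, toy_utility)
shares = royalty_shares(phi)
payouts, developer_cut = split_revenue(100.0, shares, beta_data=0.5)
print(phi, shares, payouts, developer_cut)
```

In this toy table owner C lowers the log-likelihood, so its Shapley value is negative and its share is clipped to 0, while A and B receive roughly 0.8 and 0.2 of the data owners' portion. Exact enumeration grows exponentially with the number of owners, which is why sampling-based estimates matter in practice.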

Figure 1 : Overview of our method. (a) The artists provide their copyrighted artworks as (part of) the training data for the generative AI model. (b) A user prompts the generative AI and obtains a new artwork. (c) We assess the contribution of each artist to the AI-generated artwork using the Shaple

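Experimental Results

Research Questions

RQ1: Can the Shapley-value-based framework fairly attribute the contribution of each copyright owner's data to a generated AI artifact?
RQ2: Does the proposed SRS mechanism produce interpretable and fair royalty distributions across different data combinations and prompts?
RQ3: How do interactions between data sources and possible data duplication affect attribution under the Shapley framework?
RQ4: Can the approach scale to practical, large-scale AI transaction settings using approximations such as Monte Carlo?
RQ5: How should the developer's and the data owners' shares be balanced in a revenue-sharing agreement?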
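
To make the scaling question in RQ4 concrete, here is a minimal sketch of a permutation-sampling (Monte Carlo) Shapley estimate. It is an illustration under stated assumptions, not the paper's code: toy_utility and its interaction term are hypothetical stand-ins for the log-likelihood of generated content, and in a real setting each utility call would require training or fine-tuning a model on the coalition's data, which dominates the cost.

```python
import random

OWNERS = ["A", "B", "C"]

def toy_utility(coalition):
    """Hypothetical utility: A and B help, and help more together; C is irrelevant."""
    value = -10.0
    if "A" in coalition:
        value += 4.0
    if "B" in coalition:
        value += 1.0
    if "A" in coalition and "B" in coalition:
        value += 2.0  # interaction between the two data sources
    return value

def mc_shapley(players, utility, num_permutations=2000, seed=0):
    """Estimate Shapley values by averaging marginal gains over random orderings."""
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(num_permutations):
        rng.shuffle(order)
        coalition = frozenset()
        prev = utility(coalition)
        for p in order:
            # Add owners one at a time and credit each with its marginal gain.
            coalition = coalition | {p}
            cur = utility(coalition)
            phi[p] += cur - prev
            prev = cur
    return {p: v / num_permutations for p, v in phi.items()}

estimates = mc_shapley(OWNERS, toy_utility)
print(estimates)
```

Each sampled permutation costs one utility evaluation per owner, so the total cost is linear in the number of permutations rather than exponential in the number of owners. For this toy utility the exact values are 5, 2, and 0 for A, B, and C, and the estimates approach them as num_permutations grows.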

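Key Findings

SRS values peak when the generated content closely resembles a particular copyright owner's data, indicating effective attribution.
The framework assigns higher royalties to data sources with greater influence across the coalition, and produces nearly uniform shares when non-copyrighted data is used.
Experiments with style-mixing prompts show the framework's ability to reward multiple contributing sources.
The ranking of data-source contributions under SRS matches the expected relevance (e.g., the ordering of CIFAR-100 style similarity).
Fairness and interpretability are maintained without requiring changes to the inference procedure.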

Figure 2 : Evaluation of the SRS using the WikiArt (upper) and FlickrLogo-27 datasets (lower): Each row displays example target images ( $x^{(\text{gen})}$ ’s) for which the SRS is assessed. Left: The heatmap of the SRS of copyright owners in producing the original paintings from different artists (


This review was generated by AI and checked by a human editor.