QUICK REVIEW

[Paper Review] WildChat: 1M ChatGPT Interaction Logs in the Wild

Wenting Zhao, Xiang Ren|arXiv (Cornell University)|May 2, 2024

Privacy-Preserving Technologies in Data|Citations: 8

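One-Line Summary

WildChat releases 1 million real user-ChatGPT conversations (2.5 million turns) together with demographic and request-header data, analyzes toxicity and multilingual usage, and demonstrates the dataset's usefulness for instruction-tuning open-source models.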

ABSTRACT

Chatbots such as GPT-4 and ChatGPT are now serving millions of users. Despite their widespread use, there remains a lack of public datasets showcasing how these tools are used by a population of users in practice. To bridge this gap, we offered free access to ChatGPT for online users in exchange for their affirmative, consensual opt-in to anonymously collect their chat transcripts and request headers. From this, we compiled WildChat, a corpus of 1 million user-ChatGPT conversations, which consists of over 2.5 million interaction turns. We compare WildChat with other popular user-chatbot interaction datasets, and find that our dataset offers the most diverse user prompts, contains the largest number of languages, and presents the richest variety of potentially toxic use-cases for researchers to study. In addition to timestamped chat transcripts, we enrich the dataset with demographic data, including state, country, and hashed IP addresses, alongside request headers. This augmentation allows for more detailed analysis of user behaviors across different geographical regions and temporal dimensions. Finally, because it captures a broad range of use cases, we demonstrate the dataset's potential utility in fine-tuning instruction-following models. WildChat is released at https://wildchat.allen.ai under AI2 ImpACT Licenses.

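Motivation and Objectives

Close the access gap for instruction-following data by providing a large, real-world, multilingual dataset.
Characterize real-world usage patterns, demographics, and toxicity in user-ChatGPT interactions.
Evaluate how useful the WildChat dataset is for fine-tuning open-source instruction-following models.
Provide baseline analyses and ethical considerations for releasing this kind of data.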

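Proposed Method

Deploy two publicly accessible chat services (GPT-3.5-Turbo and GPT-4) on Hugging Face Spaces and collect transcripts with user consent.
Preprocess the logged turns and link them into full conversations using content, IP addresses, and request headers, allowing relaxed IP matching.
Anonymize PII with Presidio and spaCy NER, hash IP addresses, and map IPs to geographic entities with GeoLite2.
Annotate the data with language and prompt-category classifications (e.g., English prompts, major languages, task categories).
Assess toxicity with Detoxify and the OpenAI Moderation API, and analyze jailbreaking prompts.
Train WildLlama on WildChat and evaluate it with MT-bench and an LLM judge.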
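
The linking and IP-hashing steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the request schema (`ip`, `user_agent`, `history`, `prompt`, `response`), the salted SHA-256 hash, the salt value, and the three-octet IPv4 prefix used as the "relaxed" match are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


def hash_ip(ip: str, salt: str = "example-salt") -> str:
    """Salted SHA-256 digest of an IP (hash scheme and salt are assumptions)."""
    return hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()


def ip_prefix(ip: str, octets: int = 3) -> str:
    """Relaxed IPv4 matching key: keep only the first `octets` octets."""
    return ".".join(ip.split(".")[:octets])


@dataclass
class Conversation:
    ip_key: str          # raw prefix, used only while linking
    user_agent: str      # stand-in for the request headers
    hashed_ips: set = field(default_factory=set)
    messages: list = field(default_factory=list)


def link_turns(requests):
    """Group logged requests into conversations.

    A request extends an existing conversation when its history equals the
    messages stored so far and the relaxed IP key and user agent both match.
    Otherwise it starts a new conversation seeded with its own history.
    """
    conversations = []
    for req in requests:
        key = ip_prefix(req["ip"])
        target = None
        for conv in conversations:
            if (conv.ip_key == key
                    and conv.user_agent == req["user_agent"]
                    and conv.messages == req["history"]):
                target = conv
                break
        if target is None:
            target = Conversation(ip_key=key, user_agent=req["user_agent"],
                                  messages=list(req["history"]))
            conversations.append(target)
        target.hashed_ips.add(hash_ip(req["ip"]))
        target.messages.extend([req["prompt"], req["response"]])
    return conversations
```

The linear scan over open conversations keeps the sketch short; at the scale of a million conversations an index keyed on the IP prefix and a hash of the history would be the more practical choice.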

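Experimental Results

Research Questions

RQ1: What real-world multilingual usage patterns and demographics does WildChat capture?
RQ2: How toxic are user turns and chatbot turns in real-world conversations, and how well do multiple detectors agree?
RQ3: Can WildChat be used to fine-tune open-source instruction-following models (e.g., WildLlama), and how do such models perform on standard benchmarks?

Key Findings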

Dataset	#Convs	#Users	#Turns	#User Tok	#Chatbot Tok	#Langs
Alpaca	52,002	-	1.00	19.67 ±15.19	64.51 ±64.85	1
Open Assistant	46,283	13,500	2.34	33.41 ±69.89	211.76 ±246.71	11
Dolly	15,011	-	1.00	110.25 ±261.14	91.14 ±149.15	1
ShareGPT	94,145	-	3.51	94.46 ±626.39	348.45 ±269.93	41
LMSYS-Chat-1M	1,000,000	210,479	2.02	69.83 ±143.49	215.71 ±1858.09	65
WildChat	1,039,785	204,736	2.54	295.58 ±1609.18	441.34 ±410.91	68
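
As a quick consistency check, the WildChat row can be tied to the total turn count this review reports for the dataset (2,639,415). The snippet below only redoes that arithmetic; the conversations-per-IP ratio is a derived figure that does not appear in the review itself.

```python
# Figures quoted in this review for WildChat.
conversations = 1_039_785
turns = 2_639_415
unique_ips = 204_736

# Average turns per conversation; should match the #Turns column.
turns_per_conversation = turns / conversations

# Derived figure: average conversations per unique IP address.
conversations_per_ip = conversations / unique_ips

print(f"{turns_per_conversation:.2f}")  # 2.54
print(f"{conversations_per_ip:.2f}")    # 5.08
```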

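WildChat contains 1,039,785 conversations (2,639,415 turns) collected from 204,736 unique IP addresses; about 24% of the usage is GPT-4 and about 76% is GPT-3.5-Turbo.
The dataset spans 68 languages, with English accounting for 53% of all turns. The major languages include English, Chinese, and Russian.
Toxic turns are prevalent: 10.46% of user turns and 6.58% of chatbot turns are flagged by either Detoxify or Moderation, while only 3.73% are flagged by both.
Among flagged user turns, sexual toxicity dominates (88.51% under the Moderation categories).
WildChat's language diversity and real-user prompts give broad data coverage for fine-tuning. WildLlama, trained on WildChat, outperforms some open-source baselines on MT-bench but falls short of the proprietary GPT-3.5 and GPT-4.
The jailbreaking analysis finds that prominent prompts such as JailMommy achieve high success rates, suggesting that defenses need to keep evolving.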
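
The "either" and "both" toxicity rates above are the union and intersection of two detectors' flags. A minimal sketch of that computation follows, assuming each detector's output has already been reduced to one boolean per turn; the flag lists are made-up toy data, and no Detoxify or Moderation API call is shown.

```python
def flag_rates(detoxify_flags, moderation_flags):
    """Return (share flagged by either detector, share flagged by both)."""
    if len(detoxify_flags) != len(moderation_flags):
        raise ValueError("flag lists must be the same length")
    total = len(detoxify_flags)
    if total == 0:
        return 0.0, 0.0
    pairs = list(zip(detoxify_flags, moderation_flags))
    either = sum(d or m for d, m in pairs)   # union of the two detectors
    both = sum(d and m for d, m in pairs)    # intersection of the two
    return either / total, both / total


# Toy example: five turns, hypothetical per-turn flags.
detoxify_flags = [True, False, True, False, False]
moderation_flags = [True, True, False, False, False]
either_rate, both_rate = flag_rates(detoxify_flags, moderation_flags)
print(f"either: {either_rate:.0%}, both: {both_rate:.0%}")  # either: 60%, both: 20%
```

A large gap between the two rates, as in the figures reported above, indicates that the detectors often disagree about which turns are toxic.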


This review was generated by AI and checked by a human editor.