QUICK REVIEW

[Paper Review] On Hate Scaling Laws For Data-Swamps

Abeba Birhane, Vinay Uday Prabhu|arXiv (Cornell University)|Jun 22, 2023

Hate Speech and Cyberbullying Detection | Cited by 10

One-line summary

This paper audits LAION-400M and LAION-2B-en to assess how dataset scale affects hateful content, and evaluates downstream bias in vision-language models trained on these datasets, using the Chicago Face Dataset as a probe.

ABSTRACT

`Scale the model, scale the data, scale the GPU-farms' is the reigning sentiment in the world of generative AI today. While model scaling has been extensively studied, data scaling and its downstream impacts remain under explored. This is especially of critical importance in the context of visio-linguistic datasets whose main source is the World Wide Web, condensed and packaged as the CommonCrawl dump. This large scale data-dump, which is known to have numerous drawbacks, is repeatedly mined and serves as the data-motherlode for large generative models. In this paper, we: 1) investigate the effect of scaling datasets on hateful content through a comparative audit of the LAION-400M and LAION-2B-en, containing 400 million and 2 billion samples respectively, and 2) evaluate the downstream impact of scale on visio-linguistic models trained on these dataset variants by measuring racial bias of the models trained on them using the Chicago Face Dataset (CFD) as a probe. Our results show that 1) the presence of hateful content in datasets, when measured with a Hate Content Rate (HCR) metric on the inferences of the Pysentimiento hate-detection Natural Language Processing (NLP) model, increased by nearly $12\%$ and 2) societal biases and negative stereotypes were also exacerbated with scale on the models we evaluated. As scale increased, the tendency of the model to associate images of human faces with the `human being' class over 7 other offensive classes reduced by half. Furthermore, for the Black female category, the tendency of the model to associate their faces with the `criminal' class doubled, while quintupling for Black male faces. We present a qualitative and historical analysis of the model audit results, reflect on our findings and its implications for dataset curation practice, and close with a summary of our findings and potential future work to be done in this area.

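Motivation and objectives

Assess how scaling from 400M to 2B samples affects hateful content in image-text pairs.
Measure downstream racial bias in vision-language models trained on the two dataset variants.
Draw out implications for dataset curation and offer recommendations for fairer data practice.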

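Proposed method

Subsample 0.1 million rows from each of the 160 LAION shards and extract the alt-text descriptions for hate-content analysis.
Apply Pysentimiento, a state-of-the-art NLP model, to compute the Hate Content Rate (HCR) across the hateful, targeted, and aggressive categories.
Define and compute the Any-of-the-three HCR as a quality indicator.
Run a binomial proportion confidence interval analysis (Wilson score, 95% CI) to compare the 400M and 2B-en datasets.
Use Welch's two-sample t-test to check whether the file-wise HCR is higher for 2B-en than for 400M.
Audit downstream bias through zero-shot image classification on the Chicago Face Dataset (CFD), using OpenCLIP with the architecture held fixed.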
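The HCR steps above can be illustrated with a minimal Python sketch. It assumes each caption already has three scores in [0, 1] (in the audit these would come from Pysentimiento), and it assumes HCR is the percentage of captions whose score exceeds a threshold. The `Scores` container, the `hate_content_rates` helper, the threshold of 0.5, and the sample scores are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Per-caption scores for the three categories (hypothetical container)."""
    hateful: float
    targeted: float
    aggressive: float


def hate_content_rates(scores, threshold=0.5):
    """Return the HCR in percent per category, plus the any-of-the-three rate.

    Assumption: a caption counts toward a category when its score is
    strictly above `threshold`; the threshold value here is illustrative.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one scored caption")
    counts = {"hateful": 0, "targeted": 0, "aggressive": 0, "any_of_three": 0}
    for s in scores:
        flags = {
            "hateful": s.hateful > threshold,
            "targeted": s.targeted > threshold,
            "aggressive": s.aggressive > threshold,
        }
        for name, flagged in flags.items():
            counts[name] += flagged  # bool adds as 0 or 1
        # A caption is counted once if any of the three flags is set.
        counts["any_of_three"] += any(flags.values())
    return {name: 100.0 * c / n for name, c in counts.items()}


# Made-up scores for four captions, not real model output.
sample = [
    Scores(0.02, 0.01, 0.01),
    Scores(0.91, 0.10, 0.05),
    Scores(0.10, 0.03, 0.77),
    Scores(0.03, 0.02, 0.02),
]
rates = hate_content_rates(sample, threshold=0.5)
```

Sweeping `threshold` over a grid and plotting each rate would give curves of the kind the review describes, under the same assumptions.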

Figure 1: Experimentation details: Dataset sub-sampling, inference using Pysentimiento and thresholding for estimating Hate Content Rate (HCR).

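Experimental results

Research questions

RQ1: Does scaling LAION-400M up to LAION-2B-en increase hateful, targeted, and aggressive content in the alt-text?
RQ2: How does dataset scale affect downstream bias in vision-language models trained on these datasets?
RQ3: Are file-level HCR statistics representative of the global HCR statistics across shards?
RQ4: Do larger datasets worsen dehumanization bias against Black faces in model outputs?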

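Key findings

Hateful content in the alt-text increases with scale: the Any-of-the-three HCR is higher for LAION-2B-en than for LAION-400M, with hateful content reaching up to 0.7% versus 0.6%.
The binomial CI analysis shows that the lower bound on the increase in HCR from 400M to 2B-en is 12.26%.
File-wise HCR is statistically significantly higher for 2B-en than for 400M in all three categories: hateful, targeted, and aggressive (Welch's t-test, p < 1e-4).
The downstream bias audit shows that training on the larger dataset increases dehumanizing associations for Black faces: on the CFD probe, top-1 predictions of the 'human being' class decrease and associations with the 'criminal' class increase.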
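The downstream audit relies on zero-shot classification, which can be sketched in plain Python as a softmax over scaled cosine similarities between an image embedding and one text embedding per class. This is a toy illustration, not the audit code: the three-dimensional embeddings, the `scale` value of 100, the placeholder third class, and the helper names are all assumptions. In the audit, the embeddings would come from the OpenCLIP image and text encoders.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def zero_shot_probs(image_emb, text_embs, scale=100.0):
    """Softmax over scaled cosine similarities of one image against each class prompt."""
    img = normalize(image_emb)
    logits = []
    for t in text_embs:
        t = normalize(t)
        logits.append(scale * sum(a * b for a, b in zip(img, t)))
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top1_rate(image_embs, text_embs, class_names, target):
    """Fraction of images whose highest-probability class is `target`."""
    idx = class_names.index(target)
    hits = 0
    for emb in image_embs:
        probs = zero_shot_probs(emb, text_embs)
        hits += probs.index(max(probs)) == idx
    return hits / len(image_embs)


# Toy class list: only 'human being' and 'criminal' are named in the abstract;
# the third entry is a placeholder.
class_names = ["human being", "criminal", "placeholder class"]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Made-up image embeddings standing in for encoded face images.
image_embs = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1], [0.7, 0.2, 0.1], [0.6, 0.3, 0.3]]
human_rate = top1_rate(image_embs, text_embs, class_names, "human being")
```

Computing `top1_rate` for the same probe images with models trained on each dataset variant, and comparing the two rates, mirrors the kind of comparison the finding describes.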

Figure 2: HCR curves for the LAION400M and LAION-2B-en datasets using Pysentimiento outputs. As the dataset is scaled, there is a statistically significant increase in hateful content.
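The two statistical checks behind the significance claim can be sketched with the Python standard library: a Wilson score interval for a binomial proportion, and Welch's t statistic with Welch-Satterthwaite degrees of freedom. All counts and per-file rates below are made up for illustration, and the helper names are hypothetical. The sketch does not reproduce the reported 12.26% or p-values.

```python
import math
from statistics import mean, variance


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z of about 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for mean(a) - mean(b).

    Uses sample variances and does not assume equal variances.
    """
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical counts of flagged captions out of one million sampled rows each.
small_lo, small_hi = wilson_interval(3000, 1_000_000)
large_lo, large_hi = wilson_interval(3500, 1_000_000)
intervals_disjoint = small_hi < large_lo

# Hypothetical per-file HCR values in percent, five files per dataset.
hcr_large = [0.34, 0.36, 0.35, 0.37, 0.33]
hcr_small = [0.30, 0.29, 0.31, 0.28, 0.30]
t_stat, dof = welch_t(hcr_large, hcr_small)
```

A one-sided p-value would then come from the t distribution with `dof` degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False, alternative="greater")` performs the same test directly.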

This review was generated by AI and checked by a human editor.