QUICK REVIEW

[Paper Review] Stable Bias: Analyzing Societal Representations in Diffusion Models

Alexandra Sasha Luccioni, Christopher Akiki|arXiv (Cornell University)|Mar 20, 2023

Computational and Text Analysis Methods | Cited by 56

One-Line Summary

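This paper proposes a method for examining social biases in text-to-image diffusion systems by varying gender and ethnicity markers in prompts and comparing the resulting variation with that produced by profession prompts. It applies the method to Stable Diffusion and DALL·E 2 and releases open tools and datasets.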

ABSTRACT

As machine learning-enabled Text-to-Image (TTI) systems are becoming increasingly prevalent and seeing growing adoption as commercial services, characterizing the social biases they exhibit is a necessary first step to lowering their risk of discriminatory outcomes. This evaluation, however, is made more difficult by the synthetic nature of these systems' outputs: common definitions of diversity are grounded in social categories of people living in the world, whereas the artificial depictions of fictive humans created by these systems have no inherent gender or ethnicity. To address this need, we propose a new method for exploring the social biases in TTI systems. Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts, and comparing it to the variation engendered by spanning different professions. This allows us to (1) identify specific bias trends, (2) provide targeted scores to directly compare models in terms of diversity and representation, and (3) jointly model interdependent social variables to support a multidimensional analysis. We leverage this method to analyze images generated by 3 popular TTI systems (Dall-E 2, Stable Diffusion v 1.4 and 2) and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents. We also release the datasets and low-code interactive bias exploration platforms developed for this work, as well as the necessary tools to similarly evaluate additional TTI systems.

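Motivation and Objectives

Define flexible proxies for gender and ethnicity in generated images in order to study social variation.
Develop prompt-based metrics for auditing representation across occupations in TTI systems.
Provide quantitative and qualitative analyses that reveal the under-representation of socially marginalized attributes in outputs.
Provide datasets and low-code interactive platforms that enable broader evaluation of TTI systems.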

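Proposed Method

Generate prompts from patterns that combine ethnicity, gender, and occupation (146 occupations from the U.S. Bureau of Labor Statistics, BLS).
Evaluate bias through two modalities: text-based (words in image captions and visual question answering (VQA) answers) and image-based (clustering of image embeddings).
Cluster image embeddings into 24 regions to capture the variation tied to identity terms in the prompts.
Aggregate the distribution over cluster regions and relate it to BLS demographics (quartile comparison).
Provide interactive tools for qualitative exploration (Diffusion Bias Explorer, Average Face Comparison Tool, k-NN Explorer).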
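The prompt patterns in the first step can be sketched as a simple enumeration. The snippet below is only an illustration: the template string, the marker lists, and the helper names (`identity_prompts`, `profession_prompts`, `article`) are assumptions made for this sketch, and the review does not state the exact wording or values used in the paper.

```python
from itertools import product

# Hypothetical marker lists for illustration only; the review does not
# list the actual values. An empty string stands for "no ethnicity marker".
ETHNICITIES = ["", "Black", "East Asian", "Latinx", "White"]
GENDERS = ["person", "woman", "man", "non-binary person"]
# Stand-ins for the 146 BLS occupations mentioned above.
PROFESSIONS = ["nurse", "plumber", "software developer"]


def article(phrase):
    """Pick 'a' or 'an' with a crude first-letter heuristic."""
    return "an" if phrase[0].lower() in "aeiou" else "a"


def identity_prompts(ethnicities, genders):
    """Enumerate every ethnicity x gender combination as a prompt."""
    prompts = []
    for ethnicity, gender in product(ethnicities, genders):
        subject = f"{ethnicity} {gender}".strip()
        prompts.append(f"Photo portrait of {article(subject)} {subject}")
    return prompts


def profession_prompts(professions):
    """Build one prompt per profession, with no identity marker."""
    return [f"Photo portrait of {article(p)} {p}" for p in professions]


if __name__ == "__main__":
    for prompt in identity_prompts(ETHNICITIES, GENDERS)[:3]:
        print(prompt)
    for prompt in profession_prompts(PROFESSIONS):
        print(prompt)
```

Keeping the identity prompts and the profession prompts in separate sets mirrors the comparison the method relies on: variation across identity markers is measured first, then used as a reference for the variation across professions.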

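Experimental Results

Research Questions

RQ1: When occupation-related prompts are used, what differences do diffusion-based TTI systems show in their depictions of gender and ethnicity?
RQ2: Do TTI outputs mirror, reproduce, or exacerbate real-world demographic distributions across occupations?
RQ3: Can clustering of image embeddings reveal multidimensional biases beyond simple label assignment?
RQ4: Which interactive tools facilitate qualitative, large-scale audits of diffusion models?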
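To make RQ3 concrete, here is a minimal sketch of how cluster regions could be characterized without assigning a label to any individual image: each embedding is assigned to its nearest region centroid, and the prompt markers that produced the images in each region are tallied. The function names, the use of Euclidean distance, and the toy two-dimensional embeddings are all assumptions for illustration. The review does not specify the embedding model or clustering algorithm.

```python
import math
from collections import Counter, defaultdict


def nearest_region(embedding, centroids):
    """Return the index of the centroid closest to `embedding` (Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: math.dist(embedding, centroids[i]),
    )


def region_composition(samples, centroids):
    """Tally prompt markers per region.

    `samples` is a list of (embedding, marker) pairs, where `marker` is the
    identity phrase used in the prompt that produced the image.
    """
    tallies = defaultdict(Counter)
    for embedding, marker in samples:
        tallies[nearest_region(embedding, centroids)][marker] += 1
    return dict(tallies)


if __name__ == "__main__":
    # Toy 2-D data; real image embeddings would have many more dimensions.
    toy_centroids = [(0.0, 0.0), (10.0, 10.0)]
    toy_samples = [
        ((0.1, 0.2), "marker A"),
        ((9.0, 9.5), "marker B"),
        ((0.3, -0.1), "marker A"),
        ((11.0, 10.0), "marker A"),
    ]
    for region, counts in sorted(region_composition(toy_samples, toy_centroids).items()):
        print(region, dict(counts))
```

A region whose tally is dominated by one marker is strongly associated with that identity phrase, while a region with a mixed tally reflects more diverse associations.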

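Key Findings

All three systems show correlations with U.S. labor demographics, but they consistently under-represent marginalized identities, to different extents.
Caption and VQA outputs reveal gender markers for many prompts, but VQA is less gender-specific than captioning (roughly 97.66% versus 45.56% when gendered terms are used).
Clustering into identity regions identifies regions associated with specific gender/ethnicity prompts. Some regions (e.g., region 4) mainly reflect White men, while others show more diverse associations.
Across systems, the under-representation of women and Black individuals is more pronounced in more diverse occupations, and DALL·E 2 shows the strongest bias in the quartile analysis.
The framework generalizes to additional TTI systems, and the results suggest variation attributable to fine-tuning, not only to shared pretraining.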
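The quartile comparison against BLS demographics can be sketched as follows. This is one plausible reading, not the paper's exact procedure: occupations are ranked by a reference share (for example, the BLS share of a group in that occupation), split into four equal buckets, and the mean share observed in generated images is compared with the mean reference share in each bucket. The function name `quartile_gaps` and all numbers in the demo are placeholders invented for this sketch.

```python
from statistics import mean


def quartile_gaps(reference, observed):
    """Group occupations into quartiles by reference share and report gaps.

    `reference` and `observed` map occupation -> share in [0, 1].
    Returns four (mean_reference, mean_observed, gap) tuples, ordered from
    the lowest-reference quartile to the highest. Needs >= 4 occupations.
    """
    ranked = sorted(reference, key=reference.get)
    n = len(ranked)
    if n < 4:
        raise ValueError("need at least four occupations")
    results = []
    for q in range(4):
        bucket = ranked[q * n // 4:(q + 1) * n // 4]
        ref = mean(reference[o] for o in bucket)
        obs = mean(observed[o] for o in bucket)
        results.append((ref, obs, obs - ref))
    return results


if __name__ == "__main__":
    # Placeholder shares, NOT values from the paper or from BLS.
    reference = {f"occupation_{i}": i / 10 for i in range(1, 9)}
    observed = {name: share * 0.5 for name, share in reference.items()}
    for ref, obs, gap in quartile_gaps(reference, observed):
        print(f"reference={ref:.3f} observed={obs:.3f} gap={gap:+.3f}")
```

A negative gap in a bucket means the group appears less often in generated images than the reference share would suggest. Comparing gaps across buckets shows where under-representation is concentrated.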

This review was generated by AI and checked by a human editor.