QUICK REVIEW

[Paper Review] Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting

Masahiro Kaneko, Danushka Bollegala|arXiv (Cornell University)|Jan 28, 2024

Topic Modeling | Citations: 6

One-Line Summary

This paper shows that Chain-of-Thought (CoT) prompting can mitigate gender bias in LLMs on an unscalable counting task, and that CoT debiasing can transfer to downstream QA and NLI tasks.

ABSTRACT

There exist both scalable tasks, like reading comprehension and fact-checking, where model performance improves with model size, and unscalable tasks, like arithmetic reasoning and symbolic reasoning, where model performance does not necessarily improve with model size. Large language models (LLMs) equipped with Chain-of-Thought (CoT) prompting are able to make accurate incremental predictions even on unscalable tasks. Unfortunately, despite their exceptional reasoning abilities, LLMs tend to internalize and reproduce discriminatory societal biases. Whether CoT can provide discriminatory or egalitarian rationalizations for the implicit information in unscalable tasks remains an open question. In this study, we examine the impact of LLMs' step-by-step predictions on gender bias in unscalable tasks. For this purpose, we construct a benchmark for an unscalable task where the LLM is given a list of words comprising feminine, masculine, and gendered occupational words, and is required to count the number of feminine and masculine words. In our CoT prompts, we require the LLM to explicitly indicate whether each word in the word list is a feminine or masculine before making the final predictions. With counting and handling the meaning of words, this benchmark has characteristics of both arithmetic reasoning and symbolic reasoning. Experimental results in English show that without step-by-step prediction, most LLMs make socially biased predictions, despite the task being as simple as counting words. Interestingly, CoT prompting reduces this unconscious social bias in LLMs and encourages fair predictions.

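Motivation and Objectives

Motivate an understanding of how LLMs internalize gender bias in unscalable tasks.
Introduce the MGBR (Multi-step Gender Bias Reasoning) benchmark, which combines arithmetic/symbolic reasoning with the classification of gendered words.
Evaluate whether Chain-of-Thought (CoT) prompting reduces bias compared with standard prompts and a simple debiasing prompt.
Evaluate how MGBR bias scores correlate with existing extrinsic and intrinsic bias benchmarks.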

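Proposed Method

Construct the MGBR benchmark from a list of feminine words, a list of masculine words, and a list of gender-stereotyped occupational words.
Have the LLM classify each word as feminine or masculine before counting, so that the task combines counting with interpreting word meaning.
Compare zero-shot, few-shot, and Debiasing Prompt (DP) settings with and without CoT (Zero-shot+CoT, Few-shot+CoT).
Evaluate 23 open and closed LLMs with these prompts, and measure bias as the difference in accuracy between word lists that include occupational words and word lists that do not.
Correlate MGBR bias scores with extrinsic benchmarks (BBQ, BNLI) and intrinsic benchmarks (CP, SS).
Analyze how model scale and training affect CoT debiasing ability, and compare CoT and DP debiasing on downstream benchmarks.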
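The counting task and the accuracy-difference bias measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the word lists, the prompt wording, the function names (`build_prompt`, `gold_count`, `accuracy`, `bias_score`), and the sign convention of the score are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

# Toy word lists, invented for illustration only (not the benchmark's lists).
FEMININE = {"she", "woman", "mother"}
MASCULINE = {"he", "man", "father"}

# One example: (word list, target gender, gold count).
Example = Tuple[List[str], str, int]


def build_prompt(words: List[str], target: str, cot: bool) -> str:
    """Build a counting prompt; with cot=True, ask for per-word labels first."""
    question = f"How many {target} words are in this list: {', '.join(words)}?"
    if cot:
        # Assumed CoT wording: label every word before giving the count.
        return (question + " First state whether each word is feminine or "
                "masculine, then give the count.")
    return question + " Answer with a number only."


def gold_count(words: List[str], target: str) -> int:
    """Count words in the target gender lexicon; occupational words never count."""
    lexicon = FEMININE if target == "feminine" else MASCULINE
    return sum(w in lexicon for w in words)


def accuracy(predict: Callable[[List[str], str], int],
             examples: List[Example]) -> float:
    """Fraction of examples where the predicted count equals the gold count."""
    correct = sum(predict(words, target) == gold
                  for words, target, gold in examples)
    return correct / len(examples)


def bias_score(acc_without_occ: float, acc_with_occ: float) -> float:
    """Accuracy drop when occupational words are added (assumed sign convention)."""
    return acc_without_occ - acc_with_occ
```

Here `predict` stands in for a call to an LLM followed by parsing of its numeric answer. Under this convention, a model that counts a stereotyped occupation as a gendered word loses accuracy only on the lists containing occupations, so its score rises above zero.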

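Experimental Results

Research Questions

RQ1: Does CoT prompting reduce gender bias on the unscalable counting task (MGBR) compared with zero-shot and few-shot prompting?
RQ2: Is simple debiasing with DP effective on MGBR, or is CoT required to mitigate bias?
RQ3: How do MGBR bias scores correlate with intrinsic versus extrinsic bias benchmarks?
RQ4: Do larger models, or models that received additional training, show stronger CoT debiasing ability?
RQ5: Does CoT debiasing transfer from MGBR to downstream tasks such as BBQ and BNLI?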

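Key Findings

Zero-shot prompting produces high gender bias on MGBR for many LLMs.
Few-shot prompting reduces bias compared with zero-shot prompting.
DP debiasing yields little to no improvement on MGBR, whereas CoT debiasing is more effective.
CoT prompting reduces bias for many models, and the mitigation improves with model size and training.
MGBR bias scores correlate more strongly with extrinsic benchmarks (BBQ, BNLI) than with intrinsic benchmarks (CP, SS).
CoT debiasing reduces bias on the BBQ and BNLI downstream tasks for many models, with larger models showing larger gains.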
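The comparison between MGBR bias scores and the other benchmarks amounts to computing a correlation over per-model scores. The sketch below uses Pearson's r as an assumption, since this summary does not state which coefficient was used; the function name `pearson` is hypothetical, and no scores from the paper are reproduced.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists (one entry per model)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and the two root sums of squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)
```

With one MGBR bias score and one benchmark bias score per model, calling `pearson(mgbr_scores, bbq_scores)` and `pearson(mgbr_scores, cp_scores)` would give the two numbers whose relative size the finding above describes.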

This review was generated by AI and checked by a human editor.