QUICK REVIEW

[Paper Review] To Act or React: Investigating Proactive Strategies For Online Community Moderation

Hussam Habib, Maaz Bin Musa|arXiv (Cornell University)|Jun 27, 2019

Hate Speech and Cyberbullying Detection | Citations: 32

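Summary in brief

This paper proposes proactive moderation on Reddit, using explainable machine learning to predict which subreddits will evolve into hateful or dangerous communities. Structural ties, especially connections to known problematic subreddits, turned out to be the strongest predictors. Participating in such communities also erodes users' civility, yet current bans and quarantines fail to correct this behavior, which highlights the need for more effective interventions.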

ABSTRACT

Reddit administrators have generally struggled to prevent or contain such discourse for several reasons including: (1) the inability of a handful of human administrators to track and react to millions of posts and comments per day and (2) fear of backlash as a consequence of administrative decisions to ban or quarantine hateful communities. Consequently, as shown in our background research, administrative actions (community bans and quarantines) are often taken in reaction to media pressure following offensive discourse within a community spilling into the real world with serious consequences. In this paper, we investigate the feasibility of proactive moderation on Reddit -- i.e., proactively identifying communities at risk of committing offenses that previously resulted in bans for other communities. Proactive moderation strategies show promise for two reasons: (1) they have potential to narrow down the communities that administrators need to monitor for hateful content and (2) they give administrators a scientific rationale to back their administrative decisions and interventions. Our work shows that communities are constantly evolving in their user base and topics of discourse and that evolution into hateful or dangerous (i.e., considered bannable by Reddit administrators) communities can often be predicted months ahead of time. This makes proactive moderation feasible. Further, we leverage explainable machine learning to help identify the strongest predictors of evolution into dangerous communities. This provides administrators with insights into the characteristics of communities at risk of becoming dangerous or hateful. Finally, we investigate, at scale, the impact of participation in hateful and dangerous subreddits and the effectiveness of community bans and quarantines on the behavior of members of these communities.

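Research motivation and objectives

Investigate whether subreddits change over time, questioning the assumption that their user bases and topics are stable.
Determine whether subreddits at risk of evolving into hateful or dangerous communities can be predicted in advance.
Evaluate, across Reddit, how participating in hateful subreddits affects user behavior and what effect community-level interventions (bans and quarantines) have.
Provide administrators with data-driven, explainable tools to justify and improve their decisions.
Identify the structural and behavioral features that predict a subreddit's evolution toward harmful content.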

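Proposed method

Tracked subreddit evolution by using similarity metrics to quantify changes in user base and topic distribution.
Developed explainable machine learning models that predict whether a subreddit will be banned or quarantined, based on community, user, moderator, and structural features.
Used network features, especially user interactions across subreddits, as the models' main predictors.
Conducted a large-scale analysis of user behavior before and after participation in offensive subreddits, and assessed the effect of community-level interventions (bans and quarantines).
To isolate causal effects, adopted a control-group design that compares users who participated in offensive subreddits with similar users who did not.
Applied techniques that identify and rank the features most influential in predicting subreddit evolution, improving interpretability for administrators.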
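The first step above, quantifying change in a subreddit's user base with a similarity metric, can be sketched as follows. This is a minimal illustration only: the review does not name the metric used, so Jaccard similarity over monthly active-user sets is an assumption, and the function names and toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: 1.0 = identical sets, 0.0 = disjoint."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)


def user_base_drift(monthly_users):
    """Similarity between each consecutive pair of monthly active-user sets.

    Low values mean the user base turned over heavily between two months.
    """
    return [jaccard(prev, curr)
            for prev, curr in zip(monthly_users, monthly_users[1:])]


# Made-up toy data (assumption): active users of one hypothetical
# subreddit over three consecutive months.
months = [
    {"ann", "bob", "cat", "dan"},
    {"bob", "cat", "dan", "eve"},
    {"eve", "fay", "gus", "hal"},
]

drift = user_base_drift(months)
print(drift)
```

The same pairwise comparison could in principle be applied to per-month topic distributions with a different similarity function; that choice is likewise not specified in the review.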

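Experimental results

Research questions

RQ1: Do subreddits maintain a stable user base and set of topics over time, or do they evolve continuously?
RQ2: Can we predict in advance whether a subreddit will evolve into a hateful or dangerous community?
RQ3: How does participating in offensive subreddits affect users' civility in other communities?
RQ4: Do Reddit's current community-level interventions (bans and quarantines) effectively reduce inappropriate behavior among the affected users?
RQ5: Which community-level features are the strongest predictors of a subreddit's future harmful behavior?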

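Key findings

Subreddits evolve continuously and frequently in both user base and topics, so static moderation is of limited effectiveness.
Structural features, particularly users' ties to known problematic subreddits, were the strongest predictors of a subreddit's future harmful behavior.
Participating in offensive subreddits markedly lowers users' civility in other communities, even after controlling for their initial behavior.
Community-level interventions such as bans and quarantines did not reverse the decline in users' civility, indicating that their effectiveness is limited.
The evolution patterns of subreddits that were later banned or quarantined differ clearly from those of stable or non-offensive communities, which makes early detection possible.
The explainable machine learning models predicted future harmful behavior with reasonably high accuracy, providing a scientific basis for proactive moderation.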
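The civility findings above rest on comparing users who joined offensive subreddits with similar users who did not, while accounting for each user's starting behavior. One common way to express such a comparison is a difference-in-differences estimate, sketched below. This is an assumption for illustration, not the authors' confirmed estimator; the `before`/`after` civility scores and all numbers are made up.

```python
from statistics import mean


def mean_change(users):
    """Average (after - before) change in civility score across a group."""
    return mean(u["after"] - u["before"] for u in users)


def diff_in_diff(treated, control):
    """Change in the treated group minus change in the control group.

    Subtracting the control group's change removes drift that would have
    happened anyway, which is how initial behavior is accounted for here.
    """
    return mean_change(treated) - mean_change(control)


# Made-up toy data (assumption): hypothetical civility scores in [0, 1],
# measured in other communities before and after the participation date.
treated = [{"before": 0.80, "after": 0.60}, {"before": 0.70, "after": 0.55}]
control = [{"before": 0.80, "after": 0.78}, {"before": 0.70, "after": 0.69}]

effect = diff_in_diff(treated, control)
print(round(effect, 3))
```

A negative value here would indicate that the treated group's civility fell further than the matched control group's over the same period.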


This review was generated by AI and checked by a human editor.