QUICK REVIEW

[Paper Review] Training-Free and Interpretable Hateful Video Detection via Multi-stage Adversarial Reasoning

Shuonan Yang, Yuchen Zhang|arXiv (Cornell University)|Jan 21, 2026

Hate Speech and Cyberbullying Detection|Citations: 0

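One-Line Summary

MARS is a training-free, multi-stage adversarial reasoning framework that uses vision-language models (VLMs) to detect hateful videos, provides interpretable, evidence-based justifications, and avoids dependence on training data.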

ABSTRACT

Hateful videos pose serious risks by amplifying discrimination, inciting violence, and undermining online safety. Existing training-based hateful video detection methods are constrained by limited training data and lack of interpretability, while directly prompting large vision-language models often struggle to deliver reliable hate detection. To address these challenges, this paper introduces MARS, a training-free Multi-stage Adversarial ReaSoning framework that enables reliable and interpretable hateful content detection. MARS begins with the objective description of video content, establishing a neutral foundation for subsequent analysis. Building on this, it develops evidence-based reasoning that supports potential hateful interpretations, while in parallel incorporating counter-evidence reasoning to capture plausible non-hateful perspectives. Finally, these perspectives are synthesized into a conclusive and explainable decision. Extensive evaluation on two real-world datasets shows that MARS achieves up to 10% improvement under certain backbones and settings compared to other training-free approaches and outperforms state-of-the-art training-based methods on one dataset. In addition, MARS produces human-understandable justifications, thereby supporting compliance oversight and enhancing the transparency of content moderation workflows. The code is available at https://github.com/Multimodal-Intelligence-Lab-MIL/MARS.

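Research Motivation and Objectives

Address the limited training data and lack of transparency that constrain training-based hateful video detectors.
Develop a training-free framework that generates human-understandable justifications for its decisions.
Use an explicit comparison of the evidence for hateful and non-hateful interpretations.
Integrate the evidence in a meta-analysis stage to deliver auditable decisions.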

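Proposed Method

A four-stage framework that does not update model parameters during inference.
Stage 1 generates an objective content description from sampled frames and the audio transcript.
Stage 2 estimates, under the hateful hypothesis, the evidence, reasoning, and confidence that support a hateful interpretation.
Stage 3 estimates, under the non-hateful hypothesis, the evidence, reasoning, and confidence that support a non-hateful interpretation.
Stage 4 integrates the competing hypotheses through meta-analysis and outputs a structured, evidence-backed decision.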
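The four stages above can be sketched as a short pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the `VLM` callable, the prompt wording, the JSON reply schema, and the names `Hypothesis`, `_argue`, and `mars_sketch` are all hypothetical. Frames are passed as raw bytes only to keep the example self-contained.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Sequence

# Hypothetical model interface (an assumption): (prompt, frames) -> text reply.
VLM = Callable[[str, Sequence[bytes]], str]


@dataclass
class Hypothesis:
    evidence: List[str]
    reasoning: str
    confidence: float


def _argue(vlm: VLM, frames: Sequence[bytes], description: str, stance: str) -> Hypothesis:
    """Stages 2 and 3: argue for one stance, grounded in the neutral description."""
    prompt = (
        f"STANCE: {stance}\n"
        f"Objective description: {description}\n"
        "Assume the stance is correct. Reply as JSON with keys "
        "evidence (list of strings), reasoning (string), confidence (0 to 1)."
    )
    reply = json.loads(vlm(prompt, frames))
    return Hypothesis(list(reply["evidence"]), str(reply["reasoning"]), float(reply["confidence"]))


def mars_sketch(vlm: VLM, frames: Sequence[bytes], transcript: str) -> dict:
    # Stage 1: neutral description, with no judgement about hatefulness.
    description = vlm(
        "TASK: describe\nDescribe the video objectively, without judging intent.\n"
        f"Transcript: {transcript}",
        frames,
    )
    # Stages 2 and 3: the two opposing hypotheses are built independently.
    hateful = _argue(vlm, frames, description, "hateful")
    non_hateful = _argue(vlm, frames, description, "non-hateful")
    # Stage 4: meta-analysis weighs both sides and returns a structured decision.
    summary = json.dumps(
        {"description": description, "hateful": asdict(hateful), "non_hateful": asdict(non_hateful)}
    )
    decision = json.loads(
        vlm(
            "TASK: decide\n" + summary + "\nWeigh both hypotheses. Reply as JSON with keys "
            "label ('hateful' or 'non-hateful') and rationale (string).",
            frames,
        )
    )
    return {
        "label": decision["label"],
        "rationale": decision["rationale"],
        "description": description,
        "hateful": hateful,
        "non_hateful": non_hateful,
    }
```

Because the model is only ever called through prompts, no parameters are updated at any point, which is what makes a design like this training-free. Returning both hypotheses alongside the label keeps the evidence chain available for audit.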

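Experimental Results

Research Questions

RQ1: Can a training-free VLM-based system achieve competitive hate detection accuracy without labeled fine-tuning data?
RQ2: Does explicit evidence-based multi-stage reasoning improve interpretability and reduce false positives?
RQ3: How does MARS compare with training-based and other training-free baselines across languages and backbones?
RQ4: How much does each reasoning stage contribute to overall accuracy and macro-F1?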
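For readers less familiar with the metrics named in RQ4, the sketch below computes accuracy and macro-F1 from scratch. Macro-F1 is the unweighted mean of the per-class F1 scores, so the hateful and non-hateful classes count equally even when one is rare. The function names and the toy labels are illustrative assumptions, not data from the paper.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1, with F1 = 2TP / (2TP + FP + FN)."""
    labels = set(y_true) | set(y_pred)
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances contributes 0 by convention.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (made-up labels): 1 = hateful, 0 = non-hateful.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

In the toy example one hateful video is missed, so the per-class F1 scores differ (0.8 for class 0, 2/3 for class 1) and macro-F1 falls below accuracy.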

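Key Findings

Across datasets, MARS achieves high accuracy and competitive precision relative to training-free baselines.
On the English HateMM dataset, MARS consistently outperforms training-free baselines on all metrics and remains competitive with training-based models.
On the Chinese MultiHateClip dataset, MARS maintains comparable accuracy while showing a clear precision advantage, with up to 7% higher precision than some training-based methods.
Ablations show that removing the objective description or the hypothesis-based structure lowers accuracy and macro-F1, confirming that each stage is necessary.
Frame sampling (16-32 frames) and larger backbones improve performance, indicating scalability and stability.
MARS provides fine-grained, human-understandable justifications and explicit evidence chains for auditability.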
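The frame-sampling finding refers to how many frames are passed to the model. The review does not say how the frames are chosen, so the sketch below assumes plain uniform sampling: the video is split into equal segments and the middle frame of each is taken. The function name `uniform_frame_indices` is hypothetical.

```python
from typing import List


def uniform_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick the middle frame of each of `num_samples` equal-length segments.

    Uniform sampling is an assumption made for illustration; the paper's
    actual sampling strategy is not stated in this review.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    n = min(num_samples, total_frames)  # never ask for more frames than exist
    step = total_frames / n
    return [int(step * i + step / 2) for i in range(n)]


# A 320-frame clip sampled at the two budgets mentioned in the findings.
print(uniform_frame_indices(320, 16))
print(uniform_frame_indices(320, 32))
```

Doubling the budget from 16 to 32 halves the gap between sampled frames, which raises the chance of catching a brief hateful segment at the cost of a longer model input.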


This review was generated by AI and checked by a human editor.