QUICK REVIEW

[Paper Review] SAFE: Similarity-Aware Multi-Modal Fake News Detection

Xinyi Zhou, Jindi Wu|arXiv (Cornell University)|Feb 19, 2020

Misinformation and Its Impacts|References: 29|Citations: 94

One-Sentence Summary

SAFE detects fake news by modeling textual and visual content together with their cross-modal similarity. It outperforms the baselines on the PolitiFact and GossipCop datasets.

ABSTRACT

Effective detection of fake news has recently attracted significant attention. Current studies have made significant contributions to predicting fake news with less focus on exploiting the relationship (similarity) between the textual and visual information in news articles. Attaching importance to such similarity helps identify fake news stories that, for example, attempt to use irrelevant images to attract readers' attention. In this work, we propose a Similarity-Aware FakE news detection method (SAFE) which investigates multi-modal (textual and visual) information of news articles. First, neural networks are adopted to separately extract textual and visual features for news representation. We further investigate the relationship between the extracted features across modalities. Such representations of news textual and visual information along with their relationship are jointly learned and used to predict fake news. The proposed method facilitates recognizing the falsity of news articles based on their text, images, or their "mismatches." We conduct extensive experiments on large-scale real-world data, which demonstrate the effectiveness of the proposed method.

Research Motivation and Objectives

Improve fake news detection by leveraging the cross-modal similarity between text and images.
Develop a multi-modal framework that learns textual, visual, and relational representations of news articles.
Demonstrate that incorporating cross-modal similarity improves fake news prediction over single-modal baselines.

Proposed Method

Extract textual features using an extended Text-CNN with an extra fully connected layer.
Extract visual features by applying a Text-CNN to the output of a pre-trained image2sentence model, then transforming the result to a d-dimensional representation.
Define a cross-modal similarity measure s between the text and image representations, with a dedicated loss term that uses s to signal text-image mismatch.
Combine multi-modal features and cross-modal similarity through a joint learning objective with two loss components (prediction loss and similarity loss).
Optimize end-to-end with a joint gradient-based procedure (Algorithm 1) to update textual, visual, and predictor parameters.
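
The two-part objective described above can be sketched in a few lines. This review does not give the formulas, so everything specific here is an assumption for illustration: the similarity s is taken to be cosine similarity rescaled to [0, 1], both loss components are taken to be binary cross-entropy, and the weights `alpha` and `beta` and all function names are hypothetical. The label convention assumed is y = 1 for fake and y = 0 for real.

```python
import math


def modified_cosine(t, v):
    """Cosine similarity of two vectors, rescaled from [-1, 1] to [0, 1].

    Assumed form of the similarity measure s (not stated in this review).
    """
    dot = sum(a * b for a, b in zip(t, v))
    norm_t = math.sqrt(sum(a * a for a in t))
    norm_v = math.sqrt(sum(b * b for b in v))
    denom = norm_t * norm_v
    if denom == 0.0:
        return 0.5  # assumption: treat an all-zero vector as "no evidence"
    return (dot + denom) / (2.0 * denom)


def binary_cross_entropy(y, p, eps=1e-12):
    """Cross-entropy between a 0/1 label y and a predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def joint_loss(y, p_fake, t, v, alpha=0.5, beta=0.5):
    """Weighted sum of a prediction loss and a similarity loss.

    y      : 1 if the article is fake, 0 if real (assumed convention)
    p_fake : predictor's probability that the article is fake
    t, v   : d-dimensional text and image representations
    """
    s = modified_cosine(t, v)
    loss_pred = binary_cross_entropy(y, p_fake)
    # Low similarity is treated as evidence of fakeness, so (1 - s) plays
    # the role of a "fake" probability in the similarity loss.
    loss_sim = binary_cross_entropy(y, 1.0 - s)
    return alpha * loss_pred + beta * loss_sim
```

Under these assumptions, a fake article whose text and image vectors point in opposite directions incurs a much smaller similarity loss than a fake article whose vectors are aligned, which is the behavior the mismatch signal is meant to encourage. In a real implementation the same quantities would be computed on batches inside an autodiff framework so that the textual, visual, and predictor parameters can be updated jointly.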

Experimental Results

Research Questions

RQ1: Can a similarity-aware multi-modal model detect fake news more effectively than single-modal or naive multi-modal baselines?
RQ2: Does incorporating cross-modal similarity help identify mismatches between text and images that indicate fakery?
RQ3: What is the contribution balance between multi-modal content and cross-modal similarity for accurate fake news detection?

Key Findings

SAFE achieves higher accuracy and F1 scores than LIWC, VGG-19, and att-RNN baselines on PolitiFact and GossipCop datasets.
Integrating textual, visual, and cross-modal similarity outperforms variants that omit either modality or the similarity term.
Textual information tends to be more informative than visual information, but both modalities plus their relationship yield the best results.
The method remains effective when adjusting the relative weights between multi-modal features and cross-modal similarity, indicating both contribute meaningfully to detection.
Case studies show lower similarity scores often align with fake news, illustrating the model’s capability to flag mismatches.
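
For readers less familiar with the two metrics named in the findings, the following is a minimal sketch of how accuracy and F1 are computed for a binary fake/real classifier. The function name, the label convention (1 = fake as the positive class), and the toy labels are illustrative assumptions and are not taken from the paper's evaluation code or data.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0.0:
        return accuracy, 0.0
    return accuracy, 2.0 * precision * recall / (precision + recall)


# Toy example: four articles, one fake article missed by the classifier.
acc, f1 = accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

F1 is reported alongside accuracy because fake and real articles are often imbalanced in such datasets, and accuracy alone can look high for a classifier that rarely predicts the minority class.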

This review was generated by AI and checked by a human editor.