QUICK REVIEW

[Paper Review] Fake News Detection on Social Media using Geometric Deep Learning

Federico Monti, Fabrizio Frasca | arXiv (Cornell University) | Feb 10, 2019

Misinformation and Its Impacts | References: 45 | Citations: 188

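One-line summary

This paper proposes a propagation- and graph-based fake news detector that applies geometric deep learning to Twitter cascades, achieving a high ROC AUC and effective early detection.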

ABSTRACT

Social media are nowadays one of the main news sources for millions of people around the globe due to their low cost, easy access and rapid dissemination. This however comes at the cost of dubious trustworthiness and significant risk of exposure to 'fake news', intentionally written to mislead the readers. Automatically detecting fake news poses challenges that defy existing content-based analysis approaches. One of the main reasons is that often the interpretation of the news requires the knowledge of political or social context or 'common sense', which current NLP algorithms are still missing. Recent studies have shown that fake and real news spread differently on social media, forming propagation patterns that could be harnessed for the automatic fake news detection. Propagation-based approaches have multiple advantages compared to their content-based counterparts, among which is language independence and better resilience to adversarial attacks. In this paper we show a novel automatic fake news detection model based on geometric deep learning. The underlying core algorithms are a generalization of classical CNNs to graphs, allowing the fusion of heterogeneous data such as content, user profile and activity, social graph, and news propagation. Our model was trained and tested on news stories, verified by professional fact-checking organizations, that were spread on Twitter. Our experiments indicate that social network structure and propagation are important features allowing highly accurate (92.7% ROC AUC) fake news detection. Second, we observe that fake news can be reliably detected at an early stage, after just a few hours of propagation. Third, we test the aging of our model on training and testing data separated in time. Our results point to the promise of propagation-based approaches for fake news detection as an alternative or complementary strategy to content-based approaches.

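Motivation and objectives

Motivates the work by noting that content-based methods alone struggle to detect fake news, because interpreting news requires context and common-sense knowledge.
Proposes a graph-based, propagation-aware model that integrates content, user, and network features.
Shows, across a large-scale Twitter dataset, that propagation patterns provide a strong signal for fake news detection.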

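Proposed method

Proposes a four-layer Graph CNN with two graph convolutional layers and two fully connected layers, using graph attention in each convolutional layer.
Integrates user profile, user activity, social network structure, and news propagation into a single input graph G_u for each URL/cascade.
Builds the input graph by connecting the tweets that share a URL through follow relationships and diffusion paths. Edges carry various relational features and are updated through the attention mechanism of the convolutional layers.
Trains with hinge loss (no regularization) and SELU activations, using AMSGrad with a learning rate of 5e-4.
Evaluates on Twitter data (2013–2018) in URL-wise and cascade-wise settings. Cascades contain at least 6 tweets, and the diffusion window is 24 hours.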
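
The building blocks named above (a single-head graph attention step, SELU, and hinge loss) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the additive GAT-style scoring, the LeakyReLU slope of 0.2, the mean pooling of node vectors, and all function and parameter names are assumptions made for illustration, and edge features are omitted for brevity.

```python
import math

# SELU constants from Klambauer et al. (2017).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit applied to one scalar."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def hinge_loss(score, label):
    """Binary hinge loss; label must be +1 or -1."""
    return max(0.0, 1.0 - label * score)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_layer(features, neighbors, weight, attn_src, attn_dst):
    """One single-head graph attention step (GAT-style; an assumption).

    features:  list of per-node feature vectors
    neighbors: dict mapping node index -> neighbor indices (self included)
    weight:    projection matrix (out_dim x in_dim) as a list of rows
    attn_src, attn_dst: attention vectors of length out_dim
    """
    # Linear projection of every node's features.
    projected = [
        [sum(w * x for w, x in zip(row, feat)) for row in weight]
        for feat in features
    ]
    output = []
    for i, h_i in enumerate(projected):
        # Unnormalized attention logits, passed through LeakyReLU (slope 0.2).
        logits = []
        for j in neighbors[i]:
            e = sum(a * h for a, h in zip(attn_src, h_i)) + sum(
                a * h for a, h in zip(attn_dst, projected[j])
            )
            logits.append(e if e > 0 else 0.2 * e)
        alphas = softmax(logits)
        # Attention-weighted sum of neighbor projections, followed by SELU.
        aggregated = [
            sum(a * projected[j][k] for a, j in zip(alphas, neighbors[i]))
            for k in range(len(h_i))
        ]
        output.append([selu(v) for v in aggregated])
    return output


def mean_pool(node_vectors):
    """Average node vectors into one graph-level vector."""
    n = len(node_vectors)
    return [sum(column) / n for column in zip(*node_vectors)]
```

In a full model, two such layers would be followed by pooling and two fully connected layers, with the final score fed to `hinge_loss` against a label of +1 or -1. The optimizer (AMSGrad, learning rate 5e-4) is not shown here.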

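Experimental results

Research questions

RQ1: Can fake news on Twitter be detected reliably using only propagation and social network structure features (no content)?
RQ2: How does performance differ between URL-wise and cascade-wise classification, and how early can detection become effective?
RQ3: How robust is the model to the passage of time between training and test data (aging)?
RQ4: How does the minimum cascade size affect detection performance?
RQ5: Which feature groups (user profile, activity, network/diffusion, content) contribute most to prediction?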

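Key findings

URL-wise ROC AUC is 92.70% (±1.80) over 5 folds.
Cascade-wise ROC AUC is 88.30% (±2.74) over 5 folds.
Performance improves as the diffusion time grows, saturating after about 15 hours in the URL-wise setting and about 7 hours in the cascade-wise setting.
The ablation analysis indicates that user profile and network/diffusion features are the most important in both settings.
Model aging: URL-wise performance declines after about 180 days, while cascade-wise performance degrades more gradually (≤4% after 260 days).
The t-SNE visualization shows a clear clustering of reliable and unreliable users learned by the model.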
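
To make the headline metric concrete, the sketch below computes ROC AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, then summarizes per-fold values as a mean and spread. It assumes the ± figures are standard deviations over the 5 folds, which this summary does not state. The function names and the example numbers in the usage note are illustrative only and are not results from the paper.

```python
import statistics


def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives and negatives.

    labels: sequence of 0/1 values; scores: sequence of floats.
    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_aucs):
    """Return (mean, sample standard deviation) of per-fold AUC values."""
    return statistics.mean(fold_aucs), statistics.stdev(fold_aucs)
```

For example, `roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])` compares four positive/negative pairs, of which three are ranked correctly. The quadratic pairwise loop is chosen for clarity; a rank-based formulation would scale better on large test sets.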

This review was generated by AI and checked by a human editor.