QUICK REVIEW

[Paper Review] Fake News Detection on Social Media: A Data Mining Perspective

Kai Shu, Amy Sliva | arXiv (Cornell University) | Aug 7, 2017
Misinformation and Its Impacts | References: 71 | Cited by: 599
One-Sentence Summary

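This survey examines fake news detection on social media from a data mining perspective, covering definitions, features (news content and social context), detection models, datasets, evaluation metrics, and future directions.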

ABSTRACT

Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.

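Research Motivation and Objectives

  • Clarify how fake news is defined and characterized in traditional media versus social media.
  • Summarize the data mining methods used to detect fake news on social media.
  • Compare content-based and social-context-based detection approaches.
  • Discuss datasets, evaluation metrics, and open problems to chart directions for future research.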

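Proposed Approach

  • Divide fake news detection methods into News Content Models and Social Context Models.
  • Describe content features (linguistic and visual) and social context features (user-based, post-based, and network-based).
  • Review the knowledge-based and style-based detection methods among the content models.
  • Outline the types of datasets and the evaluation metrics used in fake news research.
  • Discuss open problems and future directions for detection on social media.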
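
The split between content features and social context features can be illustrated with a minimal sketch. It is not the survey's method: the `NewsItem` and `Engagement` classes, the specific features, and all field names are hypothetical choices made for illustration, and visual features are omitted.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Engagement:
    """Hypothetical record of one post that shares or replies to a news item."""
    user_id: str
    user_followers: int
    text: str


@dataclass
class NewsItem:
    """Hypothetical news item together with the engagements it received."""
    headline: str
    body: str
    engagements: List[Engagement] = field(default_factory=list)


def content_features(item: NewsItem) -> Dict[str, float]:
    """Simple linguistic (style) features computed from the news text alone."""
    text = f"{item.headline} {item.body}"
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(len(words), 1)  # avoid division by zero on empty text
    all_caps = sum(1 for w in words if len(w) > 1 and w.isupper())
    return {
        "word_count": float(len(words)),
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "exclamation_count": float(text.count("!")),
        "all_caps_ratio": all_caps / n_words,
    }


def social_context_features(item: NewsItem) -> Dict[str, float]:
    """Illustrative user-, post-, and network-level signals from engagements."""
    n = len(item.engagements)
    unique_users = {e.user_id for e in item.engagements}
    total_followers = sum(e.user_followers for e in item.engagements)
    return {
        "avg_followers": total_followers / n if n else 0.0,  # user-based
        "engagement_count": float(n),                        # post-based
        "unique_user_count": float(len(unique_users)),       # crude network proxy
    }


def feature_vector(item: NewsItem) -> Dict[str, float]:
    """Concatenate both feature families into one named feature vector."""
    return {**content_features(item), **social_context_features(item)}


if __name__ == "__main__":
    example = NewsItem(
        headline="SHOCKING news!",
        body="It is true",
        engagements=[Engagement("u1", 100, "wow"), Engagement("u2", 300, "fake?")],
    )
    print(feature_vector(example))
```

A vector like this could be passed to any standard classifier. Keeping the two feature families in separate functions makes it easy to compare a content-only model against one that also uses social context.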

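Results

Research Questions

  • RQ1: What definitions of fake news are used in traditional media and in social media, and how do they differ?
  • RQ2: Which features from news content and social context are more effective for detecting fake news on social media?
  • RQ3: What detection models exist, and how are they categorized by input source (content versus context)?
  • RQ4: What are the typical datasets and evaluation metrics in this research area?
  • RQ5: What unresolved challenges and promising directions remain for future work?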
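
For RQ4, this summary does not name the individual metrics. The sketch below assumes the standard binary-classification metrics (accuracy, precision, recall, F1) with fake news as the positive class, labeled 1. The function name `binary_metrics` and the label encoding are illustrative assumptions.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1, treating label 1 (fake) as positive."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake, predicted fake
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # real, predicted fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake, predicted real
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # real, predicted real

    def safe_div(num: float, den: float) -> float:
        # Return 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Hypothetical labels: 1 = fake, 0 = real.
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Precision and recall matter here because fake news datasets are often imbalanced, in which case accuracy alone can look high for a classifier that rarely flags anything as fake.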

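Key Findings

  • Because news content is deliberately manipulated, detecting fake news on social media requires auxiliary information beyond the content itself.
  • Social context features (user, post, network) provide important signals for detecting fake news.
  • Detection methods fall into knowledge-based and style-based content models, plus social context models that do not depend on content.
  • Traditional fake news differs from the patterns specific to social media (e.g., bots, trolls, and echo chambers).
  • The field is at an early stage, with several open problems and opportunities for future research.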

This review was generated by AI and checked by human editors.