QUICK REVIEW

[Paper Review] Stop clickbait: detecting and preventing clickbaits in online news media

Abhijnan Chakraborty, Bhargavi Paranjape | arXiv (Cornell University) | Aug 18, 2016

Misinformation and Its Impacts | References: 12 | Cited by: 200

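ONE-SENTENCE SUMMARY

This paper presents an automated clickbait detection system, integrated into a browser extension, that identifies sensational headlines and warns about or blocks them according to user preferences. In offline and online experiments across multiple news sites, the system achieves 93% accuracy in detecting clickbaits and 89% accuracy in personalized blocking.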

ABSTRACT

Most of the online news media outlets rely heavily on the revenues generated from the clicks made by their readers, and due to the presence of numerous such outlets, they need to compete with each other for reader attention. To attract the readers to click on an article and subsequently visit the media site, the outlets often come up with catchy headlines accompanying the article links, which lure the readers to click on the link. Such headlines are known as Clickbaits. While these baits may trick the readers into clicking, in the long-run, clickbaits usually don't live up to the expectation of the readers, and leave them disappointed. In this work, we attempt to automatically detect clickbaits and then build a browser extension which warns the readers of different media sites about the possibility of being baited by such headlines. The extension also offers each reader an option to block clickbaits she doesn't want to see. Then, using such reader choices, the extension automatically blocks similar clickbaits during her future visits. We run extensive offline and online experiments across multiple media sites and find that the proposed clickbait detection and the personalized blocking approaches perform very well achieving 93% accuracy in detecting and 89% accuracy in blocking clickbaits.

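RESEARCH MOTIVATION AND GOALS

Automatically detect clickbait headlines in online news media that mislead readers through exaggerated or sensational language.
Develop a browser extension that warns users when they encounter potential clickbait.
Enable personalized blocking of unwanted clickbait content based on user preferences.
Evaluate the effectiveness of the detection and blocking mechanisms across a variety of news sites.
Improve the user experience by reducing exposure to misleading headlines that fail to deliver on what they promise.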

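PROPOSED METHOD

The system uses natural language processing techniques to analyze headline text and detect linguistic cues associated with clickbait, such as emotional language, curiosity gaps, and exaggerated claims.
A machine learning model is trained on a dataset of labeled headlines to classify each headline as clickbait or not.
The browser extension integrates the detection model and analyzes headlines in real time as the user browses news sites.
Users can flag or block specific clickbaits; the system learns from these choices and automatically blocks similar headlines on later visits.
A feedback loop personalizes the blocking behavior to each user's preferences.
Offline and online experiments across multiple media sites validate detection and blocking performance.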
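
The detection step described above (train a model on labeled headlines, then classify new ones) can be sketched roughly as follows. This is only an illustration under stated assumptions: the summary does not name the model or the exact features, so a small word-level naive Bayes classifier is swapped in here because it fits in a few lines. The training headlines in `TRAIN` are made up for the example and are not the paper's dataset, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy training set (1 = clickbait, 0 = not); NOT the paper's data.
TRAIN = [
    ("You won't believe what happened next", 1),
    ("10 things you need to know before you turn 30", 1),
    ("This simple trick will change your life", 1),
    ("What this dog did will melt your heart", 1),
    ("Parliament passes annual budget bill", 0),
    ("Central bank raises interest rates by quarter point", 0),
    ("Heavy rain floods coastal towns", 0),
    ("Court upholds ruling in patent dispute", 0),
]


def tokenize(headline):
    """Lowercase the headline and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", headline.lower())


def train(examples):
    """Count words per class; this is all a naive Bayes model needs."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def clickbait_log_odds(headline, model):
    """Log-odds of clickbait vs. not, with add-one (Laplace) smoothing."""
    word_counts, class_counts, vocab = model
    score = math.log(class_counts[1]) - math.log(class_counts[0])
    for label, sign in ((1, 1.0), (0, -1.0)):
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(headline):
            if tok in vocab:  # words never seen in training are ignored
                score += sign * math.log((word_counts[label][tok] + 1) / denom)
    return score


def is_clickbait(headline, model):
    """Flag the headline when the clickbait class is the more likely one."""
    return clickbait_log_odds(headline, model) > 0


model = train(TRAIN)
print(is_clickbait("You won't believe this simple trick", model))
print(is_clickbait("Court raises budget for coastal towns", model))
```

In a browser extension, a function like `is_clickbait` would be applied to each headline found on the page before deciding whether to show a warning. A realistic system would need a far larger labeled dataset and richer features than bare word counts.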

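EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: How accurately does the automated system detect clickbait headlines in online news media?
RQ2: Can the browser extension effectively warn users in real time about potentially misleading clickbait headlines?
RQ3: To what extent does personalized blocking based on user feedback reduce exposure to unwanted clickbait?
RQ4: How does detection and blocking performance vary across news sites and content types?
RQ5: To what extent does user-driven feedback improve the accuracy of clickbait blocking over time?

KEY FINDINGS

The clickbait detection system identifies sensational headlines with 93% accuracy across multiple news sites.
The personalized blocking mechanism, which learns user preferences and blocks similar headlines on future visits, reaches 89% accuracy.
The browser extension successfully warns users about potential clickbait during real-time browsing.
User-driven feedback significantly improves the system's ability to block unwanted content over time.
The system performs consistently across a range of online news platforms in both offline and online evaluation settings.
Combining automated detection with user personalization significantly reduces users' exposure to misleading headlines.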
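
To make the personalized blocking figure more concrete, here is a minimal sketch of how blocking "similar" headlines and scoring the result as an accuracy might look. Everything in it is an assumption for illustration: the summary does not say how similarity is measured, so Jaccard overlap of word sets is used as a stand-in, and the `PersonalBlocker` class, the 0.3 threshold, and the sample headlines are all hypothetical.

```python
import re


def tokens(headline):
    """Lowercased set of word tokens in a headline."""
    return set(re.findall(r"[a-z0-9']+", headline.lower()))


def jaccard(a, b):
    """Share of words two headlines have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class PersonalBlocker:
    """Hypothetical per-reader blocker driven by that reader's past choices."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed cut-off, not taken from the paper
        self.blocked = []           # token sets of headlines the reader blocked

    def record_block(self, headline):
        """Remember a headline the reader chose to block."""
        self.blocked.append(tokens(headline))

    def should_block(self, headline):
        """Block if the headline is close enough to any earlier blocked one."""
        cand = tokens(headline)
        return any(jaccard(cand, seen) >= self.threshold for seen in self.blocked)


def accuracy(predictions, labels):
    """Fraction of decisions that match what the reader actually wanted."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


blocker = PersonalBlocker()
blocker.record_block("10 celebrity diets you have to try")

# Made-up held-out headlines with the reader's true preference (True = block).
held_out = [
    ("10 celebrity diets you should never try", True),
    ("Senate votes on energy bill", False),
    ("7 gadgets you won't believe exist", False),
    ("Celebrity chef opens new restaurant", True),
]
predictions = [blocker.should_block(h) for h, _ in held_out]
labels = [wanted for _, wanted in held_out]
print(accuracy(predictions, labels))  # 0.75: the last headline is missed
```

The missed headline shares only one word with the blocked example, which shows why plain word overlap is a weak notion of similarity and why the reader's ongoing feedback matters for improving the blocker.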

This review was generated by AI and reviewed by a human editor.