QUICK REVIEW

[Paper Review] Trust Oriented Explainable AI for Fake News Detection

Krzysztof Siwek, Daniel Adam Stankowski|arXiv (Cornell University)|Mar 12, 2026

Explainable Artificial Intelligence (XAI)|Cited by 0

One-Sentence Summary

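The paper compares SHAP, LIME, and Integrated Gradients for explaining NLP-based fake news detectors (LSTM and CNN) and shows that XAI improves transparency while maintaining high accuracy; each method offers distinct explanatory value and has its own limitations.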

ABSTRACT

This article examines the application of Explainable Artificial Intelligence (XAI) in NLP based fake news detection and compares selected interpretability methods. The work outlines key aspects of disinformation, neural network architectures, and XAI techniques, with a focus on SHAP, LIME, and Integrated Gradients. In the experimental study, classification models were implemented and interpreted using these methods. The results show that XAI enhances model transparency and interpretability while maintaining high detection accuracy. Each method provides distinct explanatory value: SHAP offers detailed local attributions, LIME provides simple and intuitive explanations, and Integrated Gradients performs efficiently with convolutional models. The study also highlights limitations such as computational cost and sensitivity to parameterization. Overall, the findings demonstrate that integrating XAI with NLP is an effective approach to improving the reliability and trustworthiness of fake news detection systems.

Motivation and Objectives

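Motivate the need for transparency in NLP-based fake news detection and reduce reliance on "black-box" models.
Compare selected XAI methods (SHAP, LIME, Integrated Gradients) in the context of fake news classification.
Assess the practical trade-offs of XAI (computational cost, parameter sensitivity, risk of misinterpretation).
Lay the groundwork for future integration of XAI with NLP architectures and end-user studies.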

Proposed Method

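Build a complete NLP fake news detection pipeline on the ISOT dataset, with articles labeled as real or fake.
Evaluate two neural network architectures, LSTM and CNN, each with an embedding layer plus a classification layer.
Integrate and compare XAI methods (SHAP, LIME, Integrated Gradients) to produce token-level explanations.
Use feature-attribution metrics and visualizations to assess explanation quality and fidelity.
Provide model-specific analyses and visualizations showing how the explanations reflect architectural behavior.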
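
To make the token-level attribution step concrete, here is a minimal sketch of Integrated Gradients on a toy classifier. It is not the paper's implementation: the mean-pooled logistic model, the random embeddings, the all-zero baseline, and every function name below are assumptions chosen so the example stays self-contained. The paper's LSTM and CNN would need a deep learning framework and automatic differentiation instead of the analytic gradient used here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(emb, w, b):
    """Toy classifier (assumption): sigmoid of a linear score on mean-pooled embeddings."""
    return sigmoid(emb.mean(axis=0) @ w + b)

def grad_wrt_embeddings(emb, w, b):
    """Analytic gradient of predict() with respect to every embedding entry."""
    p = predict(emb, w, b)
    n_tokens = emb.shape[0]
    return np.tile(p * (1.0 - p) * w / n_tokens, (n_tokens, 1))

def integrated_gradients(emb, w, b, baseline=None, steps=64):
    """Midpoint Riemann approximation of IG; returns one score per token."""
    if baseline is None:
        baseline = np.zeros_like(emb)  # all-zero baseline is an assumption
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.zeros_like(emb)
    for a in alphas:
        # Gradient at a point on the straight path from baseline to input.
        avg_grad += grad_wrt_embeddings(baseline + a * (emb - baseline), w, b)
    avg_grad /= steps
    # Scale by the input difference, then sum over embedding dimensions.
    return ((emb - baseline) * avg_grad).sum(axis=1)

rng = np.random.default_rng(0)
tokens = ["breaking", "shocking", "officials", "said"]  # illustrative tokens
emb = rng.normal(size=(len(tokens), 8))                  # stand-in embeddings
w = rng.normal(size=8)
b = 0.1

attr = integrated_gradients(emb, w, b)
gap = predict(emb, w, b) - predict(np.zeros_like(emb), w, b)
for tok, score in zip(tokens, attr):
    print(f"{tok:>10s}  {score:+.4f}")
print("sum of attributions:", attr.sum(), "prediction gap:", gap)
```

The final line checks the completeness property of Integrated Gradients: the per-token scores should sum approximately to the difference between the prediction on the input and the prediction on the baseline. The number of integration steps is one of the parameters that affects both cost and accuracy of the approximation.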

Figure 2: SHAP visualization for CNN model

Experimental Results

Research Questions

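RQ1: How effective are SHAP, LIME, and Integrated Gradients at explaining NLP models for fake news detection?
RQ2: How does explanation quality differ between the LSTM and CNN architectures?
RQ3: What are the practical limitations and risks of using XAI in this setting?
RQ4: Should the evaluation of XAI outputs be model- and architecture-specific to avoid misleading generalizations?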

Key Findings

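XAI generally improves transparency while maintaining high classification accuracy.
SHAP provides detailed local attributions; LIME provides simple, intuitive explanations; Integrated Gradients (IG) is efficient on the CNN.
For the LSTM, SHAP produced the strongest removal effect and the highest AOPC; LIME performed well but scored slightly lower on comprehensiveness; IG was slower and less effective.
For the CNN, IG achieved the best combined comprehensiveness and AOPC and ran quickly; LIME fell in the middle; SHAP lagged on sufficiency and AOPC.
Explanation quality varies by architecture; no single method is universally best; model-specific evaluation is recommended.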
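
The findings above rely on deletion-based faithfulness metrics. The sketch below shows one common way such metrics are defined: comprehensiveness as the probability drop after deleting the top-k attributed tokens, sufficiency as the drop when only those tokens are kept, and AOPC as the average drop over k = 1..K. These definitions are an assumption and may differ in detail from the ones used in the paper. The word-weight "detector" and its weights are hypothetical stand-ins for a trained model.

```python
import math

# Hypothetical word weights standing in for a trained detector (not from the paper).
WEIGHTS = {"shocking": 2.0, "miracle": 1.5, "secret": 1.0,
           "officials": -1.0, "reported": -1.5}

def fake_prob(tokens):
    """Toy detector: probability that the text is fake."""
    score = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))

def top_k(attributions, k):
    """Indices of the k tokens with the largest attribution scores."""
    order = sorted(range(len(attributions)), key=lambda i: attributions[i], reverse=True)
    return set(order[:k])

def comprehensiveness(tokens, attributions, k):
    """Probability drop after deleting the top-k tokens (higher is better)."""
    top = top_k(attributions, k)
    kept = [t for i, t in enumerate(tokens) if i not in top]
    return fake_prob(tokens) - fake_prob(kept)

def sufficiency(tokens, attributions, k):
    """Probability drop when only the top-k tokens are kept (lower is better)."""
    top = top_k(attributions, k)
    kept = [t for i, t in enumerate(tokens) if i in top]
    return fake_prob(tokens) - fake_prob(kept)

def aopc(tokens, attributions, max_k):
    """Average comprehensiveness over k = 1..max_k."""
    drops = [comprehensiveness(tokens, attributions, k) for k in range(1, max_k + 1)]
    return sum(drops) / max_k

tokens = ["shocking", "secret", "cure", "officials", "reported"]
good_attr = [WEIGHTS.get(t, 0.0) for t in tokens]  # ranking that matches the model
bad_attr = [-a for a in good_attr]                 # deliberately reversed ranking

print("AOPC, faithful ranking:", aopc(tokens, good_attr, 3))
print("AOPC, reversed ranking:", aopc(tokens, bad_attr, 3))
```

An attribution that ranks tokens the way the model actually uses them scores a higher AOPC than a reversed ranking. The same comparison, run per explainer and per architecture, is the kind of model-specific evaluation the findings recommend.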

Figure 3: LIME visualization for CNN model

This review was generated by AI and reviewed by a human editor.