QUICK REVIEW

[Paper Review] CiteFusion: An Ensemble Framework for Citation Intent Classification Harnessing Dual-Model Binary Couples and SHAP Analyses

Lorenzo Paolini, Sahar Vahdati|arXiv (Cornell University)|Mar 7, 2025

scientometrics and bibliometrics research|Cited by 1

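One-Sentence Summary

CiteFusion is an ensemble framework for citation intent classification that combines SciBERT and XLNet models through a one-vs-all binary decomposition and a feedforward neural network meta-classifier. It achieves state-of-the-art Macro-F1 scores of 89.60% on SciCite and 76.24% on ACL-ARC, while improving interpretability through SHAP analysis and the integration of section titles.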

ABSTRACT

Understanding the motivations underlying scholarly citations is critical for evaluating research impact and fostering transparent scholarly communication. This study introduces CiteFusion, an ensemble framework designed to address the multiclass Citation Intent Classification (CIC) task on benchmark datasets, SciCite and ACL-ARC. The framework decomposes the task into binary classification subtasks, utilizing complementary pairs of SciBERT and XLNet models fine-tuned independently for each citation intent. These base models are aggregated through a feedforward neural network meta-classifier, ensuring robust performance in imbalanced and data-scarce scenarios. To enhance interpretability, SHAP (SHapley Additive exPlanations) is employed to analyze token-level contributions and interactions among base models, providing transparency into classification dynamics. We further investigate the semantic role of structural context by incorporating section titles into input sentences, demonstrating their significant impact on classification accuracy and model reliability. Experimental results show that CiteFusion achieves state-of-the-art performance, with Macro-F1 scores of 89.60% on SciCite and 76.24% on ACL-ARC. The original intents from both datasets are mapped to Citation Typing Ontology (CiTO) object properties to ensure interoperability and reusability. This mapping highlights overlaps between the two datasets labels, enhancing their understandability and reusability. Finally, we release a web-based application that classifies citation intents leveraging CiteFusion models developed on SciCite.

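Research Motivation and Objectives

To address the challenge of multiclass citation intent classification on imbalanced, data-scarce scholarly datasets.
To improve classification robustness and interpretability by combining a domain-specific language model (SciBERT) with a general-purpose one (XLNet) in an ensemble framework.
To increase model transparency with SHAP analysis, enabling token-level contribution analysis and insight into interactions among models.
To investigate how structural context, section titles in particular, acts as a framing device that affects classification accuracy.
To standardize the datasets' citation intent labels through the Citation Typing Ontology (CiTO) for interoperability and reusability.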

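Proposed Method

Decompose the multiclass citation intent classification task into one-vs-all (OVA) binary subtasks, one per intent class.
Use complementary pairs of SciBERT and XLNet models, fine-tuned independently on each binary subtask, to capture both scientific and general language patterns.
Aggregate the positive-class probabilities output by all base models into a single feature vector and feed it to a feedforward neural network (FFNN) meta-classifier.
Add section titles to the input sentences as framing context to improve the models' understanding of the citation context.
Apply mixed-precision training and computational instability analysis to reduce training time and resource consumption while mitigating overfitting.
Use SHAP (SHapley Additive exPlanations) to analyze feature importance and the interaction dynamics among models, supporting interpretability and error analysis.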
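The first four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `BaseModel` interface, the `"Section: sentence"` separator, the ordering of the intents, the single hidden layer, and every weight and probability in the demo are hypothetical stand-ins for fine-tuned SciBERT and XLNet classifiers and a trained meta-classifier.

```python
import math
from typing import Callable, Dict, List, Sequence

# Intent labels used for illustration (SciCite-style); the ordering is an assumption.
INTENTS = ["Background", "Method", "Result"]

# Hypothetical interface: a base model maps an input text to P(positive class).
BaseModel = Callable[[str], float]


def to_binary_labels(labels: Sequence[str], positive: str) -> List[int]:
    """One-vs-all relabeling: 1 for the target intent, 0 for every other intent."""
    return [1 if label == positive else 0 for label in labels]


def with_section_title(sentence: str, section: str) -> str:
    """Prepend the section title as framing context (the separator is an assumption)."""
    return f"{section}: {sentence}"


def meta_features(text: str, couples: Dict[str, Sequence[BaseModel]]) -> List[float]:
    """Concatenate the positive-class probabilities of all base models,
    in a fixed intent order, into one feature vector."""
    return [model(text) for intent in INTENTS for model in couples[intent]]


def ffnn_forward(
    x: Sequence[float],
    w1: Sequence[Sequence[float]],
    b1: Sequence[float],
    w2: Sequence[Sequence[float]],
    b2: Sequence[float],
) -> List[float]:
    """Forward pass: one hidden ReLU layer, then a softmax over the intents."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Stand-in base models returning fixed probabilities; each intent gets a
# couple of two models, mirroring the SciBERT + XLNet pairing.
couples = {
    "Background": [lambda text: 0.9, lambda text: 0.8],
    "Method": [lambda text: 0.1, lambda text: 0.2],
    "Result": [lambda text: 0.05, lambda text: 0.1],
}
text = with_section_title("Prior work studied citation intent [1].", "Introduction")
features = meta_features(text, couples)

# Hand-set weights for the demo; a real meta-classifier would learn them.
w1 = [[1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 0, 0], [0, 0, 0, 0, 1, 1]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
b2 = [0.0, 0.0, 0.0]
probs = ffnn_forward(features, w1, b1, w2, b2)
predicted = INTENTS[probs.index(max(probs))]
```

With three intents and two base models per intent, the feature vector has six entries. Keeping the intent order fixed matters, because the meta-classifier learns which position corresponds to which base model.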

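Experimental Results

Research Questions

RQ1: Does an ensemble approach that combines domain-specific and general-purpose language models improve citation intent classification under imbalanced, data-scarce conditions?
RQ2: How does introducing section titles as a contextual framing factor affect the accuracy of citation intent classification?
RQ3: To what extent can SHAP analysis enhance interpretability and reveal model-level dynamics and misclassification patterns?
RQ4: How do the citation intent classes in SciCite and ACL-ARC align with standardized ontologies such as the Citation Typing Ontology (CiTO)?
RQ5: How do data quality and quantity affect model performance, particularly on low-resource classes such as Result in SciCite and Extends, Motivation, and Future in ACL-ARC?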

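Key Findings

CiteFusion achieves state-of-the-art performance, with Macro-F1 scores of 89.60% on SciCite and 76.24% on ACL-ARC.
Using section titles as framing context significantly improved classification accuracy, demonstrating their value as a structural context cue.
SHAP analysis revealed distinct token-level contributions and highlighted model-specific features, helping to identify misclassification patterns and model behavior.
The framework maps the citation intent labels of both datasets to standardized object properties in the Citation Typing Ontology (CiTO), enabling cross-dataset comparison and interoperability.
Performance dropped on low-resource classes (such as Result in SciCite and Extends, Motivation, and Future in ACL-ARC), underscoring the dependence on data quantity and quality.
The code, the models (with and without section titles), and a web-based application are publicly released, supporting reproducibility and practical deployment in bibliometric and scholarly communication systems.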
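The scores above are Macro-F1, which averages the per-class F1 with equal weight for every class, so a weak low-resource class lowers the score as much as a weak majority class would. The following is a minimal sketch of the metric itself; the function name and the toy labels are illustrative and are not results from the paper.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over every class seen in either sequence."""
    classes = sorted(set(y_true) | set(y_pred))
    scores: List[float] = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined here as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: the rare class "Result" is always missed.
y_true = ["Background"] * 4 + ["Method", "Result"]
y_pred = ["Background"] * 4 + ["Method", "Method"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this toy data, five of six predictions are correct, yet the Macro-F1 is only 5/9 because the per-class F1 values are 1, 2/3, and 0.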

This review was generated by AI and reviewed by human editors.