QUICK REVIEW

[Paper Review] Neuron Shapley: Discovering the Responsible Neurons

Amirata Ghorbani, James Zou|arXiv (Cornell University)|Feb 23, 2020

Adversarial Robustness in Machine Learning | References: 52 | Cited by: 61

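One-Sentence Summary

Neuron Shapley introduces a Shapley-value framework that quantifies the contribution of individual neurons in a deep network while accounting for their interactions, and it uses a multi-armed bandit approach to estimate these values efficiently. It identifies a sparse set of critical filters and enables post-training model repair for fairness and robustness without retraining.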

ABSTRACT

We develop Neuron Shapley as a new framework to quantify the contribution of individual neurons to the prediction and performance of a deep network. By accounting for interactions across neurons, Neuron Shapley is more effective in identifying important filters compared to common approaches based on activation patterns. Interestingly, removing just 30 filters with the highest Shapley scores effectively destroys the prediction accuracy of Inception-v3 on ImageNet. Visualization of these few critical filters provides insights into how the network functions. Neuron Shapley is a flexible framework and can be applied to identify responsible neurons in many tasks. We illustrate additional applications of identifying filters that are responsible for biased prediction in facial recognition and filters that are vulnerable to adversarial attacks. Removing these filters is a quick way to repair models. Enabling all these applications is a new multi-arm bandit algorithm that we developed to efficiently estimate Neuron Shapley values.

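Motivation and Objectives

Quantify each neuron's contribution to network performance while accounting for interactions between neurons.
Develop an efficient algorithm to approximate Neuron Shapley values in large-scale networks.
Demonstrate that a small number of neurons can dominate network performance across tasks (such as accuracy, fairness, and robustness).
Show that removing a few critical neurons can repair a model, improving fairness and robustness without retraining.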
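
For reference, the quantity being approximated can be written with the standard Shapley value formula from cooperative game theory. In the notation below, N is the set of n neurons (filters), V(S) is the network's performance when only the neurons in S are kept, and \phi_i is the score assigned to neuron i. The symbols N, n, and \phi_i are notational choices made here for illustration.

```latex
\[
\phi_i \;=\; \frac{1}{n} \sum_{S \subseteq N \setminus \{i\}}
\binom{n-1}{|S|}^{-1} \bigl[ V(S \cup \{i\}) - V(S) \bigr]
\]
```

Because the sum runs over exponentially many subsets S, exact computation is infeasible for networks with thousands of filters, which is why a sampling-based approximation is needed.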

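Proposed Method

Define Neuron Shapley as the unique fair allocation of network performance among neurons, where V(S) denotes the performance of the network after all neurons outside the subset S are removed.
Use Monte Carlo Shapley estimation and introduce a new Truncated Multi-Armed Bandit (TMAB) method to efficiently identify the top-k most influential neurons.
Introduce early truncation and adaptive sampling, which reduce computational cost by exploiting the sparsity of high-impact neurons.
Compare TMAB-Shapley with MC-Shapley, showing substantially better sample efficiency with high fidelity (R² ≈ 0.975, rank correlation ≈ 0.988).
Apply the method to Inception-v3 on ImageNet and SqueezeNet on CelebA, evaluating it on accuracy, fairness, and adversarial vulnerability.
Demonstrate post-training model repair by zeroing out the culprit neurons, improving fairness and robustness without retraining.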
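
The estimation steps listed above can be illustrated with a minimal Python sketch. It is not the authors' TMAB implementation: it is a simplified permutation-sampling estimator that assumes one possible truncation rule (stop walking a permutation once performance has fallen to the "everything removed" level) and a single global stopping rule based on confidence intervals around the top-k boundary, whereas a bandit-style method would typically stop sampling neuron by neuron. The function name `truncated_mc_shapley`, its parameters, and the toy game `toy_value` are all hypothetical. In the toy game the exact Shapley values are a = 2, b = 2, c = 0.5, d = 0, e = 0, because the interaction bonus of 2 is split evenly between a and b.

```python
import math
import random


def truncated_mc_shapley(players, value_fn, k, min_rounds=50, max_rounds=2000,
                         tol=1e-9, z=2.0, seed=0):
    """Estimate Shapley values by permutation sampling with truncation.

    Sampling stops early once every estimate's confidence interval is clear
    of the boundary between the k-th and (k+1)-th ranked players (k < n).
    """
    rng = random.Random(seed)
    players = list(players)
    floor = value_fn(frozenset())            # performance with everything removed
    count = {p: 0 for p in players}
    mean = {p: 0.0 for p in players}
    m2 = {p: 0.0 for p in players}           # running sum of squared deviations

    def update(p, x):                        # Welford's online mean and variance
        count[p] += 1
        delta = x - mean[p]
        mean[p] += delta / count[p]
        m2[p] += delta * (x - mean[p])

    def top_k_resolved():
        ranked = sorted(players, key=mean.get, reverse=True)
        boundary = 0.5 * (mean[ranked[k - 1]] + mean[ranked[k]])
        for p in players:
            var = m2[p] / (count[p] - 1)
            half_width = z * math.sqrt(var / count[p])
            if abs(mean[p] - boundary) <= half_width:
                return False                 # interval still overlaps the boundary
        return True

    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        order = players[:]
        rng.shuffle(order)
        kept = set(players)                  # start from the full network
        v_cur = value_fn(frozenset(kept))
        for i, p in enumerate(order):
            if abs(v_cur - floor) <= tol:    # truncation: performance has collapsed
                for q in order[i:]:
                    update(q, 0.0)           # remaining marginals treated as zero
                break
            kept.remove(p)                   # remove one unit at a time
            v_next = value_fn(frozenset(kept))
            update(p, v_cur - v_next)        # marginal contribution of p
            v_cur = v_next
        if rounds >= max(2, min_rounds) and top_k_resolved():
            break
    return mean, rounds


# Toy "network": additive unit weights plus an interaction bonus between a and b.
WEIGHTS = {"a": 1.0, "b": 1.0, "c": 0.5, "d": 0.0, "e": 0.0}


def toy_value(kept):
    bonus = 2.0 if {"a", "b"} <= kept else 0.0
    return sum(WEIGHTS[p] for p in kept) + bonus


estimates, rounds_used = truncated_mc_shapley(WEIGHTS, toy_value, k=2)
top_two = sorted(estimates, key=estimates.get, reverse=True)[:2]
```

In a real setting `value_fn` would evaluate the network on held-out data with the filters outside `kept` removed, which is by far the most expensive step; truncation and early stopping both aim to reduce how often it is called.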

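Experimental Results

Research Questions

RQ1: Can Neuron Shapley accurately quantify the contribution of individual neurons while capturing interactions between neurons?
RQ2: How can Shapley values be approximated efficiently in large-scale networks to identify a sparse set of critical neurons?
RQ3: Do a small number of neurons drive overall accuracy, fairness, and robustness, and can removing them repair a model without retraining?
RQ4: Can Neuron Shapley reveal class-specific and layer-specific critical neurons, and how do these relate to interpretability?
RQ5: Is Neuron Shapley effective at identifying neurons responsible for bias and adversarial vulnerability?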

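Key Findings

A small, sparse set of neurons (filters) largely determines network performance across tasks.
Removing the 30 filters with the highest Shapley scores effectively destroys the accuracy of Inception-v3 on ImageNet.
Zeroing out the culprit filters improves fairness on bias-prone tasks and reduces adversarial vulnerability without a large loss in accuracy.
Neuron Shapley outperforms other neuron-importance methods (such as Neuron Conductance) at identifying the filters critical to accuracy, fairness, and robustness.
The TMAB-Shapley algorithm reduces the number of samples required by roughly 10× while maintaining high accuracy (R² ≈ 0.975, rank correlation ≈ 0.988).
Class-specific neurons exist: some filters are essential for particular classes, yet removing them has little effect on overall performance.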
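
To make the idea of repair by zeroing concrete, the following is a small sketch in plain Python. It assumes a toy setup that is not from the paper: a convolutional layer's output is a list of channels (each a 2D list), and the classifier head is global average pooling followed by a weighted sum. The names `zero_filters` and `head_score`, the weights, and the activations are all hypothetical. The sketch only shows that zeroing a filter's output changes the prediction without touching any trained weights.

```python
def zero_filters(feature_maps, removed):
    """Return a copy of the feature maps with the listed channels set to zero."""
    return [
        [[0.0 for _ in row] for row in channel] if idx in removed
        else [row[:] for row in channel]
        for idx, channel in enumerate(feature_maps)
    ]


def head_score(feature_maps, weights):
    """Toy head: global average pooling per channel, then a weighted sum."""
    score = 0.0
    for channel, w in zip(feature_maps, weights):
        cells = [x for row in channel for x in row]
        score += w * sum(cells) / len(cells)
    return score


# Three channels of a 2x2 activation map; channel 1 plays the "culprit" filter.
activations = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[0.0, 4.0], [0.0, 4.0]],
]
head_weights = [1.0, -2.0, 0.5]

score_before = head_score(activations, head_weights)   # culprit active
repaired = zero_filters(activations, removed={1})
score_after = head_score(repaired, head_weights)       # culprit zeroed
```

Here the score moves from -2.0 to 2.0 once channel 1 is zeroed, and the original activations are left unmodified. In a real framework the same effect would usually be achieved by masking the chosen channels in the forward pass rather than copying nested lists.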

This review was generated by AI and reviewed by a human editor.