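[Paper Summary] Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models
This paper formalizes context-independent phrase importance as the key quantity in hierarchical explanations of neural sequence models, and introduces two algorithms, SCD and SOC, that outperform prior methods at revealing the compositional semantics captured by LSTM and BERT models.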
The impressive performance of neural networks on natural language processing tasks is attributed to their ability to model complicated word and phrase compositions. To explain how the model handles semantic compositions, we study hierarchical explanation of neural network predictions. We identify non-additivity and context independent importance attributions within hierarchies as two desirable properties for highlighting word and phrase compositions. We show some prior efforts on hierarchical explanations, e.g. contextual decomposition, do not satisfy the desired properties mathematically, leading to inconsistent explanation quality in different models. In this paper, we start by proposing a formal and general way to quantify the importance of each word and phrase. Following the formulation, we propose Sampling and Contextual Decomposition (SCD) algorithm and Sampling and Occlusion (SOC) algorithm. Human and metrics evaluation on both LSTM models and BERT Transformer models on multiple datasets show that our algorithms outperform prior hierarchical explanation algorithms. Our algorithms help to visualize semantic composition captured by models, extract classification rules and improve human trust of models. Project page: https://inklab.usc.edu/hiexpl/
Research Motivation and Objectives
- Motivate the need for hierarchical, non-additive explanations of semantic composition in neural sequence models.
- Propose a formal measure of context-independent phrase importance across N-context windows.
- Develop two algorithms (SCD and SOC) that operationalize the measure for practical explanations.
- Evaluate on LSTM and BERT models across sentiment analysis and relation extraction tasks, showing improvement over baselines.
- Demonstrate usefulness in visualization, rule extraction, and human trust in model predictions.
Proposed Method
- Define N-context independent importance of a phrase as the expectation of prediction differences when the phrase is masked, averaged over surrounding contexts (Eq. 3/4).
- Identify non-additivity and context-independence as desirable properties for hierarchical explanations (Section 3.1).
- Propose Sampling and Contextual Decomposition (SCD) by modulating CD’s activation decomposition to satisfy context-independence (Eq. 5).
- Propose Sampling and Occlusion (SOC) as a simple, model-agnostic alternative using context sampling and phrase masking (Eq. 8).
- Implement context sampling via a pretrained bidirectional language model to generate surrounding contexts (Section 3.3, 3.4).
- Evaluate against baselines (Input Occlusion, Direct Feed, GradSHAP, CD, ACD) on SST-2, Yelp, and TACRED datasets (Section 4).
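The sampling-and-occlusion idea in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `soc_importance`, `toy_score`, `toy_sampler`, `PAD`, `LEXICON`, and `VOCAB` are hypothetical names, the scorer is a toy additive lexicon model rather than an LSTM or BERT classifier, and the uniform sampler stands in for the pretrained language model that the paper uses to generate surrounding contexts.

```python
import random
from typing import Callable, List, Sequence

PAD = "<pad>"  # hypothetical padding token used to mask the phrase


def soc_importance(
    tokens: Sequence[str],
    start: int,
    end: int,
    score_fn: Callable[[List[str]], float],
    sample_context_fn: Callable[[List[str], int], str],
    n_context: int = 2,
    n_samples: int = 20,
) -> float:
    """Estimate the importance of tokens[start:end] by averaging, over
    sampled N-word contexts, the score drop when the phrase is masked."""
    lo = max(0, start - n_context)
    hi = min(len(tokens), end + n_context)
    context_positions = list(range(lo, start)) + list(range(end, hi))

    total = 0.0
    for _ in range(n_samples):
        # Resample the words within n_context positions of the phrase.
        sampled = list(tokens)
        for i in context_positions:
            sampled[i] = sample_context_fn(sampled, i)
        # Mask the phrase itself with padding tokens.
        masked = list(sampled)
        masked[start:end] = [PAD] * (end - start)
        total += score_fn(sampled) - score_fn(masked)
    return total / n_samples


# Toy stand-ins (assumptions for illustration only).
LEXICON = {"good": 1.0, "bad": -1.0, "not": -0.5}
VOCAB = ["the", "a", "film", "plot", "good", "bad", "was", "is"]
_rng = random.Random(0)


def toy_score(tokens: List[str]) -> float:
    """Additive lexicon scorer; a real model would be non-additive."""
    return sum(LEXICON.get(t, 0.0) for t in tokens)


def toy_sampler(tokens: List[str], position: int) -> str:
    """Uniform sampler standing in for a language model."""
    return _rng.choice(VOCAB)


if __name__ == "__main__":
    sentence = "the movie was not bad at all".split()
    print(soc_importance(sentence, 3, 5, toy_score, toy_sampler))
```

Because the toy scorer is purely additive, the estimate for "not bad" equals the sum of its lexicon weights whatever contexts are sampled. With a real non-additive model the sampled contexts would matter, which is the case the expectation over contexts is meant to handle.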
Experimental Results
Research Questions
- RQ1: How can we quantify the context-independent importance of phrases in neural sequence models?
- RQ2: Do hierarchical explanations that respect non-additivity and context-independence provide more faithful visualizations of compositional semantics than prior methods?
- RQ3: Do SCD and SOC provide better alignment with human judgments and ground-truth phrase-level annotations compared to CD/ACD and other baselines?
- RQ4: Can these explanations assist in extracting classification rules and increasing human trust in model predictions?
- RQ5: Are the proposed methods effective across both LSTM and Transformer architectures and across multiple NLP tasks?
Key Findings
- Context-independent phrase importance can be quantified as an expectation over surrounding contexts of the prediction difference when masking the phrase (Eq. 3/4).
- SCD and SOC consistently outperform CD, ACD, and baselines on word/phrase correlation with ground-truth annotations across SST-2, Yelp, and TACRED.
- SOC and SCD achieve higher word ρ and phrase ρ scores than competitors, especially in Transformer models.
- Human evaluation shows that SOC and SCD explanations yield higher trust in model predictions than GradSHAP, ACD, and CD on SST-2 and TACRED.
- The methods enable hierarchical visualization of compositional semantics, rule extraction, and improved interpretability without sacrificing predictive performance.
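The word ρ and phrase ρ scores mentioned above compare attribution scores against reference importance values. A minimal sketch of such a comparison follows, assuming ρ denotes the Pearson correlation coefficient (this summary does not state it explicitly); `pearson_r` and the sample numbers are hypothetical and are not results from the paper.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up attribution scores and reference values for four phrases.
    attributions = [0.8, -0.6, 0.1, -0.9]
    reference = [1.0, -1.0, 0.0, -1.0]
    print(round(pearson_r(attributions, reference), 3))
```

A value near 1 would indicate that the explanation ranks words or phrases much as the reference does; a value near 0 would indicate little agreement.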
This summary was generated by AI and reviewed by human editors.