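[Paper Summary] Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability
Introduces Asymmetric Shapley values (ASVs), which bring causal knowledge into model-agnostic explanations by relaxing the symmetry axiom. This enables causally aware explanations, sequential explanations, fairness analysis, and feature-selection analysis without requiring a complete causal graph.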
Explaining AI systems is fundamental both to the development of high performing models and to the trust placed in them by their users. The Shapley framework for explainability has strength in its general applicability combined with its precise, rigorous foundation: it provides a common, model-agnostic language for AI explainability and uniquely satisfies a set of intuitive mathematical axioms. However, Shapley values are too restrictive in one significant regard: they ignore all causal structure in the data. We introduce a less restrictive framework, Asymmetric Shapley values (ASVs), which are rigorously founded on a set of axioms, applicable to any AI system, and flexible enough to incorporate any causal structure known to be respected by the data. We demonstrate that ASVs can (i) improve model explanations by incorporating causal information, (ii) provide an unambiguous test for unfair discrimination in model predictions, (iii) enable sequentially incremental explanations in time-series models, and (iv) support feature-selection studies without the need for model retraining.
Motivation and Objectives
- Motivate the need to incorporate causal structure into model explainability beyond standard Shapley values.
- Propose a mathematically axiomatic framework (ASVs) that relaxes symmetry to leverage causal information.
- Demonstrate practical applications: causal-aware explanations, fairness testing via unresolved discrimination, time-series sequential explanations, and feature-selection without retraining.
Proposed Method
- Define Asymmetric Shapley values with respect to a weighting over feature-order permutations w(π).
- Show that ASVs satisfy efficiency, linearity, and nullity axioms but not symmetry, enabling causal-informed attributions.
- Use on-manifold value functions to respect data correlations when computing attributions.
- Present distal and proximate causal-ordering strategies to encode known causal structure into explanations.
- Demonstrate the four applications empirically on Census data, synthetic admissions data, EEG time-series, and feature-selection scenarios.
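The method bullets above can be made concrete with a minimal sketch, assuming the simplest weighting w(π): uniform over the feature orderings that respect a known partial causal order, and zero elsewhere. The names `asymmetric_shapley`, `ancestors_of`, and `toy_value` are hypothetical and chosen for illustration; this is not the authors' implementation. The toy value function models two perfectly correlated features where feature 0 causes feature 1, so knowing either one already determines the prediction.

```python
from itertools import permutations


def asymmetric_shapley(n_features, value_fn, ancestors_of=None):
    """Exact ASVs by enumerating feature orderings (feasible only for small n).

    ancestors_of maps a feature index to the set of features that must come
    before it in every allowed ordering. Allowed orderings share equal weight;
    all others get weight zero. With no constraints this reduces to the
    standard (symmetric) Shapley values.
    """
    ancestors_of = ancestors_of or {}

    def allowed(order):
        position = {f: k for k, f in enumerate(order)}
        return all(position[a] < position[f]
                   for f, ancs in ancestors_of.items() for a in ancs)

    orders = [o for o in permutations(range(n_features)) if allowed(o)]
    if not orders:
        raise ValueError("ordering constraints are contradictory")

    phi = [0.0] * n_features
    for order in orders:
        coalition = frozenset()
        for f in order:
            grown = coalition | {f}
            # Marginal contribution of f when it joins the features before it.
            phi[f] += (value_fn(grown) - value_fn(coalition)) / len(orders)
            coalition = grown
    return phi


def toy_value(coalition):
    # Hypothetical on-manifold value: features 0 and 1 are perfectly
    # correlated, so any non-empty coalition fully determines the output.
    return 1.0 if coalition else 0.0


symmetric = asymmetric_shapley(2, toy_value)             # no causal knowledge
distal = asymmetric_shapley(2, toy_value, {1: {0}})      # ancestor 0 first
proximate = asymmetric_shapley(2, toy_value, {0: {1}})   # descendant 1 first
print(symmetric, distal, proximate)
```

Under these assumptions the symmetric values split the credit evenly, the ancestors-first (distal) ordering gives all credit to the root cause, and the descendants-first (proximate) ordering gives it to the immediate cause. In every case the attributions sum to `toy_value({0, 1}) - toy_value(set())`, which is the efficiency axiom.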
Experimental Results
Research Questions
- RQ1: How can Shapley-based explanations be extended to incorporate partial or full causal knowledge without requiring a complete causal graph?
- RQ2: Can ASVs detect and quantify causal notions of unfairness (unresolved discrimination) in model predictions?
- RQ3: Do ASVs yield sequence-aware explanations for time-series data and enable sparse, early-focused attributions?
- RQ4: Can ASVs provide a precise interpretation of feature usefulness for subset-based feature selection without retraining models?
Key Findings
- ASVs provide explanations that respect known causal orderings and can attribute model accuracy to causal ancestors before descendants.
- ASVs can reveal subtle fairness issues by measuring the incremental effect of sensitive attributes after accounting for resolving variables.
- ASVs produce sparser, time-sequence aware attributions that concentrate importance early in time-series data, unlike standard Shapley values.
- ASVs can quantify the accuracy achievable using a subset of features, supporting feature-selection without retraining multiple models.
- Empirical examples show distinct attributions when incorporating distal (root) vs proximate (immediate) causal notions.
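The fairness finding above can be illustrated with a small sketch, assuming a two-feature setting with one sensitive attribute and one resolving variable, and an on-manifold value function estimated as an empirical conditional mean. The dataset `rows`, the two models, and the helper names are all invented for illustration; none of them come from the paper's experiments. Placing the resolving variable before the sensitive attribute means the sensitive attribute's ASV is its incremental effect once the resolving variable is already known.

```python
def conditional_mean(model, rows, known):
    """On-manifold value: mean model output over rows that agree with `known`.

    `known` maps a feature index to its observed value; an empty dict
    averages over the whole dataset.
    """
    matches = [r for r in rows if all(r[i] == v for i, v in known.items())]
    return sum(model(r) for r in matches) / len(matches)


def asv_sensitive(model, rows, x, sensitive=0, resolving=1):
    """ASV of the sensitive attribute when the resolving variable comes first."""
    v_r = conditional_mean(model, rows, {resolving: x[resolving]})
    v_ra = conditional_mean(
        model, rows, {resolving: x[resolving], sensitive: x[sensitive]})
    return v_ra - v_r


def shapley_sensitive(model, rows, x, sensitive=0, resolving=1):
    """Symmetric Shapley value of the sensitive attribute (both orderings)."""
    v_empty = conditional_mean(model, rows, {})
    v_a = conditional_mean(model, rows, {sensitive: x[sensitive]})
    first = v_a - v_empty                               # sensitive joins first
    last = asv_sensitive(model, rows, x, sensitive, resolving)  # joins last
    return 0.5 * (first + last)


# Hypothetical data: (sensitive attribute a, resolving variable r), correlated.
rows = [(0, 0), (0, 0), (0, 1), (1, 0), (1, 1), (1, 1)]


def fair_model(row):
    return row[1]                   # depends on r only


def biased_model(row):
    return row[1] + 0.5 * row[0]    # also uses a directly


x = (1, 1)
print(asv_sensitive(fair_model, rows, x))      # zero: no unresolved effect
print(asv_sensitive(biased_model, rows, x))    # non-zero: direct use of a
print(shapley_sensitive(fair_model, rows, x))  # non-zero purely via correlation
```

In this toy setup the symmetric Shapley value assigns credit to the sensitive attribute even for the model that never reads it, because the attribute is correlated with the resolving variable. The asymmetric ordering removes that ambiguity: the attribution is zero for the model that depends only on the resolving variable and non-zero for the one that uses the sensitive attribute directly.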
This summary was generated by AI and reviewed by a human editor.