QUICK REVIEW

[Paper Review] PolySHAP: Extending KernelSHAP with Interaction-Informed Polynomial Regression

Fabian Fumagalli, R. Teal Witter | arXiv (Cornell University) | 2026. 01. 26.

Explainable Artificial Intelligence (XAI) | Citations: 0

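One-Line Summary

PolySHAP fits a polynomial approximation to the Shapley game so that feature interactions are captured, which improves Shapley value estimates. It comes with a consistency guarantee and a theoretical link between paired sampling in KernelSHAP and second-order PolySHAP.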

ABSTRACT

Shapley values have emerged as a central game-theoretic tool in explainable AI (XAI). However, computing Shapley values exactly requires $2^d$ game evaluations for a model with $d$ features. Lundberg and Lee's KernelSHAP algorithm has emerged as a leading method for avoiding this exponential cost. KernelSHAP approximates Shapley values by approximating the game as a linear function, which is fit using a small number of game evaluations for random feature subsets. In this work, we extend KernelSHAP by approximating the game via higher degree polynomials, which capture non-linear interactions between features. Our resulting PolySHAP method yields empirically better Shapley value estimates for various benchmark datasets, and we prove that these estimates are consistent. Moreover, we connect our approach to paired sampling (antithetic sampling), a ubiquitous modification to KernelSHAP that improves empirical accuracy. We prove that paired sampling outputs exactly the same Shapley value approximations as second-order PolySHAP, without ever fitting a degree 2 polynomial. To the best of our knowledge, this finding provides the first strong theoretical justification for the excellent practical performance of the paired sampling heuristic.

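Motivation and Objectives

Improve Shapley value estimation beyond KernelSHAP's linear approximation by accounting for interactions between features.
Propose PolySHAP, a polynomial regression framework that approximates the Shapley game using a selected set of interaction terms.
Provide theoretical guarantees, including consistency of the PolySHAP estimates with respect to the true Shapley values.
Connect paired KernelSHAP to second-order PolySHAP to explain the empirical accuracy gains of paired sampling.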

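Proposed Method

Represents the explanation game as a polynomial in binary feature-inclusion indicators, with the interaction terms specified by an interaction frontier I.
Solves a regularized least-squares problem over d' = d + |I| coefficients to approximate the game (Eq. (4)).
Shows that the Shapley values can be recovered from the PolySHAP coefficients through a mapping (Theorem 4.3).
Uses leverage-score-based sampling to select game evaluations and discusses efficient construction of the regression design matrix (Section 4.3).
Defines k-additive and partial interaction frontiers to control model complexity for a given budget m (Section 4.4).
Establishes a theoretical equivalence between paired KernelSHAP and second-order PolySHAP (Theorem 5.1).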
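
The fit-then-recover idea in the steps above can be illustrated on a toy game. The sketch below is not the paper's Eq. (4) or its Theorem 4.3 mapping, neither of which is reproduced in this review. It assumes an unweighted least-squares fit over all 2^d coalitions, a game that lies exactly in the chosen polynomial class, and the standard fact that a product term over a feature set T contributes its coefficient divided by |T| to the Shapley value of each feature in T. All function names (`toy_game`, `fit_interaction_polynomial`, `shapley_from_coefficients`) and the example frontier are hypothetical.

```python
import itertools
import math
import numpy as np

def toy_game(S):
    """Hypothetical 4-feature game with two pairwise interactions."""
    value = 2.0 * (0 in S) - 1.0 * (1 in S) + 0.5 * (2 in S)
    value += 3.0 * ({0, 1} <= S) - 2.0 * ({1, 3} <= S)
    return value

def exact_shapley(game, d):
    """Brute-force Shapley values over all coalitions (small d only)."""
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for S in itertools.combinations(others, r):
                phi[i] += weight * (game(set(S) | {i}) - game(set(S)))
    return phi

def fit_interaction_polynomial(game, d, frontier):
    """Unweighted least squares with intercept, singleton, and frontier terms."""
    terms = [()] + [(i,) for i in range(d)] + [tuple(sorted(T)) for T in frontier]
    rows, targets = [], []
    for r in range(d + 1):
        for S in itertools.combinations(range(d), r):
            members = set(S)
            # A term is active when all of its features are in the coalition.
            rows.append([float(set(T) <= members) for T in terms])
            targets.append(game(members))
    coef, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return dict(zip(terms, coef))

def shapley_from_coefficients(coef, d):
    """Split each term's coefficient equally among the features in that term."""
    phi = np.zeros(d)
    for T, a in coef.items():
        for i in T:
            phi[i] += a / len(T)
    return phi

d = 4
frontier = [(0, 1), (1, 3)]  # assumed interaction frontier for this toy game
coef = fit_interaction_polynomial(toy_game, d, frontier)
phi_poly = shapley_from_coefficients(coef, d)
phi_exact = exact_shapley(toy_game, d)
print(phi_poly, phi_exact)
```

Because the toy game contains only the interactions listed in the frontier, the fit is exact and the recovered values agree with the brute-force ones. With a sampled budget or a game outside the model class, the agreement would only be approximate.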

Figure 1: Both KernelSHAP and PolySHAP fit a function to approximate a sample of game evaluations. While KernelSHAP uses a linear approximation, PolySHAP uses a more expressive polynomial approximation. Finally, both algorithms return the Shapley values (SV) of their respective approximations (trivi

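Experimental Results

Research Questions

RQ1: Can higher-order interaction terms in the polynomial approximation improve Shapley value estimates over KernelSHAP?
RQ2: What theoretical guarantees (consistency) hold when Shapley values are recovered with PolySHAP?
RQ3: How does paired sampling relate to PolySHAP, and does this relationship explain its accuracy gains across interaction structures?
RQ4: How should the interaction frontier be chosen for a given sampling budget?
RQ5: How do the practical variants (paired sampling, partial frontiers) affect performance across data domains such as tabular, image, and text?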

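Key Findings

PolySHAP yields more accurate Shapley value estimates than KernelSHAP and other baselines on the benchmark datasets.
The PolySHAP estimates converge to the true Shapley values as the sampling budget m approaches 2^d (consistency).
Paired KernelSHAP produces exactly the same Shapley approximations as second-order PolySHAP, which gives a theoretical justification for the practical gains of paired sampling.
Including higher-order interactions in PolySHAP improves approximation quality, especially when combined with paired sampling; when the budget allows, partially including an interaction order also yields substantial gains.
k-PolySHAP corresponds to Faith-SHAP up to order k, and 1-PolySHAP reduces to KernelSHAP as a special case.
The experiments suggest diminishing returns in very high dimensions, and that the budget must be large enough to support modeling additional interactions.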
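
The paired-sampling finding can be checked numerically on a small example. The sketch below is a simplified KernelSHAP-style estimator, not the authors' implementation: it assumes coalition sizes are drawn with probability proportional to 1/(s(d-s)) so that the regression itself is unweighted, and it enforces the efficiency constraint by solving the KKT system directly. The game is a hypothetical one with only main effects and pairwise interactions, for which the Shapley values have the closed form a_i plus half of the interaction coefficients involving feature i. All names (`sample_coalitions`, `kernel_shap`, `game`) are illustrative.

```python
import numpy as np

def sample_coalitions(d, n_draws, rng, paired):
    """Draw coalitions by size with p(s) proportional to 1/(s(d-s))."""
    sizes = np.arange(1, d)
    probs = 1.0 / (sizes * (d - sizes))
    probs = probs / probs.sum()
    rows = []
    for s in rng.choice(sizes, size=n_draws, p=probs):
        z = np.zeros(d)
        z[rng.choice(d, size=s, replace=False)] = 1.0
        rows.append(z)
        if paired:
            rows.append(1.0 - z)  # also evaluate the complement coalition
    return np.array(rows)

def kernel_shap(game, Z, d):
    """Linear fit subject to sum(phi) = v(all) - v(empty), solved via KKT."""
    v_empty = game(np.zeros(d))
    total = game(np.ones(d)) - v_empty
    y = np.array([game(z) for z in Z]) - v_empty
    A = Z.T @ Z
    b = Z.T @ y
    kkt = np.block([[A, np.ones((d, 1))], [np.ones((1, d)), np.zeros((1, 1))]])
    rhs = np.concatenate([b, [total]])
    return np.linalg.solve(kkt, rhs)[:d]

rng = np.random.default_rng(0)
d = 6
a = rng.normal(size=d)                        # main effects (assumed)
B = np.triu(rng.normal(size=(d, d)), k=1)     # pairwise interactions (assumed)

def game(z):
    """Hypothetical second-order game on a 0/1 inclusion vector z."""
    return float(a @ z + z @ B @ z)

# Closed-form Shapley values for a game with at most pairwise interactions.
phi_true = a + 0.5 * (B + B.T).sum(axis=1)

# Both estimators use 60 game evaluations.
phi_paired = kernel_shap(game, sample_coalitions(d, 30, rng, paired=True), d)
phi_unpaired = kernel_shap(game, sample_coalitions(d, 60, rng, paired=False), d)

print("paired max error:  ", np.abs(phi_paired - phi_true).max())
print("unpaired max error:", np.abs(phi_unpaired - phi_true).max())
```

In this setup the paired estimate matches the closed-form values up to floating-point error, because the residual of a second-order game around its Shapley values is identical for a coalition and its complement. This is consistent with the equivalence to second-order PolySHAP stated above; for games with higher-order interactions the paired estimate would no longer be exact.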

Figure 2: Approximation quality measured by MSE ( $\pm$ SEM) for various sampling budgets $m$ on different games. Adding any number of interactions in PolySHAP improves approximation quality.


This review was generated by AI and reviewed by a human editor.