QUICK REVIEW

[Paper Review] Defining Locality for Surrogates in Post-hoc Interpretablity

Thibault Laugel, Xavier Renard|arXiv (Cornell University)|2018. 06. 19.

Explainable Artificial Intelligence (XAI) | References: 5 | Citations: 46

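One-Line Summary

This paper shows that the locality of the sampling critically affects the fidelity of local surrogate explanations, and introduces Local Surrogate (LS), which samples around the decision boundary and improves local explanation fidelity over LIME on several datasets.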

ABSTRACT

Local surrogate models, to approximate the local decision boundary of a black-box classifier, constitute one approach to generate explanations for the rationale behind an individual prediction made by the black-box. This paper highlights the importance of defining the right locality, the neighborhood on which a local surrogate is trained, in order to approximate accurately the local black-box decision boundary. Unfortunately, as shown in this paper, this issue is not only a parameter or sampling distribution challenge and has a major impact on the relevance and quality of the approximation of the local black-box decision boundary and thus on the meaning and accuracy of the generated explanation. To overcome the identified problems, quantified with an adapted measure and procedure, we propose to generate surrogate-based explanations for individual predictions based on a sampling centered on a particular place of the decision boundary, relevant for the prediction to be explained, rather than on the prediction itself as it is classically done. We evaluate the novel approach compared to state-of-the-art methods and a straightforward improvement thereof on four UCI datasets.

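Motivation and Objectives

Explains why locality matters for local surrogate explanations of individual predictions.
Shows that standard sampling (as in LIME) can hide the features that matter locally.
Proposes a sampling strategy focused on the decision boundary to improve local fidelity.
Shows that LS improves local fidelity over LIME on synthetic and real datasets.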

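Proposed Method

Describes the three-step local surrogate process: sampling a training set, fitting an interpretable surrogate, and extracting the explanation.
Analyzes LIME's sampling and weighting scheme and shows that it can emphasize global features over local ones.
Introduces Local Surrogate (LS): GrowingSpheres finds the closest decision boundary point, then the surrogate is trained on samples drawn around that point.
Defines local fidelity as the accuracy of the surrogate s_x with respect to the black box b over a local neighborhood of radius r_fid centered on x.
Compares LS with LIME and LIME-K on synthetic data (half-moons) and four UCI datasets, using Local Fidelity as the evaluation metric.
According to the reported results, LS achieves higher Local Fidelity across the datasets.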
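
The boundary-centered procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `black_box` is a hypothetical 2-D classifier, `find_border_point` is a simplified widening-disc search that stands in for Growing Spheres, the uniform disc sampling and the radius `r_sampling` are assumptions, and the surrogate is a small hand-written logistic regression.

```python
import math
import random


def black_box(p):
    """Hypothetical black box: class 1 inside the unit circle, else class 0."""
    return 1 if p[0] ** 2 + p[1] ** 2 < 1.0 else 0


def sample_disc(center, radius, n, rng):
    """Draw n points uniformly from the 2-D disc of the given radius."""
    points = []
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        dist = radius * math.sqrt(rng.random())  # sqrt keeps the density uniform
        points.append((center[0] + dist * math.cos(angle),
                       center[1] + dist * math.sin(angle)))
    return points


def find_border_point(x, predict, rng, step=0.1, n=200, max_radius=10.0):
    """Simplified stand-in for Growing Spheres (assumption): widen the search
    disc around x until a point of the other class appears, return the closest."""
    label = predict(x)
    radius = step
    while radius <= max_radius:
        others = [p for p in sample_disc(x, radius, n, rng) if predict(p) != label]
        if others:
            return min(others, key=lambda p: math.dist(p, x))
        radius += step
    raise ValueError("no opposite-class point found within max_radius")


def fit_linear_surrogate(points, labels, center, epochs=1000, lr=0.5):
    """Fit logistic regression by gradient descent on coordinates centered
    at `center`; returns the weights w and the intercept b."""
    w, b = [0.0, 0.0], 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for (px, py), y in zip(points, labels):
            dx, dy = px - center[0], py - center[1]
            z = max(-30.0, min(30.0, w[0] * dx + w[1] * dy + b))  # avoid overflow
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad_w[0] += err * dx
            grad_w[1] += err * dy
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def explain(x, predict, rng, r_sampling=0.3, n_samples=300):
    """Boundary-centered local surrogate: sample around the border point, not x."""
    border = find_border_point(x, predict, rng)
    points = sample_disc(border, r_sampling, n_samples, rng)
    labels = [predict(p) for p in points]  # the black box labels the samples
    w, b = fit_linear_surrogate(points, labels, border)
    return border, w, b
```

Calling `explain((0.5, 0.5), black_box, random.Random(0))` returns the border point together with the surrogate's weights. In this toy setup both weights come out negative, since moving away from the origin pushes a point toward class 0; the relative magnitudes of the weights are what would be read as the local explanation.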

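Experimental Results

Research Questions

RQ1: How does the neighborhood definition used to train a local surrogate affect how accurately it approximates the black-box decision boundary?
RQ2: Can sampling directly near the boundary improve the fidelity of local explanations compared with global or instance-centered sampling?
RQ3: Does a local surrogate trained with boundary-centered sampling outperform standard LIME-based approaches across different datasets?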

Key Results

Dataset	LIME	LIME-K	LS
1/2 moons	0.89 (0.07)	0.96 (0.06)	0.97 (0.03)
cancer	0.86 (0.07)	0.87 (0.07)	0.96 (0.02)
credit	0.67 (0.21)	0.70 (0.18)	0.85 (0.12)
news	0.64 (0.10)	0.67 (0.10)	0.79 (0.07)
tennis	0.85 (0.12)	0.83 (0.13)	0.98 (0.02)

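Without sampling that focuses on local features, the local fidelity of surrogate explanations drops.
LIME's standard sampling can produce a surrogate boundary that matches global patterns, fitting the global decision boundary rather than the local one.
The boundary-centered sampling strategy (LS) improves local fidelity and yields more faithful local explanations.
Across the datasets, LS has a higher mean local fidelity (AUC) and a smaller variance than LIME and LIME-K.
In the reported table, LS consistently outperforms the alternatives in local fidelity.
The approach was validated on half-moons and four UCI datasets (Breast Cancer, Default of Credit Card Clients, Online News Popularity, Tennis).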
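
To make the local fidelity scores above more concrete, the sketch below estimates how often a surrogate agrees with the black box inside a disc of radius `r_fid` around the instance. It is an illustration under assumptions rather than the paper's evaluation code: the classifier and both surrogates are hypothetical, the data are 2-D, and the score is a plain agreement rate, whereas the review notes that the reported numbers use AUC.

```python
import math
import random


def local_fidelity(x, black_box, surrogate, r_fid, n=2000, seed=0):
    """Share of points in the disc of radius r_fid around x (2-D only)
    on which the surrogate and the black box predict the same class."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        dist = r_fid * math.sqrt(rng.random())  # uniform over the disc
        p = (x[0] + dist * math.cos(angle), x[1] + dist * math.sin(angle))
        agree += black_box(p) == surrogate(p)
    return agree / n


def black_box(p):
    """Hypothetical black box: class 1 inside the unit circle, else class 0."""
    return 1 if p[0] ** 2 + p[1] ** 2 < 1.0 else 0


def tangent_surrogate(p):
    """Hypothetical linear surrogate aligned with the boundary nearest to x."""
    return 1 if p[0] + p[1] < math.sqrt(2.0) else 0


def misaligned_surrogate(p):
    """Hypothetical linear surrogate that ignores the local boundary direction."""
    return 1 if p[0] < 1.0 else 0


x = (0.6, 0.6)
fid_tangent = local_fidelity(x, black_box, tangent_surrogate, r_fid=0.3)
fid_misaligned = local_fidelity(x, black_box, misaligned_surrogate, r_fid=0.3)
print(fid_tangent, fid_misaligned)
```

Both surrogates are linear, but only the one aligned with the nearby boundary scores close to 1 on this measure, which is the kind of gap the table reports between LS and the LIME variants.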


This review was generated by AI and checked by a human editor.