QUICK REVIEW

[Paper Review] Editable XAI: Toward Bidirectional Human-AI Alignment with Co-Editable Explanations of Interpretable Attributes

Haoyang Chen, Jingwen Bai | arXiv (Cornell University) | 2026. 02. 13.

Explainable Artificial Intelligence (XAI) | Citations: 0

One-Line Summary

The paper introduces Editable XAI and CoExplain, a neurosymbolic framework that allows users to read, write, and enhance rule-based explanations to achieve bidirectional human-AI alignment with neural networks explained by decision trees.

ABSTRACT

While Explainable AI (XAI) helps users understand AI decisions, misalignment in domain knowledge can lead to disagreement. This inconsistency hinders understanding, and because explanations are often read-only, users lack the control to improve alignment. We propose making XAI editable, allowing users to write rules to improve control and gain deeper understanding through the generation effect of active learning. We developed CoExplain, leveraging a neural network for universal representation and symbolic rules for intuitive reasoning on interpretable attributes. CoExplain explains the neural network with a faithful proxy decision tree, parses user-written rules as an equivalent neural network graph, and collaboratively optimizes the decision tree. In a user study (N=43), CoExplain and manually editable XAI improved user understanding and model alignment compared to read-only XAI. CoExplain was easier to use with fewer edits and less time. This work contributes Editable XAI for bidirectional AI alignment, improving understanding and control.

Research Motivation and Objectives

Motivate and define Editable XAI to overcome read-only explanations and domain misalignment between users and AI.
Identify design requirements from user elicitation studies for editable explanations.
Develop CoExplain to support read, write, and enhance interactions through neurosymbolic methods.
Demonstrate that editable explanations improve user understanding and model alignment compared with non-editable baselines.

Proposed Method

Use neural networks as the underlying predictive model for expressivity and trainability.
Explain the model with a faithful proxy decision tree through distillation (Read mode).
Parse user-written decision-tree rules into an equivalent neural network (Write mode).
Enhance explanations by auto-tuning thresholds and reorganizing tree topology via training with regularization (Enhance mode).
Maintain equivalence between decision trees and neural networks via distillation, parsing, backpropagation, and regularization.
Evaluate with a user study comparing CoExplain to Read-only XAI and manually editable XAI.
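
The Write and Enhance steps above can be illustrated with a toy example. This is a minimal sketch and not the authors' implementation: this review does not give CoExplain's actual formulation, so the sigmoid relaxation, the `temperature` parameter, the squared penalty toward the user's threshold, and the names `soft_rule` and `enhance_threshold` are all assumptions made for illustration. It shows how a single user-written rule `x > threshold` could be expressed as a differentiable unit, and how gradient descent with regularization could then nudge the threshold toward the data while staying close to what the user wrote.

```python
import math

def soft_rule(x, threshold, temperature=0.5):
    """Differentiable stand-in for the hard rule `x > threshold` (assumed relaxation)."""
    return 1.0 / (1.0 + math.exp(-(x - threshold) / temperature))

def hard_accuracy(xs, ys, threshold):
    """Accuracy of the original hard rule `x > threshold` on labeled data."""
    hits = sum(1 for x, y in zip(xs, ys) if (x > threshold) == bool(y))
    return hits / len(xs)

def enhance_threshold(xs, ys, user_threshold, reg=0.01, lr=0.5,
                      steps=500, temperature=0.5):
    """Tune the threshold by gradient descent on mean log loss.

    The term `reg * (t - user_threshold)**2` is a hypothetical regularizer
    that penalizes drifting away from the user's original rule.
    """
    t = user_threshold
    for _ in range(steps):
        grad = 0.0
        for x, y in zip(xs, ys):
            p = soft_rule(x, t, temperature)
            # d(log loss)/dt = (p - y) * d(logit)/dt, with logit = (x - t) / temperature
            grad += (p - y) * (-1.0 / temperature)
        grad = grad / len(xs) + 2.0 * reg * (t - user_threshold)
        t -= lr * grad
    return t

# Toy data: the label flips between x = 5 and x = 6.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 0, 1, 1, 1]

user_threshold = 4.0  # the user's initial rule: "x > 4"
tuned = enhance_threshold(xs, ys, user_threshold)

print("user rule accuracy: ", hard_accuracy(xs, ys, user_threshold))
print("tuned threshold:    ", round(tuned, 2))
print("tuned rule accuracy:", hard_accuracy(xs, ys, tuned))
```

In this sketch a larger `reg` keeps the tuned threshold closer to the user's value at the cost of accuracy, which mirrors the trade-off between preserving user intent and improving model performance described in this review. A full system would apply the same idea to every split in the tree and would also need a mechanism for reorganizing tree topology, which this example does not attempt.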

Experimental Results

Research Questions

RQ1: How can explanations be made editable to improve alignment between user domain knowledge and AI reasoning?
RQ2: Can a neurosymbolic framework support bidirectional interaction (read, write, enhance) between users and AI?
RQ3: Do editable explanations improve user understanding and model alignment compared with traditional read-only explanations?
RQ4: What design guidelines emerge from user studies for Editable XAI systems?
RQ5: What mechanisms best preserve user intent while improving model performance through automated enhancements?

Key Findings

Editable explanations improve user understanding and model alignment compared with read-only explanations in a user study (N=43).
Participants preferred user-written rules that reflected their domain knowledge, even when not always the most accurate.
CoExplain balances aligning with initial user knowledge and achieving near-optimal model performance through AI-assisted enhancements.
On-demand enhancements reduce editing effort by refining thresholds and topology, saving users time.
Users viewed the collaborative, bidirectional nature of read/write/enhance as easier to use and more engaging than static explanations.

This review was generated by AI and checked by a human editor.