QUICK REVIEW

[Paper Review] Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era

Xuansheng Wu, Haiyan Zhao|arXiv (Cornell University)|Mar 13, 2024

Scientific Computing and Data Management | 18 citations

TL;DR

The paper defines Usable XAI for LLMs, proposing 10 strategies to use explanations to improve LLMs and to let LLMs enhance XAI, supported by case studies and open-source code.

ABSTRACT

Explainable AI (XAI) refers to techniques that provide human-understandable insights into the workings of AI models. Recently, the focus of XAI is being extended toward explaining Large Language Models (LLMs). This extension calls for a significant transformation in the XAI methodologies for two reasons. First, many existing XAI methods cannot be directly applied to LLMs due to their complexity and advanced capabilities. Second, as LLMs are increasingly deployed in diverse applications, the role of XAI shifts from merely opening the "black box" to actively enhancing the productivity and applicability of LLMs in real-world settings. Meanwhile, the conversation and generation abilities of LLMs can reciprocally enhance XAI. Therefore, in this paper, we introduce Usable XAI in the context of LLMs by analyzing (1) how XAI can explain and improve LLM-based AI systems and (2) how XAI techniques can be improved by using LLMs. We introduce 10 strategies, introducing the key techniques for each and discussing their associated challenges. We also provide case studies to demonstrate how to obtain and leverage explanations. The code used in this paper can be found at: https://github.com/JacksonWuxs/UsableXAI_LLM.

Motivation & Objective

Define Usable XAI in the context of LLMs and distinguish two directions: using explanations to improve LLMs/AI systems and using LLMs to improve XAI frameworks.
Propose 10 strategies organized into two categories: Usable XAI for LLMs and LLM for Usable XAI.
Provide case studies demonstrating key techniques and discuss open challenges and future directions.
Release open-source code to foster applying explanations in the LLM context.
Survey and synthesize attribution, component interpretation, prompt engineering, knowledge augmentation, data augmentation, user-friendly explanations, and system design for XAI with LLMs.

Proposed method

Review attribution methods and assess their suitability for LLMs and generation tasks.
Analyze LLM internals (self-attention and feed-forward modules) for interpretability.
Develop sample-based explanations and EK-FAC-style influence estimation for debugging.
Examine explainability for trustworthiness (security, privacy, fairness, toxicity, truthfulness) and human alignment.
Explore explainable prompting (chain-of-thought and extensions) and knowledge-augmented prompting.
Discuss data augmentation with explanations and explanation-guided data enrichment.
Design user-friendly explanations with LLMs and automate interpretable AI workflows with LLM agents.
Consider LLMs for emulating human annotators and feedback in XAI training and evaluation.
Provide case studies and open-source code to illustrate practical usability.
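
To make the attribution idea in the list above concrete, here is a minimal sketch of leave-one-out (occlusion) attribution, one common perturbation-based method. It is not taken from the paper's code: `toy_score`, its word weights, and the `[MASK]` placeholder are hypothetical stand-ins for a real model's output score, chosen only so the example runs without any dependencies.

```python
from typing import Callable, List, Tuple


def toy_score(tokens: List[str]) -> float:
    """Hypothetical stand-in for a model's output score (not an LLM)."""
    weights = {"excellent": 2.0, "good": 1.0, "bad": -1.5}
    return sum(weights.get(tok.lower(), 0.0) for tok in tokens)


def occlusion_attribution(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
    mask_token: str = "[MASK]",
) -> List[Tuple[str, float]]:
    """Attribute the score to each token by masking it and measuring the drop."""
    base = score_fn(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        # Replace exactly one token with the mask and re-score the input.
        occluded = tokens[:i] + [mask_token] + tokens[i + 1:]
        attributions.append((tok, base - score_fn(occluded)))
    return attributions


tokens = "The food was excellent but service was bad".split()
scores = occlusion_attribution(tokens, toy_score)
for tok, value in scores:
    print(f"{tok:>10s}  {value:+.2f}")
```

With a real LLM, `score_fn` would typically be the log-probability of the generated response given the perturbed prompt. Each token then costs one extra forward pass, which is one reason methods like this need to be reassessed for long prompts and generation tasks.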

Figure 1: The contributions and outline of this paper. We define Usable XAI in the context of LLMs with seven strategies of enhancing LLMs with XAI, and three strategies of enhancing XAI with LLMs.

Experimental results

Research questions

RQ1: How can XAI explanations be used to diagnose, debug, and improve LLMs and broader AI systems?
RQ2: How can LLMs contribute to advancing XAI frameworks and increase the usability of explanations for practitioners?
RQ3: What practical techniques (attribution, component interpretation, sample-based explanation, prompting, knowledge augmentation) prove effective in LLM contexts?
RQ4: What are the key challenges and future directions in making XAI usable in the LLM era?

Key findings

Attribution-based explanations can be used to evaluate LLM response quality and detect hallucinations, with empirical results showing performance competitive with baselines in some settings.
Interpreting LLM components (self-attention and feed-forward modules) yields insights for model design and prompting strategies.
Explainable prompting (chain-of-thought and knowledge-augmented prompts) can affect how controllable inference and decision-making are, according to the paper's case-study observations.
Data augmentation and training data enrichment guided by explanations can mitigate shortcuts and align models with human preferences.
LLMs can enhance XAI usability by generating user-friendly explanations, automating interpretable AI workflows, and supporting evaluation by emulating human-like cognition.
The work provides open-source code to enable replication and further development.

Figure 3: A general pipeline of model diagnosis with attribution explanations.
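
The review does not reproduce the steps of the Figure 3 pipeline, so the following is only an illustrative heuristic in the same spirit, not the paper's method. It assumes token-level attribution scores have already been computed and labeled by prompt segment, then flags a response when too little attribution mass falls on the supplied context. The segment names, the `min_share` threshold, and both function names are hypothetical.

```python
from typing import Dict, List, Tuple


def segment_shares(attributions: List[Tuple[str, float]]) -> Dict[str, float]:
    """Return each segment's share of the total absolute attribution."""
    totals: Dict[str, float] = {}
    for segment, score in attributions:
        totals[segment] = totals.get(segment, 0.0) + abs(score)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {segment: 0.0 for segment in totals}
    return {segment: value / grand_total for segment, value in totals.items()}


def flag_ungrounded(
    attributions: List[Tuple[str, float]],
    segment: str = "context",
    min_share: float = 0.5,  # assumed threshold, would need tuning on real data
) -> bool:
    """Flag a response whose attribution barely touches the given segment."""
    return segment_shares(attributions).get(segment, 0.0) < min_share


# Hypothetical (segment, attribution) pairs for two responses.
grounded = [("context", 0.6), ("context", 0.3), ("question", 0.1)]
ungrounded = [("context", 0.05), ("question", 0.5), ("instruction", 0.45)]

print(flag_ungrounded(grounded))
print(flag_ungrounded(ungrounded))
```

A fixed threshold is the simplest possible decision rule. A practical diagnosis setup could instead feed the per-segment shares to a small classifier trained on labeled examples.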

This review was created by AI and reviewed by human editors.