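[Paper Summary] Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
A comprehensive survey of the interpretability and explainability of multimodal large language models (MLLMs). It organizes existing methods under three perspectives (data, model, and training/inference) and outlines future research directions.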
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with large language models (LLMs) and computer vision (CV) systems driving advancements in natural language understanding and visual processing, respectively. The convergence of these technologies has catalyzed the rise of multimodal AI, enabling richer, cross-modal understanding that spans text, vision, audio, and video modalities. Multimodal large language models (MLLMs), in particular, have emerged as a powerful framework, demonstrating impressive capabilities in tasks like image-text generation, visual question answering, and cross-modal retrieval. Despite these advancements, the complexity and scale of MLLMs introduce significant challenges in interpretability and explainability, essential for establishing transparency, trustworthiness, and reliability in high-stakes applications. This paper provides a comprehensive survey on the interpretability and explainability of MLLMs, proposing a novel framework that categorizes existing research across three perspectives: (I) Data, (II) Model, (III) Training & Inference. We systematically analyze interpretability from token-level to embedding-level representations, assess approaches related to both architecture analysis and design, and explore training and inference strategies that enhance transparency. By comparing various methodologies, we identify their strengths and limitations and propose future research directions to address unresolved challenges in multimodal explainability. This survey offers a foundational resource for advancing interpretability and transparency in MLLMs, guiding researchers and practitioners toward developing more accountable and robust multimodal AI systems.
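Research Motivation and Objectives
- Survey and categorize existing research on the interpretability and explainability of MLLMs from three perspectives: data, model, and training and inference.
- Analyze interpretability from token-level to embedding-level representations, and across both architecture analysis and architecture design.
- Identify strengths, limitations, and future research directions for improving the transparency and robustness of MLLMs.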
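Proposed Method
- Conducts a literature review of papers from 2010–2024 on the interpretability and explainability of MLLMs.
- Proposes a novel three-perspective framework (data, model, and training and inference) for categorizing methods.
- Provides a taxonomy covering input-output analysis, embeddings, neurons, layers, and architectural concepts, along with a discussion of benchmarks.
- Compares methods and identifies their strengths and limitations.
- Discusses future research directions.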
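Results
Research Questions
- RQ1: How can MLLMs be explained and understood along the data, model, and training/inference dimensions?
- RQ2: Which token-level, embedding-level, neuron-level, and layer-level insights reveal the cross-modal decision-making process of MLLMs?
- RQ3: What are the key benchmarks, datasets, and evaluation frameworks for multimodal interpretability and robustness?
- RQ4: What are the main limitations of current methods, and which directions are promising for improving the transparency and trustworthiness of MLLMs?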
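Key Findings
- The interpretability of MLLMs is best understood through three perspectives: data, model, and training and inference.
- Interpretability methods span input-output analysis, embedding/representation analysis, neuron-level and layer-level studies, and architecture design approaches.
- Benchmarks and evaluation frameworks are evolving to assess alignment, robustness, and domain-specific interpretability in multimodal tasks.
- The survey identifies strengths and limitations across methods, which guide future research toward more transparent and trustworthy MLLMs.
- The survey gives researchers and practitioners a structured resource for advancing the interpretability of multimodal AI systems.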
This summary was generated by AI and reviewed by human editors.