QUICK REVIEW

[Paper Review] Towards Faithful Multimodal Concept Bottleneck Models

Pierre Moreau, Emeline Pineau Ferrand | arXiv (Cornell University) | Mar 13, 2026

Multimodal Machine Learning Applications | Cited by 0

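One-Sentence Summary

f-CBM introduces leakage-aware training and a Kolmogorov–Arnold Network head to build faithful multimodal concept bottleneck models, jointly optimizing concept detection and leakage reduction while maintaining strong task accuracy.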

ABSTRACT

Concept Bottleneck Models (CBMs) are interpretable models that route predictions through a layer of human-interpretable concepts. While widely studied in vision and, more recently, in NLP, CBMs remain largely unexplored in multimodal settings. For their explanations to be faithful, CBMs must satisfy two conditions: concepts must be properly detected, and concept representations must encode only their intended semantics, without smuggling extraneous task-relevant or inter-concept information into final predictions, a phenomenon known as leakage. Existing approaches treat concept detection and leakage mitigation as separate problems, and typically improve one at the expense of predictive accuracy. In this work, we introduce f-CBM, a faithful multimodal CBM framework built on a vision-language backbone that jointly targets both aspects through two complementary strategies: a differentiable leakage loss to mitigate leakage, and a Kolmogorov-Arnold Network prediction head that provides sufficient expressiveness to improve concept detection. Experiments demonstrate that f-CBM achieves the best trade-off between task accuracy, concept detection, and leakage reduction, while applying seamlessly to both image and text or text-only datasets, making it versatile across modalities.

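Motivation and Objectives

Improve the faithfulness of multimodal CBMs by ensuring that concepts are detected accurately and carry no unintended task or inter-concept information.
Propose a unified CLIP-based multimodal CBM architecture that handles image and text inputs.
Develop a differentiable leakage loss that minimizes task leakage during training.
Introduce a Kolmogorov–Arnold Network (KAN) as a more expressive yet still interpretable final prediction head.
Show that the method achieves a favorable trade-off between task accuracy, concept detection, and leakage reduction across modalities.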

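Proposed Method

Use a CLIP-based vision-language backbone to extract multimodal representations, which are fed into a concept bottleneck layer (CBL).
Train with a leakage-aware objective that minimizes a differentiable estimate of task leakage obtained via kernel density estimation (KDE).
Replace the standard linear head with a Kolmogorov–Arnold Network (KAN) to increase expressiveness while keeping the concept-to-class mapping interpretable.
Jointly optimize the classification loss, concept detection loss, and leakage loss, using dynamic loss scaling and a cosine schedule to activate leakage regularization gradually.
Annotate datasets with multimodal concepts by aggregating modality-specific concept scores into a single unified concept vector per sample.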
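
The leakage-aware objective in the list above can be illustrated with a small sketch. This review does not give the paper's exact leakage definition, its KDE settings, or its dynamic loss scaling rule, so everything below is an assumption made for illustration: the leakage proxy is a Gaussian-KDE estimate of the mutual information between one concept score and the class label, the bandwidth of 0.1 is arbitrary, the cosine ramp runs from 0 to a maximum weight over training, and dynamic loss scaling is left out. The names `kde_leakage_proxy`, `cosine_ramp`, and `total_loss` are hypothetical.

```python
import math


def kde_log_density(x, samples, bandwidth):
    """Log of a Gaussian kernel density estimate evaluated at x."""
    kernel_sum = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return math.log(kernel_sum / (len(samples) * bandwidth * math.sqrt(2 * math.pi)))


def kde_leakage_proxy(scores, labels, bandwidth=0.1):
    """Assumed leakage proxy: mean of log p(c | y) - log p(c), in nats.

    This is a resubstitution estimate of the mutual information between a
    concept score c and the class label y. The expression is smooth in the
    scores, so it could serve as a loss term in an autodiff framework.
    """
    total = 0.0
    for c, y in zip(scores, labels):
        same_class = [s for s, l in zip(scores, labels) if l == y]
        total += kde_log_density(c, same_class, bandwidth)
        total -= kde_log_density(c, scores, bandwidth)
    return total / len(scores)


def cosine_ramp(step, total_steps, max_weight=1.0):
    """Weight that rises smoothly from 0 to max_weight along a half cosine."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return max_weight * 0.5 * (1.0 - math.cos(math.pi * progress))


def total_loss(task_loss, concept_loss, leakage, step, total_steps):
    """Classification + concept detection + gradually activated leakage term."""
    return task_loss + concept_loss + cosine_ramp(step, total_steps) * leakage


if __name__ == "__main__":
    # Made-up scores: the first concept separates the classes, the second does not.
    labels = [0, 0, 0, 1, 1, 1]
    separating = [0.0, 0.05, 0.1, 0.9, 0.95, 1.0]
    mixed = [0.0, 0.9, 0.05, 0.95, 0.1, 1.0]
    print("proxy (separating):", kde_leakage_proxy(separating, labels))
    print("proxy (mixed):", kde_leakage_proxy(mixed, labels))
    print("leakage weight at mid-training:", cosine_ramp(50, 100))
```

Ramping the weight from zero lets the classification and concept losses shape the bottleneck before the leakage term starts to constrain it, which is one plausible reading of "gradually activating" the regularizer.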

Figure 1: Pareto frontier: concept detection accuracy versus aggregate leakage. The x-axis represents the average of task-related and inter-concept leakage as introduced in Section 2, and the y-axis represents RMSE concept detection performance.
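
As a rough illustration of how a Pareto frontier like the one in Figure 1 can be extracted, the sketch below keeps every model that no other model beats on both axes. It assumes lower is better on both axes (leakage and RMSE); the model names and numbers are made up for the example and are not results from the paper.

```python
def pareto_frontier(points):
    """Return the non-dominated points when both coordinates are minimized.

    points: list of (name, leakage, rmse) tuples.
    A point is dominated if another point is no worse on both coordinates
    and strictly better on at least one.
    """
    frontier = []
    for name, leak, rmse in points:
        dominated = any(
            (l2 <= leak and r2 <= rmse) and (l2 < leak or r2 < rmse)
            for _, l2, r2 in points
        )
        if not dominated:
            frontier.append((name, leak, rmse))
    # Sort by leakage so the frontier reads left to right on the plot.
    return sorted(frontier, key=lambda p: p[1])


if __name__ == "__main__":
    # Hypothetical (leakage, RMSE) pairs for four placeholder models.
    models = [
        ("model_a", 0.10, 0.50),
        ("model_b", 0.20, 0.30),
        ("model_c", 0.30, 0.40),  # worse than model_b on both axes
        ("model_d", 0.40, 0.20),
    ]
    for name, leak, rmse in pareto_frontier(models):
        print(name, leak, rmse)
```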

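Experimental Results

Research Questions

RQ1: How can leakage (task leakage and inter-concept leakage) be measured and reduced in multimodal concept bottleneck models?
RQ2: Does a leakage-aware training objective improve faithfulness without significantly harming task performance?
RQ3: Does a more expressive yet interpretable final head (KAN) improve concept detection and downstream accuracy?
RQ4: Can the f-CBM framework be applied across modalities (image, text, image-text) without modality-specific adjustments?
RQ5: How does combining leakage mitigation with a KAN affect concept activation quality and model interpretability?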

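Key Findings

f-CBM achieves a favorable balance between task accuracy, concept detection, and leakage reduction across datasets and backbones.
The leakage-aware training objective substantially reduces leakage while maintaining competitive task performance.
Replacing the linear head with a KAN layer substantially improves concept detection and reduces both task leakage and inter-concept leakage, with minimal impact on accuracy.
Integrating the leakage loss into training yields a meaningful reduction in leakage and improves faithfulness as assessed by intervention analysis.
The KAN reveals nonlinear concept-to-class mappings, helping to decouple concepts from final predictions while preserving interpretability.
The method generalizes to image-text, text-only, and multimodal settings, showing its versatility across modalities.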
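
To make the idea of a nonlinear but inspectable concept-to-class mapping concrete, here is a toy KAN-style head. In a Kolmogorov–Arnold layer each output is a sum of univariate functions, one per input; here each class logit is the sum of one such function per concept. This is a simplified sketch, not the paper's head: actual KANs typically learn spline-based functions, whereas this example uses fixed piecewise-linear functions on a hand-picked grid, and the class `PiecewiseLinear` and function `kan_head` are hypothetical names.

```python
from bisect import bisect_right


class PiecewiseLinear:
    """Univariate function given by its values at fixed knots.

    Stands in for the learnable spline on one concept-to-class edge.
    """

    def __init__(self, knots, values):
        assert len(knots) == len(values) >= 2
        self.knots, self.values = list(knots), list(values)

    def __call__(self, x):
        # Clamp to the grid, then interpolate linearly inside the segment.
        x = min(max(x, self.knots[0]), self.knots[-1])
        i = min(bisect_right(self.knots, x) - 1, len(self.knots) - 2)
        x0, x1 = self.knots[i], self.knots[i + 1]
        t = (x - x0) / (x1 - x0)
        return (1 - t) * self.values[i] + t * self.values[i + 1]


def kan_head(concepts, edge_functions):
    """Class logits: logit_k = sum_j phi[k][j](concept_j)."""
    return [sum(phi(c) for phi, c in zip(row, concepts)) for row in edge_functions]


if __name__ == "__main__":
    knots = [0.0, 0.5, 1.0]
    saturating = PiecewiseLinear(knots, [0.0, 1.0, 1.0])   # plateaus after 0.5
    decreasing = PiecewiseLinear(knots, [0.0, -0.5, -1.0])  # plain linear effect
    ignored = PiecewiseLinear(knots, [0.0, 0.0, 0.0])       # concept unused by class

    # Rows are classes, columns are concepts.
    edges = [[saturating, ignored], [decreasing, saturating]]
    print(kan_head([0.25, 1.0], edges))
```

Because every edge is a one-dimensional function, each concept's contribution to each class can be plotted and read directly, which is the sense in which such a head can stay interpretable while being more expressive than a linear layer.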

(a) Task leakage vs. concept detection accuracy. With $p$ the $p$-value of the one-tailed paired $t$-test, *** denotes $p < 1\%$. "Low accuracy" stands for the reference baseline.
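
For readers unfamiliar with the test named in the caption, the sketch below computes the paired t statistic from matched measurements, for example leakage of two models over the same runs. The numbers are made up and are not taken from the paper. Only the statistic and degrees of freedom are computed here; turning them into a p-value needs the t distribution, which a statistics library such as SciPy provides.

```python
import math


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for the paired differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n), n - 1


if __name__ == "__main__":
    # Hypothetical leakage values over five matched runs.
    baseline = [0.30, 0.28, 0.33, 0.31, 0.29]
    candidate = [0.20, 0.21, 0.22, 0.20, 0.19]
    t, df = paired_t_statistic(baseline, candidate)
    # One-tailed test of "baseline leaks more than candidate": a large
    # positive t supports it. For df = 4 the 1% critical value is about 3.747.
    print("t =", t, "df =", df, "significant at 1%:", t > 3.747)
```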


This review was generated by AI and reviewed by a human editor.