QUICK REVIEW

[Paper Review] S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning

Yabin Wang, Zhiwu Huang|arXiv (Cornell University)|Jul 26, 2022

Domain Adaptation and Few-Shot Learning | Cited by 60

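One-Sentence Summary

S-Prompts learns independent, domain-specific prompts on top of pre-trained transformers to address exemplar-free domain incremental learning. This keeps domains well separated and reduces forgetting, and it comes in two implementations: image prompts (ViT) and language-image prompts (CLIP).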

ABSTRACT

State-of-the-art deep neural networks are still struggling to address the catastrophic forgetting problem in continual learning. In this paper, we propose one simple paradigm (named as S-Prompting) and two concrete approaches to highly reduce the forgetting degree in one of the most typical continual learning scenarios, i.e., domain increment learning (DIL). The key idea of the paradigm is to learn prompts independently across domains with pre-trained transformers, avoiding the use of exemplars that commonly appear in conventional methods. This results in a win-win game where the prompting can achieve the best for each domain. The independent prompting across domains only requests one single cross-entropy loss for training and one simple K-NN operation as a domain identifier for inference. The learning paradigm derives an image prompt learning approach and a novel language-image prompt learning approach. Owning an excellent scalability (0.03% parameter increase per domain), the best of our approaches achieves a remarkable relative improvement (an average of about 30%) over the best of the state-of-the-art exemplar-free methods for three standard DIL tasks, and even surpasses the best of them relatively by about 6% in average when they use exemplars. Source code is available at https://github.com/iamwangyabin/S-Prompts.

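Motivation and Objectives

Address catastrophic forgetting in domain incremental learning (DIL) without storing exemplars.
Propose a simple paradigm (S-Prompts) in which each domain learns its prompts independently, maximizing domain-specific performance.
Demonstrate two implementations (S-iPrompts on ViT and S-liPrompts on CLIP), each with a scalable prompt pool.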

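Proposed Method

Freeze a pre-trained transformer and learn domain-specific prompts independently for each domain, adding them to a growing prompt pool.
Train each domain's prompts with a plain cross-entropy loss; at inference, identify the domain with K-Means/K-NN (K-Means centroids stored per domain, then a nearest-centroid K-NN lookup).
For S-iPrompts, attach independent image prompts to the image tokens and train one fully connected classifier per domain.
For S-liPrompts, attach paired image and language prompts; use a CLIP-style text encoder with per-domain language prompts, together with a CLIP-based domain-specific classifier.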
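
The routing step described above (a growing prompt pool plus a K-Means/K-NN domain identifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the class `PromptPool`, its methods, the string placeholders standing in for learned prompts, and the toy 2-D features are all hypothetical, and the sketch assumes features have already been extracted by the frozen transformer.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's K-Means; returns k centroids for one domain's features."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda i: sq_dist(f, centroids[i]))
            clusters[nearest].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                dim = len(members[0])
                centroids[i] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return centroids


class PromptPool:
    """Hypothetical container: one prompt and one centroid set per domain."""

    def __init__(self):
        self.prompts = []    # prompts[s] stands in for domain s's learned prompt
        self.centroids = []  # centroids[s] holds domain s's K-Means centroids

    def add_domain(self, prompt, features, k=2):
        # Entries for earlier domains are never modified, only appended to.
        self.prompts.append(prompt)
        self.centroids.append(kmeans(features, k))

    def identify_domain(self, feature):
        # Nearest-centroid search (K-NN with one neighbour) over all domains.
        best_domain, best_dist = None, float("inf")
        for s, cents in enumerate(self.centroids):
            for c in cents:
                d = sq_dist(feature, c)
                if d < best_dist:
                    best_domain, best_dist = s, d
        return best_domain

    def route(self, feature):
        # Pick the prompt of the identified domain for this test sample.
        s = self.identify_domain(feature)
        return s, self.prompts[s]


# Toy usage with made-up 2-D features for two well-separated domains.
pool = PromptPool()
pool.add_domain("prompt_domain_0", [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.0]])
pool.add_domain("prompt_domain_1", [[5.0, 5.1], [5.2, 4.9], [4.9, 5.0], [5.1, 5.2]])
print(pool.route([4.8, 5.3]))  # (1, 'prompt_domain_1')
```

Because each domain only appends its own prompt and centroids, adding a new domain leaves everything learned for earlier domains untouched, which is the property the paradigm relies on to limit forgetting.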

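Experimental Results

Research Questions

RQ1: Can exemplar-free DIL reach competitive or better performance by learning a separate prompt for each domain instead of sharing prompts across domains?
RQ2: In multi-domain settings, how do image-only and language-image prompting strategies compare in accuracy, forgetting, and scalability?
RQ3: Is a simple domain identifier (K-Means/K-NN) sufficient for effective domain routing at inference?
RQ4: What are the memory and compute overheads of S-Prompts as the number of domains grows?
RQ5: How well does S-Prompts generalize to unseen domains or out-of-domain data?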

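Key Findings

S-Prompts significantly outperforms exemplar-free baselines on three standard DIL benchmarks (average relative improvement in forward accuracy of about 30%).
Compared with competing exemplar-free methods, S-Prompts substantially reduces forgetting (average forgetting improves by roughly 13–41 points).
With CLIP-based prompting, S-liPrompts can even surpass exemplar-based methods on DomainNet and generalizes strongly to unseen domains.
The CLIP language-image prompting scheme (S-liPrompts) scales to new domains while adding only about 0.03% parameters per domain.
Even when domain identification at inference is imperfect, S-Prompts remains competitive or better.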
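
The findings above mix three kinds of numbers: a relative improvement in percent, a forgetting gap in points, and a per-domain parameter increase. The sketch below shows how each could be computed. All inputs are invented for illustration and are not results from the paper. The forgetting function follows a definition commonly used in continual learning (best earlier accuracy minus final accuracy, averaged over earlier domains); that the paper uses exactly this variant is an assumption, as is the linear growth of parameters with the number of domains.

```python
def relative_improvement(new_acc, base_acc):
    """Relative gain in percent, e.g. moving from 50 to 65 is a 30% gain."""
    return (new_acc - base_acc) * 100.0 / base_acc


def average_forgetting(acc):
    """acc[t][i] = accuracy on domain i after training through domain t (i <= t).

    For every domain except the last, take the best accuracy seen before the
    final step and subtract the final accuracy, then average the drops.
    """
    last = len(acc) - 1
    drops = []
    for i in range(last):
        best_earlier = max(acc[t][i] for t in range(i, last))
        drops.append(best_earlier - acc[last][i])
    return sum(drops) / len(drops)


def parameter_growth_percent(num_domains, per_domain_percent=0.03):
    """Total extra parameters in percent, assuming each domain adds the same share."""
    return num_domains * per_domain_percent


# Hypothetical accuracies: three domains learned one after another.
toy_acc = [
    [80.0],
    [70.0, 85.0],
    [65.0, 80.0, 90.0],
]
print(relative_improvement(65.0, 50.0))        # 30.0
print(average_forgetting(toy_acc))             # 10.0
print(round(parameter_growth_percent(6), 2))   # 0.18
```

Note the difference in units: a 30% relative improvement over a 50% baseline is a 15-point absolute gain, whereas the forgetting figures are stated directly in points.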

This review was generated by AI and checked by a human editor.