QUICK REVIEW

[Paper Review] Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models

Eric Zhang, Kai Wang|arXiv (Cornell University)|Mar 30, 2023

Generative Adversarial Networks and Image Synthesis|Cited by 12

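One-Sentence Summary

Forget-Me-Not is a plug-and-play method that uses attention re-steering to efficiently forget target concepts in text-to-image diffusion models; it is evaluated with the Memorization Score (M-Score) and ConceptBench.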

ABSTRACT

The unlearning problem of deep learning models, once primarily an academic concern, has become a prevalent issue in the industry. The significant advances in text-to-image generation techniques have prompted global discussions on privacy, copyright, and safety, as numerous unauthorized personal IDs, content, artistic creations, and potentially harmful materials have been learned by these models and later utilized to generate and distribute uncontrolled content. To address this challenge, we propose Forget-Me-Not, an efficient and low-cost solution designed to safely remove specified IDs, objects, or styles from a well-configured text-to-image model in as little as 30 seconds, without impairing its ability to generate other content. Alongside our method, we introduce the Memorization Score (M-Score) and ConceptBench to measure the models' capacity to generate general concepts, grouped into three primary categories: ID, object, and style. Using M-Score and ConceptBench, we demonstrate that Forget-Me-Not can effectively eliminate targeted concepts while maintaining the model's performance on other concepts. Furthermore, Forget-Me-Not offers two practical extensions: a) removal of potentially harmful or NSFW content, and b) enhancement of model accuracy, inclusion and diversity through concept correction and disentanglement. It can also be adapted as a lightweight model patch for Stable Diffusion, allowing for concept manipulation and convenient distribution. To encourage future research in this critical area and promote the development of safe and inclusive generative models, we will open-source our code and ConceptBench at https://github.com/SHI-Labs/Forget-Me-Not.

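Motivation and Objectives

Motivate and define concept forgetting in text-to-image diffusion models to address privacy, safety, and copyright concerns.
Provide a low-cost, plug-and-play solution that forgets target concepts without degrading overall model performance.
Propose a quantitative metric (Memorization Score) and a benchmark (ConceptBench) to evaluate forgetting and memorization.
Demonstrate extensions to concept correction, disentanglement, and lightweight model patches for diffusion models such as Stable Diffusion.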

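Proposed Method

Introduce an attention re-steering loss that minimizes the cross-attention maps corresponding to the concept to be forgotten, across the UNet's cross-attention layers.
Fine-tune only the cross-attention layers or related components, steering attention away from the forgotten concept without retraining the whole model.
Optionally use Concept Inversion to obtain an accurate concept embedding when the forgetting prompt is out of vocabulary or ambiguous.
Provide a lightweight, patchable approach that supports multi-concept forgetting and is easy to distribute to users.
Define the Memorization Score (based on the change in cosine similarity of concept embeddings under textual inversion) and ConceptBench as evaluation tools.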
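
The attention re-steering objective above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the summary only says the cross-attention maps for the forgotten concept are minimized, so the squared penalty, the function name `attention_resteering_loss`, and the toy attention values are all assumptions.

```python
from typing import List, Sequence

# One cross-attention map: rows are image positions, columns are prompt tokens.
AttentionMap = List[List[float]]


def attention_resteering_loss(
    layers: Sequence[AttentionMap],
    concept_token_ids: Sequence[int],
) -> float:
    """Sum of squared attention weights placed on the concept's tokens.

    Assumption: a squared-magnitude penalty; the source only states that
    the concept's attention maps are minimized.
    """
    total = 0.0
    for attn in layers:                  # every cross-attention layer
        for row in attn:                 # every image position
            for t in concept_token_ids:  # only the forgotten concept's tokens
                total += row[t] ** 2
    return total


# Toy example: two layers, two image positions, three prompt tokens.
# Token index 1 is taken to be the concept to forget (hypothetical).
layers = [
    [[0.1, 0.7, 0.2], [0.2, 0.6, 0.2]],
    [[0.3, 0.4, 0.3], [0.5, 0.1, 0.4]],
]
loss = attention_resteering_loss(layers, concept_token_ids=[1])
print(f"re-steering loss: {loss:.2f}")
```

In an actual fine-tuning run the maps would be read from the UNet's cross-attention layers and the loss would update only those layers' weights; lowering it means image positions attend less to the concept's tokens.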

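Experimental Results

Research Questions

RQ1: How can targeted concepts be forgotten from a text-to-image diffusion model without affecting unrelated content?
RQ2: Can a lightweight, plug-and-play method achieve multi-concept forgetting and extend to concept correction and disentanglement?
RQ3: How can memorization and forgetting in diffusion models be measured quantitatively and benchmarked systematically?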

Key Findings

Concept	Initial M-Score	M-Score after forgetting
Elon Musk	0.943	0.848
Mickey Mouse	0.948	0.836
Zebra	0.972	0.899
Google	0.940	0.811
Apple	0.696	0.493
Horse	0.877	0.808
Van Gogh	0.916	0.684
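
To compare the rows of the table, it can help to look at both the absolute and the relative drop in M-Score. The sketch below only does arithmetic on the values reported above; the helper name `score_drops` and the choice of relative drop as a comparison are illustrative assumptions, not part of the paper's protocol.

```python
# (initial M-Score, M-Score after forgetting), copied from the table above.
scores = {
    "Elon Musk": (0.943, 0.848),
    "Mickey Mouse": (0.948, 0.836),
    "Zebra": (0.972, 0.899),
    "Google": (0.940, 0.811),
    "Apple": (0.696, 0.493),
    "Horse": (0.877, 0.808),
    "Van Gogh": (0.916, 0.684),
}


def score_drops(initial: float, forgotten: float) -> tuple:
    """Return (absolute drop, drop relative to the initial score)."""
    absolute = initial - forgotten
    return absolute, absolute / initial


drops = {concept: score_drops(*pair) for concept, pair in scores.items()}

# Concepts with the largest absolute and largest relative reduction.
largest_absolute = max(drops, key=lambda c: drops[c][0])
largest_relative = max(drops, key=lambda c: drops[c][1])

for concept, (absolute, relative) in drops.items():
    print(f"{concept:<13} drop={absolute:.3f} relative={relative:.1%}")
print("largest absolute drop:", largest_absolute)
print("largest relative drop:", largest_relative)
```

On these numbers, Van Gogh shows the largest absolute drop and Apple the largest relative drop, in part because Apple starts from a much lower initial score.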

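Forget-Me-Not can remove target concepts with minimal impact on non-target concepts, in some cases in about 30 seconds.
The method removes target concepts at the cross-attention layers while preserving the model's integrity on other concepts.
ConceptBench groups concepts into identity, object, and style to evaluate forgetting and memorization.
The Memorization Score shows a drop in concept-embedding similarity after forgetting (for example, Elon Musk falls from 0.943 to 0.848).
The method supports NSFW content removal as well as concept correction and disentanglement, with qualitative and quantitative evidence across concepts.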
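
Since the Memorization Score is described as being based on cosine similarity between concept embeddings, a small sketch of that comparison may be useful. It assumes the score is a plain cosine similarity between a reference embedding and an embedding recovered by inversion on the model; the exact procedure is not given in this summary, and the three-dimensional vectors are made up for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings (real ones would have hundreds of dimensions).
reference = [0.9, 0.1, 0.4]          # reference embedding of the concept
inverted_before = [0.8, 0.2, 0.5]    # inverted on the original model
inverted_after = [0.1, 0.9, 0.3]     # inverted on the model after forgetting

before = cosine_similarity(reference, inverted_before)
after = cosine_similarity(reference, inverted_after)
print(f"before forgetting: {before:.3f}")
print(f"after forgetting:  {after:.3f}")
```

A lower similarity after forgetting is read here as the model retaining less of the concept, which matches the direction of the changes in the table.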

This review was generated by AI and reviewed by human editors.