QUICK REVIEW

[Paper Review] DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises

Jiasheng Ye, Zaixiang Zheng|arXiv (Cornell University)|Feb 20, 2023

Music and Audio Processing | Cited by 9

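One-Sentence Summary

DiNoiSer improves diffusion-based conditional sequence learning through noise scale clipping during training and condition-enhanced denoising during sampling, yielding better translation and text generation quality on multilingual benchmarks.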

ABSTRACT

While diffusion models have achieved great success in generating continuous signals such as images and audio, it remains elusive for diffusion models in learning discrete sequence data like natural languages. Although recent advances circumvent this challenge of discreteness by embedding discrete tokens as continuous surrogates, they still fall short of satisfactory generation quality. To understand this, we first dive deep into the denoised training protocol of diffusion-based sequence generative models and determine their three severe problems, i.e., 1) failing to learn, 2) lack of scalability, and 3) neglecting source conditions. We argue that these problems can be boiled down to the pitfall of the not completely eliminated discreteness in the embedding space, and the scale of noises is decisive herein. In this paper, we introduce DINOISER to facilitate diffusion models for sequence generation by manipulating noises. We propose to adaptively determine the range of sampled noise scales for counter-discreteness training; and encourage the proposed diffused sequence learner to leverage source conditions with amplified noise scales during inference. Experiments show that DINOISER enables consistent improvement over the baselines of previous diffusion-based sequence generative models on several conditional sequence modeling benchmarks thanks to both effective training and inference strategies. Analyses further verify that DINOISER can make better use of source conditions to govern its generative process.

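Motivation and Objectives

Identify the key limitations of diffusion models in discrete sequence learning: the pitfall of discreteness, lack of scalability, and under-use of source conditions.
Develop training and inference strategies that mitigate discreteness by adaptively manipulating noise scales.
Demonstrate gains over baselines on several conditional sequence tasks (machine translation, text simplification, paraphrasing).
Analyze how the noise scale affects reliance on source conditions and generation quality.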

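Proposed Method

Analyze the shortcomings of embedding-based diffusion on discrete sequences and trace them to the noise scale.
Introduce noise scale clipping, which keeps training out of the small-noise regime and adapts the clipping threshold to the properties of the embedding space.
Propose CeDi (the condition-enhanced denoiser), which feeds the model large noise-level indicators at inference to force reliance on the source condition.
Use DDIM-like sampling, with adjusted timesteps and a two-step schedule, to emphasize source-driven generation.
Provide a training objective L'_diffusion, within a latent-variable diffusion framework, that includes a minimum noise threshold and a reconstruction term.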
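The noise scale clipping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear schedule `sigma`, the constant `SIGMA_MAX`, and the variance-based heuristic in `adaptive_sigma_min` are hypothetical choices made for the example. The summary only states that the threshold is adapted to the embedding space.

```python
import math
import random

SIGMA_MAX = 20.0  # hypothetical largest noise scale, reached at t = 1


def sigma(t):
    """Hypothetical linear schedule mapping t in [0, 1] to a noise scale."""
    return SIGMA_MAX * t


def adaptive_sigma_min(embeddings):
    """Assumed heuristic: square root of the mean per-dimension variance
    of the embedding table. Recomputing it as the embeddings change is
    what makes the threshold adaptive."""
    n, dim = len(embeddings), len(embeddings[0])
    total_var = 0.0
    for d in range(dim):
        column = [e[d] for e in embeddings]
        mean = sum(column) / n
        total_var += sum((x - mean) ** 2 for x in column) / n
    return math.sqrt(total_var / dim)


def sample_clipped_timestep(sigma_min, rng):
    """Sample t uniformly from the range where sigma(t) >= sigma_min,
    so training never visits the small-noise regime."""
    t_min = min(sigma_min / SIGMA_MAX, 1.0)
    return rng.uniform(t_min, 1.0)


def corrupt(z0, t, rng):
    """Add Gaussian noise with scale sigma(t) to a clean embedding z0."""
    s = sigma(t)
    return [x + s * rng.gauss(0.0, 1.0) for x in z0]


rng = random.Random(0)
# Toy embedding table: 200 tokens, 8 dimensions.
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
sigma_min = adaptive_sigma_min(embeddings)
timesteps = [sample_clipped_timestep(sigma_min, rng) for _ in range(1000)]
noisy = corrupt(embeddings[0], timesteps[0], rng)
```

In a real training loop the threshold would be refreshed periodically from the current embedding table, and the noisy embedding would be passed to the denoiser together with its timestep.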

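Experimental Results

Research Questions

RQ1: Can adaptive noise scales mitigate the pitfall of discreteness in diffusion-based sequence learning?
RQ2: Does enforcing a higher minimum noise scale during training improve conditional generation quality?
RQ3: Does condition-enhanced sampling (CeDi) improve the use of source conditions at inference?
RQ4: How does DiNoiSer compare with autoregressive models, CMLM, and prior diffusion-based sequence models on multilingual machine translation, text simplification, and paraphrasing?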

Key Findings

Method	IWSLT14 De→En	WMT14 En→De	WMT16 De→En	De→En	Ro→En	En→Ro
Transformer (AR, beam=5)	33.61	28.30	30.55	26.85	33.08	32.86
CMLM (NAR, LB=5)	29.41	24.33	28.71	23.22	31.13	31.26
CMLM (NAR, LB=5, MBR=1)	29.32	24.34	28.43	23.09	31.07	30.92
DiffusionLM (LB=5, MBR=1)	26.61	20.29	17.31	15.33	28.61	27.01
DiffusionLM (LB=5, MBR=10)	29.11	22.91	19.69	17.41	30.17	29.39
CDCD (MBR=10)	-	-	25.40	19.70	-	-
CDCD (MBR=100)	-	-	26.00	20.00	-	-
Difformer (LBxMBR=20)	-	-	-	23.80	-	-
DiffuSeq (KD, LBxMBR=10)	-	-	-	15.37	-	25.45
SeqDiffuSeq (KD, LBxMBR=10)	-	-	-	17.14	-	26.17
DiNoiSer (LB=5, MBR=1)	31.29	25.55	28.83	24.25	31.14	30.93
DiNoiSer (LB=5, MBR=10)	31.61	25.70	29.05	24.26	31.22	31.08
DiNoiSer (LB=10, MBR=5)	31.44	26.14	29.01	24.62	31.24	31.03
DiNoiSer (KD, LB=10, MBR=5)	-	-	30.30	25.88	33.13	32.84
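Several rows in the table vary an MBR setting. Assuming MBR here denotes the number of candidates used for minimum Bayes-risk decoding, the selection step can be sketched as below. The unigram F1 utility is a simple stand-in chosen for the example (a BLEU-style metric would be the usual choice in translation), and `mbr_select` is a hypothetical helper, not code from the paper.

```python
from collections import Counter


def unigram_f1(hyp, ref):
    """Stand-in utility: F1 over whitespace-separated unigram counts."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(candidates, utility=unigram_f1):
    """Return the candidate with the highest average utility when every
    other candidate is treated as a pseudo-reference."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best_index = max(range(len(candidates)), key=expected_utility)
    return candidates[best_index]


candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran away",
]
best = mbr_select(candidates)  # picks "the cat sat on a mat"
```

With MBR=1 there is only one candidate, so the selection step is a no-op; larger values trade extra sampling cost for a consensus choice.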

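DiNoiSer consistently improves over diffusion baselines on several conditional sequence tasks, including bilingual and multilingual translation, text simplification, and paraphrasing.
Training with noise scale clipping stays out of the small-noise regime, which mitigates the pitfall of discreteness.
Feeding large noise indicators at inference through CeDi increases reliance on the source condition and reduces hallucination.
Ablation studies confirm that both the improved training (noise clipping) and the improved inference (CeDi sampling) contribute to the gains.
Post-hoc analysis shows that the condition-enhanced denoiser makes better use of source conditions for accurate prediction.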
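The idea behind condition-enhanced sampling can be illustrated with a toy example. The sketch below is not the authors' sampler: `toy_denoiser` is a hypothetical closed-form stand-in for a trained network, the noise schedule `sigmas` is made up, and `indicator_scale` is an assumed knob that overstates the noise level told to the denoiser. The update rule is a DDIM-like deterministic step written for an additive-noise parameterization.

```python
import random


def toy_denoiser(z_t, source_target, sigma_indicator):
    """Toy stand-in for a trained denoiser: the posterior mean of z0 given
    z_t when z0 ~ N(source_target, I) and z_t = z0 + sigma * noise.
    A larger indicated sigma shifts weight from z_t to the source."""
    w = sigma_indicator ** 2 / (1.0 + sigma_indicator ** 2)
    return [w * m + (1.0 - w) * z for z, m in zip(z_t, source_target)]


def sample(source_target, sigmas, indicator_scale, seed=0):
    """DDIM-like deterministic sampling over a decreasing noise schedule.
    indicator_scale > 1 tells the denoiser the input is noisier than it is."""
    rng = random.Random(seed)
    z = [sigmas[0] * rng.gauss(0.0, 1.0) for _ in source_target]
    for sigma_t, sigma_s in zip(sigmas[:-1], sigmas[1:]):
        z0_hat = toy_denoiser(z, source_target, indicator_scale * sigma_t)
        ratio = sigma_s / sigma_t
        # Move from noise level sigma_t to sigma_s along the line to z0_hat.
        z = [z0 + ratio * (zi - z0) for zi, z0 in zip(z, z0_hat)]
    return z


def distance(a, b):
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


source_target = [1.0, -2.0, 0.5, 3.0]  # what the source condition implies
sigmas = [10.0, 5.0, 1.0, 0.0]         # hypothetical decreasing schedule
plain = sample(source_target, sigmas, indicator_scale=1.0)
enhanced = sample(source_target, sigmas, indicator_scale=3.0)
```

Starting from the same initial noise, the run with the inflated indicator ends closer to the source-implied target than the plain run, which mirrors the finding that large noise indicators push the model to lean on the source condition.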

This review was generated by AI and reviewed by a human editor.