QUICK REVIEW

[Paper Review] Spectrally-Guided Diffusion Noise Schedules

Carlos Esteves, Ameesh Makadia | arXiv (Cornell University) | Mar 19, 2026
Generative Adversarial Networks and Image Synthesis | Cited by 0
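One-Sentence Summary

This paper designs per-instance, spectrally guided noise schedules for pixel diffusion; conditioning on each image's spectrum improves image quality when only a few denoising steps are available.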

ABSTRACT

Denoising diffusion models are widely used for high-quality image and video generation. Their performance depends on noise schedules, which define the distribution of noise levels applied during training and the sequence of noise levels traversed during sampling. Noise schedules are typically handcrafted and require manual tuning across different resolutions. In this work, we propose a principled way to design per-instance noise schedules for pixel diffusion, based on the image's spectral properties. By deriving theoretical bounds on the efficacy of minimum and maximum noise levels, we design "tight" noise schedules that eliminate redundant steps. During inference, we propose to conditionally sample such noise schedules. Experiments show that our noise schedules improve generative quality of single-stage pixel diffusion models, particularly in the low-step regime.

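Motivation and Objectives

  • Improve pixel diffusion by moving from dataset-level heuristics to per-instance, spectrum-driven noise schedules.
  • Introduce a principled method for tailoring the forward noise and the sampling schedule to each image's spectral properties.
  • Provide theoretical bounds on the minimum and maximum noise levels, along with a conditioning mechanism that predicts the spectrum before sampling.
  • Demonstrate higher generation quality and efficiency for single-stage pixel diffusion models in the low-step regime.
  • Explore how manipulating the spectrum during sampling steers the texture and detail of generated images.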

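Proposed Method

  • Define per-instance noise schedules that follow the image's radially averaged power spectral density (RAPSD).
  • Derive bounds on the minimum and maximum noise levels to create "tight" schedules.
  • Propose a conditional RAPSD sampler S(y) that maps a condition y (e.g., a class label) to the RAPSD parameters (α, β) used for scheduling.
  • Compute three schedule types (frequency-focused, power-focused, and hybrid) and map them to the sampling log-SNR λ(t).
  • During training, derive the schedule by fitting a power-law RAPSD to each image; at inference, sample the schedule instead.
  • Adapt the conditioning and the guidance interval to accommodate per-image schedules and FiLM-based conditioning.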
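
The method rests on measuring an image's RAPSD and fitting a power law to it. A minimal NumPy sketch of that measurement follows, assuming a square grayscale image and the parametrization P(f) = β · f^(-α). The summary names the parameters (α, β) but does not spell out the formula, so the parametrization, the ring-averaging scheme, and the function names `rapsd` and `fit_power_law` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def rapsd(image):
    """Radially averaged power spectral density of a square 2-D grayscale image."""
    n = image.shape[0]
    # Power per 2-D frequency bin, with the zero frequency shifted to the centre.
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2 / image.size
    # Integer frequency coordinates matching the shifted layout.
    freqs = np.fft.fftshift(np.fft.fftfreq(n)) * n
    fx, fy = np.meshgrid(freqs, freqs, indexing="ij")
    radius = np.rint(np.hypot(fx, fy)).astype(int)
    # Average the power over rings of equal (rounded) radius, up to Nyquist.
    max_r = n // 2
    mask = radius <= max_r
    sums = np.bincount(radius[mask], weights=power[mask], minlength=max_r + 1)
    counts = np.bincount(radius[mask], minlength=max_r + 1)
    return np.arange(max_r + 1), sums / counts

def fit_power_law(freqs, psd):
    """Least-squares fit of log P(f) = log(beta) - alpha * log(f), skipping f = 0.

    Assumed parametrization (not taken from the paper): P(f) = beta * f**(-alpha).
    """
    f, p = freqs[1:], psd[1:]
    slope, intercept = np.polyfit(np.log(f), np.log(p), 1)
    return -slope, np.exp(intercept)  # (alpha, beta)
```

For a color image, one option (again an assumption) is to average the channels before calling `rapsd`, then pass the result to `fit_power_law`.
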
Figure 1: Our “tight” schedules adapt to each instance’s spectrum, ensuring effective noise levels at all steps. Top: An image with low energy on low frequencies. The standard cosine noise schedule destroys the signal at $t=0.5$, which means that at least half of the training steps would apply exc
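
To make the remark about the cosine schedule at t = 0.5 concrete, the sketch below compares the standard cosine log-SNR with a spectrum-derived range. It assumes a power-law spectrum P(f) = β · f^(-α) expressed relative to unit-variance white noise, and treats a frequency as buried in noise once exp(λ) · P(f) drops below 1. This is one plausible reading of the "tight" idea, not the paper's derivation or its actual bounds, and all function names are hypothetical.

```python
import numpy as np

def cosine_logsnr(t):
    """Log-SNR of the standard cosine schedule: lambda(t) = -2 * log(tan(pi * t / 2))."""
    return -2.0 * np.log(np.tan(0.5 * np.pi * t))

def tight_logsnr_range(alpha, beta, f_min, f_max):
    """Log-SNR interval spanned by the spectrum (illustrative assumption).

    With P(f) = beta * f**(-alpha) and unit-variance white noise, frequency f
    sits at the noise floor when exp(lambda) * P(f) = 1, i.e. lambda = -log P(f).
    """
    power = beta * np.array([f_min, f_max], dtype=float) ** (-alpha)
    lam_low = -np.log(power.max())   # below this, every frequency is buried in noise
    lam_high = -np.log(power.min())  # above this, every frequency is above the noise
    return lam_low, lam_high

def tight_schedule(alpha, beta, f_min, f_max, num_steps):
    """Evenly spaced log-SNR values covering only the interval, from high to low."""
    lam_low, lam_high = tight_logsnr_range(alpha, beta, f_min, f_max)
    return np.linspace(lam_high, lam_low, num_steps)

def cosine_fraction_outside(lam_low, lam_high):
    """Fraction of t in (0, 1) where the cosine log-SNR leaves [lam_low, lam_high]."""
    def t_of(lam):
        # Inverse of cosine_logsnr: t = (2 / pi) * arctan(exp(-lambda / 2)).
        return 2.0 / np.pi * np.arctan(np.exp(-0.5 * lam))
    return 1.0 - (t_of(lam_low) - t_of(lam_high))
```

For example, `tight_logsnr_range(3.0, 1.0, 1, 64)` puts the lower limit at λ = 0, which the cosine schedule reaches at t = 0.5, so under these assumptions more than half of the cosine schedule lies outside the interval.
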

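Experimental Results

Research Questions

  • RQ1: Can per-instance spectral properties be used to design diffusion noise schedules that are more effective for pixel diffusion models?
  • RQ2: Can spectrally guided schedules reduce the number of denoising steps required while maintaining or improving image quality?
  • RQ3: How do frequency-focused, power-focused, and hybrid schedules compare in fidelity (FID), diversity (IS), and precision/recall?
  • RQ4: Can the RAPSD sampler predict spectrum-based schedule parameters from a conditioning signal (such as a class prompt), enabling end-to-end sampling without ground-truth spectra?
  • RQ5: How does spectral manipulation affect the properties (texture and detail) of generated images?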

Key Findings

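| Model | Params | NFE | FID ↓ | sFID ↓ | IS ↑ | Precision ↑ | Recall ↑ |
|---|---|---|---|---|---|---|---|
| SiD2, small | 397M | 512 | 2.19 (2.19) | 4.30 | 295.3 | 0.72 | 0.63 |
| Ours, small | 399M | 256 | 1.79 | 4.39 | 306.1 | 0.73 | 0.64 |
| SiD2, Flop Heavy | 397M | 512 | 1.53 (1.48) | 3.98 | 306.2 | 0.74 | 0.63 |
| Ours, Flop Heavy | 399M | 320 | 1.45 | 3.91 | 310.0 | 0.74 | 0.63 |
| SiD2, small (ImageNet 128x128) | 397M | 512 | 1.62 | 3.76 | 220.0 | 0.73 | 0.64 |
| Ours, small (ImageNet 128x128) | 399M | 160 | 1.43 | 3.65 | 223.9 | 0.74 | 0.64 |
| SiD2, small (ImageNet 256x256) | 397M | 512 | 1.68 (1.72) | 4.04 | 288.2 | 0.72 | 0.65 |
| Ours, small (ImageNet 256x256) | 399M | 256 | 1.42 | 3.82 | 297.0 | 0.73 | 0.65 |
| SiD2, Flop Heavy (ImageNet 256x256) | 397M | 512 | 1.37 (1.38) | 3.83 | 286.3 | 0.73 | 0.65 |
| Ours, Flop Heavy (ImageNet 256x256) | 399M | 256 | 1.32 | 3.71 | 294.2 | 0.74 | 0.64 |
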
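
The step savings in the table can be read off with a little arithmetic. The snippet below copies the NFE and FID columns (using the first FID value in each cell, since the summary does not explain the parenthesized one) and computes the NFE ratio and relative FID change for each pair of rows. The row labels and the `summarize` helper are only for illustration.

```python
# (setting, baseline NFE, baseline FID, ours NFE, ours FID), copied from the table.
rows = [
    ("small", 512, 2.19, 256, 1.79),
    ("Flop Heavy", 512, 1.53, 320, 1.45),
    ("small, ImageNet 128x128", 512, 1.62, 160, 1.43),
    ("small, ImageNet 256x256", 512, 1.68, 256, 1.42),
    ("Flop Heavy, ImageNet 256x256", 512, 1.37, 256, 1.32),
]

def summarize(rows):
    """Return (setting, NFE ratio, relative FID reduction in percent) per pair."""
    out = []
    for name, base_nfe, base_fid, our_nfe, our_fid in rows:
        nfe_ratio = base_nfe / our_nfe
        fid_drop_pct = 100.0 * (base_fid - our_fid) / base_fid
        out.append((name, nfe_ratio, fid_drop_pct))
    return out

for name, nfe_ratio, fid_drop_pct in summarize(rows):
    print(f"{name}: {nfe_ratio:.1f}x fewer NFE, FID lower by {fid_drop_pct:.1f}%")
```

The NFE ratios come out at 2.0x, 1.6x, 3.2x, 2.0x, and 2.0x in table order, with a lower FID in every pair.
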
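  • Per-instance, spectrally guided schedules outperform SiD2, a strong pixel-diffusion baseline, across multiple ImageNet resolutions, most notably in the low-step regime.
  • "Tight" schedules that follow the image spectrum eliminate redundant steps and improve the FID/IS trade-off at lower denoising step counts.
  • Frequency-focused, power-focused, and hybrid schedules offer complementary gains; the hybrid schedule usually achieves the best overall performance.
  • At inference, the RAPSD sampler approximates per-image spectra with only a small loss, enabling end-to-end sampling from a class label or prompt.
  • Manipulating the sampled RAPSD (for example, changing the exponent α) alters image texture and detail, demonstrating a controllable spectral effect on the output.
  • Ablations indicate that the proposed per-instance conditioning and the two-parameter RAPSD sampling are essential to the performance gains.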
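
To build intuition for what the exponent α controls, the following sketch synthesizes random textures with a prescribed power-law spectrum. It is not the paper's sampling procedure: it simply shapes white noise so that power falls off as f^(-α) (an assumed parametrization), and the helper names are hypothetical. Under that parametrization, a smaller α leaves more energy at high frequencies, which shows up as finer, busier texture.

```python
import numpy as np

def _radial_frequency(n):
    """Radial frequency of every bin in an unshifted n x n FFT grid."""
    freqs = np.fft.fftfreq(n) * n
    fx, fy = np.meshgrid(freqs, freqs, indexing="ij")
    return np.hypot(fx, fy)

def power_law_texture(n, alpha, seed=0):
    """Random n x n image whose power spectrum falls off as f**(-alpha)."""
    rng = np.random.default_rng(seed)
    radius = _radial_frequency(n)
    radius[0, 0] = 1.0  # avoid dividing by zero at the DC term
    # Amplitude ~ f**(-alpha / 2) gives power ~ f**(-alpha).
    spectrum = np.fft.fft2(rng.standard_normal((n, n))) * radius ** (-alpha / 2.0)
    spectrum[0, 0] = 0.0  # force zero mean
    image = np.real(np.fft.ifft2(spectrum))
    return image / image.std()

def high_frequency_fraction(image, cutoff):
    """Share of total non-DC power at radial frequencies above `cutoff`."""
    radius = _radial_frequency(image.shape[0])
    power = np.abs(np.fft.fft2(image)) ** 2
    power[0, 0] = 0.0
    return power[radius > cutoff].sum() / power.sum()

smooth = power_law_texture(64, alpha=3.5)
rough = power_law_texture(64, alpha=2.0)
print(high_frequency_fraction(smooth, 8), high_frequency_fraction(rough, 8))
```

Both textures start from the same white noise (same seed), so any difference in the printed fractions comes from α alone.
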
Figure 2: Our noise schedules vary per instance based on its spectral properties. Left: Median power per frequency for ImageNet at multiple resolutions (increasing from light to dark). The power spectrum of natural images follows a power law whose trends explain current noise schedule tuning heuris


This review was generated by AI and reviewed by a human editor.