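[Paper Explained] AdaEdit: Adaptive Temporal and Channel Modulation for Flow-Based Image Editing
AdaEdit introduces a training-free adaptive framework for flow-based image editing. It uses a progressive injection schedule and channel-wise latent perturbation to improve background fidelity while minimizing the loss in editing quality.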
Inversion-based image editing in flow matching models has emerged as a powerful paradigm for training-free, text-guided image manipulation. A central challenge in this paradigm is the injection dilemma: injecting source features during denoising preserves the background of the original image but simultaneously suppresses the model's ability to synthesize edited content. Existing methods address this with fixed injection strategies -- binary on/off temporal schedules, uniform spatial mixing ratios, and channel-agnostic latent perturbation -- that ignore the inherently heterogeneous nature of injection demand across both the temporal and channel dimensions. In this paper, we present AdaEdit, a training-free adaptive editing framework that resolves this dilemma through two complementary innovations. First, we propose a Progressive Injection Schedule that replaces hard binary cutoffs with continuous decay functions (sigmoid, cosine, or linear), enabling a smooth transition from source-feature preservation to target-feature generation and eliminating feature discontinuity artifacts. Second, we introduce Channel-Selective Latent Perturbation, which estimates per-channel importance based on the distributional gap between the inverted and random latents and applies differentiated perturbation strengths accordingly -- strongly perturbing edit-relevant channels while preserving structure-encoding channels. Extensive experiments on the PIE-Bench benchmark (700 images, 10 editing types) demonstrate that AdaEdit achieves an 8.7% reduction in LPIPS, a 2.6% improvement in SSIM, and a 2.3% improvement in PSNR over strong baselines, while maintaining competitive CLIP similarity. AdaEdit is fully plug-and-play and compatible with multiple ODE solvers including Euler, RF-Solver, and FireFlow. Code is available at https://github.com/leeguandong/AdaEdit
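The Progressive Injection Schedule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `injection_weight` and `blend`, the `cutoff` and `sharpness` parameters with their default values, and the use of a plain linear blend between source and target features are all assumptions made for this example.

```python
import math


def injection_weight(step, num_steps, kind="sigmoid", cutoff=0.5, sharpness=10.0):
    """Return a source-injection weight in [0, 1] for one denoising step.

    Hypothetical helper: parameter names and defaults are illustrative only.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    # Normalized progress: 0.0 at the first step, 1.0 at the last step.
    t = step / (num_steps - 1)
    if kind == "binary":
        # Hard on/off cutoff, the fixed strategy the abstract argues against.
        return 1.0 if t < cutoff else 0.0
    if kind == "linear":
        return 1.0 - t
    if kind == "cosine":
        return 0.5 * (1.0 + math.cos(math.pi * t))
    if kind == "sigmoid":
        # Smooth decay centered on `cutoff`; `sharpness` controls the slope.
        return 1.0 / (1.0 + math.exp(sharpness * (t - cutoff)))
    raise ValueError(f"unknown schedule: {kind}")


def blend(source_feat, target_feat, weight):
    """Linearly mix source and target features (an assumed mixing rule)."""
    return [weight * s + (1.0 - weight) * t for s, t in zip(source_feat, target_feat)]


if __name__ == "__main__":
    for kind in ("binary", "linear", "cosine", "sigmoid"):
        weights = [round(injection_weight(s, 11, kind), 3) for s in range(11)]
        print(kind, weights)
```

With 11 steps and the assumed cutoff of 0.5, the binary schedule drops from 1 at step 4 to 0 at step 5, while the sigmoid schedule passes through 0.5 at step 5 and decays gradually on either side. That gradual hand-off is what the abstract means by replacing hard cutoffs with continuous decay.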
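Motivation and Objectives
- Identify the limitations of fixed injection strategies in inversion-based, flow-based editing (the injection dilemma).
- Develop a training-free framework that adaptively manages temporal injection and channel perturbation.
- Evaluate whether the progressive schedule and channel-aware perturbation improve background preservation without compromising editing quality.
- Demonstrate plug-and-play compatibility with multiple ODE solvers, evaluated on a comprehensive benchmark.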
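Proposed Method
- Replace binary temporal injection with a continuous decay schedule (sigmoid, cosine, or linear) so that source-feature injection is reduced smoothly over time.
- Compute per-channel importance from the distributional gap between the inverted latents and random latents, and apply channel-specific perturbation strengths.
- Apply channel-dependent AdaIN during the Latents-Shift step to bias perturbation toward edit-relevant channels while preserving structure-encoding channels.
- Optionally explore Soft Mask and Adaptive KV Ratio as additional modules in ablations.
- Keep AdaEdit plug-and-play with solvers such as Euler, RF-Solver, and FireFlow, with no retraining required.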
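The channel-selective perturbation step can be illustrated with a small dependency-free sketch. It is not the paper's code. The gap measure (absolute difference in mean plus absolute difference in standard deviation), the rule that a larger gap yields a stronger perturbation, the `base_strength` parameter, and the function names are all assumptions; the summary does not specify these details, and the paper's actual mapping may differ. Only the AdaIN step itself (normalize with a channel's own statistics, then rescale to the reference statistics) follows the standard definition. Latents are represented as a list of channels, each a flat list of floats; a real implementation would operate on tensors.

```python
import statistics


def channel_stats(channel):
    """Mean and population standard deviation of one latent channel."""
    return statistics.fmean(channel), statistics.pstdev(channel)


def channel_selective_shift(inverted, noise, base_strength=0.5, eps=1e-6):
    """Perturb each channel of `inverted` toward the statistics of `noise`.

    Hypothetical sketch: the gap measure and the gap-to-strength mapping
    are assumptions, not taken from the paper.
    """
    stats_inv = [channel_stats(c) for c in inverted]
    stats_rnd = [channel_stats(c) for c in noise]

    # Assumed distributional gap per channel: |mean diff| + |std diff|.
    gaps = [
        abs(m_i - m_r) + abs(s_i - s_r)
        for (m_i, s_i), (m_r, s_r) in zip(stats_inv, stats_rnd)
    ]
    max_gap = max(gaps) or 1.0  # avoid dividing by zero when all gaps are 0

    # Assumed mapping: larger gap -> stronger perturbation, capped at base_strength.
    strengths = [base_strength * g / max_gap for g in gaps]

    shifted = []
    for chan, (m_i, s_i), (m_r, s_r), w in zip(inverted, stats_inv, stats_rnd, strengths):
        # Standard AdaIN: whiten with own stats, re-color with reference stats.
        adain = [(x - m_i) / (s_i + eps) * s_r + m_r for x in chan]
        # Channel-specific interpolation between the original and AdaIN output.
        shifted.append([(1.0 - w) * x + w * a for x, a in zip(chan, adain)])
    return shifted, strengths
```

Under these assumptions, a channel whose statistics already match the random latent receives a strength of 0 and passes through unchanged, while the channel with the largest gap is moved toward the noise statistics by `base_strength`.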
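Experimental Results
Research Questions
- RQ1: How does a progressive, non-binary injection schedule affect editing artifacts and background preservation?
- RQ2: Do channel-specific weights that focus perturbation on edit-relevant channels improve editing quality?
- RQ3: How does AdaEdit affect background fidelity and editing accuracy on PIE-Bench?
- RQ4: How compatible is AdaEdit with different ODE solvers in a training-free setting?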
Key Findings
| Method | LPIPS ↓ | SSIM ↑ | PSNR ↑ | CLIP ↑ |
|---|---|---|---|---|
| ProEdit | 0.2960 | 0.7244 | 19.13 | 0.2617 |
| AdaEdit (ours) | 0.2703 | 0.7433 | 19.58 | 0.2593 |
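- Compared with ProEdit, AdaEdit reduces LPIPS by 8.7% on PIE-Bench (0.2960 to 0.2703).
- SSIM improves by 2.6% over ProEdit (0.7244 to 0.7433).
- PSNR improves by 2.3% over ProEdit (19.13 to 19.58).
- CLIP similarity remains competitive, with a slight drop of 0.9% relative to ProEdit (0.2617 to 0.2593).
- Aggregated over the 10 editing types, background preservation improves on LPIPS, SSIM, and PSNR, while editing accuracy, as measured by CLIP similarity, shows no large drop.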
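The relative changes quoted in the findings can be reproduced directly from the table. The short script below is only a sanity check on that arithmetic; the helper name `relative_change` is introduced here for illustration.

```python
def relative_change(baseline, value):
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


# Values copied from the results table above.
proedit = {"LPIPS": 0.2960, "SSIM": 0.7244, "PSNR": 19.13, "CLIP": 0.2617}
adaedit = {"LPIPS": 0.2703, "SSIM": 0.7433, "PSNR": 19.58, "CLIP": 0.2593}

changes = {name: relative_change(proedit[name], adaedit[name]) for name in proedit}

if __name__ == "__main__":
    for name, pct in changes.items():
        print(f"{name}: {pct:+.2f}%")
```

The computed changes are about -8.68% for LPIPS, +2.61% for SSIM, +2.35% for PSNR, and -0.92% for CLIP. These agree with the reported 8.7%, 2.6%, and 0.9% figures; the PSNR change of roughly 2.35% is reported as 2.3%.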
This explainer was generated by AI and reviewed by human editors.