QUICK REVIEW

[Paper Review] Semi-supervised Latent Disentangled Diffusion Model for Textile Pattern Generation

Chenggong Hu, Yi Wang|arXiv (Cornell University)|Mar 17, 2026

Generative Adversarial Networks and Image Synthesis | Cited by 0

One-Sentence Summary

SLDDM-TPG is a two-stage framework that combines a latent disentangled network (LDN) with a semi-supervised latent diffusion model (S-LDM) to generate faithful textile pattern images from clothing images, achieving high fidelity on the high-resolution CTP-HD dataset and good generalization on VITON-HD.

ABSTRACT

Textile pattern generation (TPG) aims to synthesize fine-grained textile pattern images based on given clothing images. Although previous studies have not explicitly investigated TPG, existing image-to-image models appear to be natural candidates for this task. However, when applied directly, these methods often produce unfaithful results, failing to preserve fine-grained details due to feature confusion between complex textile patterns and the inherent non-rigid texture distortions in clothing images. In this paper, we propose a novel method, SLDDM-TPG, for faithful and high-fidelity TPG. Our method consists of two stages: (1) a latent disentangled network (LDN) that resolves feature confusion in clothing representations and constructs a multi-dimensional, independent clothing feature space; and (2) a semi-supervised latent diffusion model (S-LDM), which receives guidance signals from LDN and generates faithful results through semi-supervised diffusion training, combined with our designed fine-grained alignment strategy. Extensive evaluations show that SLDDM-TPG reduces FID by 4.1 and improves SSIM by up to 0.116 on our CTP-HD dataset, and also demonstrate good generalization on the VITON-HD dataset.

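Research Motivation and Objectives

Address the feature confusion that arises when generating textile patterns from clothing images.
Disentangle the clothing representation into content, structure, and defect features to improve faithfulness.
Use semi-supervised diffusion training to exploit unlabeled data and improve generalization.
Introduce a high-resolution paired dataset (CTP-HD) for TPG.
Propose alignment and local similarity mechanisms to refine pattern generation.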

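Proposed Method

Two-stage framework: a latent disentangled network (LDN) followed by a semi-supervised latent diffusion model (S-LDM).
LDN learns several feature spaces: textile pattern content (f_S^c), texture defect (f_T^c), and structured feature (f_A^c), using SCM, RAM, and SATs.
SCM uses a SimSiam-based similarity contrast to extract the shared content f_S^c between the clothing image C and the pattern P.
RAM produces stable structural features through reverse attention and SATs, and separates them using a texture triplet loss.
S-LDM performs denoising generation on labeled data and exploits unlabeled data through an alignment process that uses the STD loss and the CLS module.
The alignment losses include STD (stable transformation domain), CLS (convolutional local similarity), LPIPS, and MSE, applied within the semi-supervised framework.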
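
The two loss ideas named above, a SimSiam-style similarity objective and a triplet loss, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formulation: it assumes the standard SimSiam symmetric negative cosine similarity and the standard triplet margin loss with Euclidean distance. The function names, the plain-list feature vectors, and the default margin are all hypothetical. In real training the z inputs would be wrapped in a stop-gradient, which has no effect in this forward-only sketch.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def simsiam_loss(p1, z2, p2, z1):
    """Symmetric negative cosine similarity (standard SimSiam objective).

    p1, p2: predictor outputs for the two views (assumed here to be the
            clothing image C and the pattern P).
    z1, z2: projector outputs, treated as constants (stop-gradient) in training.
    """
    return -0.5 * (cosine_similarity(p1, z2) + cosine_similarity(p2, z1))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: pull the positive closer than the negative."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

# Hypothetical toy features, for illustration only.
shared = [1.0, 2.0, 3.0]
loss_aligned = simsiam_loss(shared, shared, shared, shared)      # close to -1
loss_separated = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])  # 0.0
loss_violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])   # 5.0
```

The SimSiam loss reaches its minimum of -1 when the two views map to the same direction in feature space, which matches the goal of extracting content shared by C and P. The triplet loss is zero once the negative is farther from the anchor than the positive by at least the margin.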

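Experimental Results

Research Questions

RQ1: Can disentangling the latent representation into content, structure, and defect components alleviate clothing feature confusion?
RQ2: Does semi-supervised latent diffusion with alignment signals improve the fidelity and robustness of textile pattern generation?
RQ3: How do the proposed STD and CLS modules affect pattern fidelity and local periodicity?
RQ4: How well does SLDDM-TPG generalize to high-resolution datasets and unseen clothing?
RQ5: Does the newly introduced CTP-HD dataset bring improvements over the baselines?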

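Key Findings

On CTP-HD, SLDDM-TPG achieves a lower FID and a higher SSIM than the baselines.
On CTP-HD, compared with the best baseline, SLDDM-TPG reduces FID by 4.10, improves SSIM by 0.116, and reaches an FPS of 0.875.
For generalization on VITON-HD, SLDDM-TPG scores better than the compared methods on LPIPS (VLS) and CTS.
Ablation results show that removing LDN components degrades performance, with the textile pattern content feature f_S^c having the most pronounced effect.
The CLS and STD components improve local similarity and alignment stability, contributing to higher fidelity.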
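
To give a sense of scale for the SSIM figures above, here is a simplified sketch of the SSIM formula computed once over a whole image. It assumes the standard constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2 for data range L, and it omits the sliding Gaussian window used by common implementations, so its values will not match the paper's evaluation pipeline. The function name and the toy pixel lists are hypothetical.

```python
def global_ssim(x, y, data_range=1.0):
    """SSIM over flattened pixel lists using global statistics (no windowing)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical 4-pixel "images" in [0, 1], for illustration only.
pattern = [0.1, 0.5, 0.9, 0.3]
inverted = [1.0 - p for p in pattern]
same_score = global_ssim(pattern, pattern)       # identical images score 1
inverted_score = global_ssim(pattern, inverted)  # anti-correlated structure scores below 0
```

SSIM is at most 1, so an absolute gain of 0.116 is a substantial shift on this scale.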

This review was generated by AI and reviewed by human editors.