QUICK REVIEW

[Paper Review] SwiftTailor: Efficient 3D Garment Generation with Geometry Image Representation

Phuc Pham, Uy Dieu Tran|arXiv (Cornell University)|Mar 19, 2026

3D Shape Modeling and Analysis | Cited by 0

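One-Sentence Summary

SwiftTailor proposes a two-stage framework (PatternMaker and GarmentSewer) that generates 3D garments through a novel Garment Geometry Image, enabling fast end-to-end construction from sewing pattern to mesh without a physics-based simulation step.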

ABSTRACT

Realistic and efficient 3D garment generation remains a longstanding challenge in computer vision and digital fashion. Existing methods typically rely on large vision-language models to produce serialized representations of 2D sewing patterns, which are then transformed into simulation-ready 3D meshes using a garment modeling framework such as GarmentCode. Although these approaches yield high-quality results, they often suffer from slow inference times, ranging from 30 seconds to a minute. In this work, we introduce SwiftTailor, a novel two-stage framework that unifies sewing-pattern reasoning and geometry-based mesh synthesis through a compact geometry image representation. SwiftTailor comprises two lightweight modules: PatternMaker, an efficient vision-language model that predicts sewing patterns from diverse input modalities, and GarmentSewer, an efficient dense prediction transformer that converts these patterns into a novel Garment Geometry Image, encoding the 3D surface of all garment panels in a unified UV space. The final 3D mesh is reconstructed through an efficient inverse mapping process that incorporates remeshing and dynamic stitching algorithms to directly assemble the garment, thereby amortizing the cost of physical simulation. Extensive experiments on the Multimodal GarmentCodeData demonstrate that SwiftTailor achieves state-of-the-art accuracy and visual fidelity while significantly reducing inference time. This work offers a scalable, interpretable, and high-performance solution for next-generation 3D garment generation.

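Research Motivation and Goals

Advance efficient, interpretable 3D garment generation that aligns with industrial workflows.
Introduce a compact end-to-end pipeline from sewing pattern to 3D mesh that avoids physical simulation.
Propose the Garment Geometry Image (GGI) as a unified 2D representation of the 3D garment surface.
Develop PatternMaker for sewing-pattern reasoning and GarmentSewer for predicting geometry from patterns.
Demonstrate state-of-the-art accuracy and significantly reduced inference time on the Multimodal GarmentCodeData benchmark.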

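Proposed Method

PatternMaker is a lightweight multimodal language model that predicts sewing patterns from text or image inputs.
GarmentSewer is a dense prediction transformer that maps semantic sewing-pattern information to a Garment Geometry Image (GGI).
The GGI repacks the semantic, geometric, and stitching components of a sewing pattern into a unified UV space.
An inverse-mapping post-processing step (mesh reconstruction and stitching) rebuilds the final 3D mesh from the GGI without physically simulated sewing.
Training uses regression, stitching, and normal-regularization losses, combined with edge-aware weighting to emphasize boundaries.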
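
The inverse mapping step can be illustrated with the classic geometry-image idea: every valid pixel stores one (x, y, z) surface position, and neighboring valid pixels are connected into triangles. The following is a minimal sketch under stated assumptions, not the paper's implementation. The function name `geometry_image_to_mesh`, the `(H, W, 3)` array layout, and the boolean validity mask are all hypothetical, and the remeshing and dynamic stitching stages mentioned in the abstract are omitted.

```python
import numpy as np

def geometry_image_to_mesh(positions, mask):
    """Triangulate the valid pixels of a geometry image.

    positions: (H, W, 3) float array, one 3D point per pixel (assumed layout).
    mask:      (H, W) bool array, True where a pixel belongs to a panel.
    Returns (vertices, faces): (N, 3) floats and (M, 3) int vertex indices.
    """
    h, w = mask.shape
    # Map each valid pixel to a compact vertex index; -1 marks empty pixels.
    index = -np.ones((h, w), dtype=np.int64)
    index[mask] = np.arange(mask.sum())
    vertices = positions[mask]

    faces = []
    for i in range(h - 1):
        for j in range(w - 1):
            a, b = index[i, j], index[i, j + 1]
            c, d = index[i + 1, j], index[i + 1, j + 1]
            # Emit a triangle only when all three of its corners are valid.
            if a >= 0 and b >= 0 and c >= 0:
                faces.append((a, b, c))
            if b >= 0 and d >= 0 and c >= 0:
                faces.append((b, d, c))
    return vertices, np.array(faces, dtype=np.int64).reshape(-1, 3)

# Toy input: a flat 3x3 patch with one corner pixel marked as empty.
ys, xs = np.mgrid[0:3, 0:3]
positions = np.stack([xs, ys, np.zeros_like(xs)], axis=-1).astype(float)
mask = np.ones((3, 3), dtype=bool)
mask[2, 2] = False
vertices, faces = geometry_image_to_mesh(positions, mask)
print(vertices.shape, faces.shape)  # (8, 3) (7, 3)
```

Because each panel occupies its own region of the image, this per-pixel triangulation only produces disconnected panel surfaces. Joining panels along their seams would need a separate stitching pass, which this sketch does not attempt.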

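Experimental Results

Research Questions

RQ1: Can PatternMaker generate accurate, topologically valid sewing patterns from multimodal inputs more efficiently than larger LLM baselines?
RQ2: Can GarmentSewer reliably predict dense geometry images, enabling accurate 3D garment reconstruction without physical simulation?
RQ3: Can the GGI effectively bridge 2D pattern reasoning and 3D mesh construction across modalities and tasks?
RQ4: How does SwiftTailor compare with GarmentCode-based pipelines in accuracy, diversity, and inference time?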

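Key Findings

PatternMaker achieves higher pattern accuracy and topological validity with only 30% of the parameters of the large baseline.
SwiftTailor achieves state-of-the-art MMD and COV for garment mesh generation, with faster second-stage inference (0.02 s) and overall inference roughly 4× faster than GarmentCode-based baselines.
Through the learnable GarmentSewer, the GGI enables efficient conversion from sewing patterns to 3D meshes, outperforming the physics-based GarmentCode in initial-state quality and stability.
Semantic UV mapping is essential for GarmentSewer to preserve topology and stitching integrity.
Ablation studies show that edge-aware regression and the stitching loss are essential for high-quality seam alignment and geometry.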
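
MMD (minimum matching distance) and COV (coverage) are usually computed from pairwise distances between generated and reference shapes. The sketch below follows the definitions commonly used in the shape-generation literature, with a symmetric Chamfer distance on sampled points. Whether the paper uses exactly this distance, sampling, or normalization is an assumption, and the function names are illustrative only.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point sets p (N, 3) and q (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def mmd_and_cov(generated, reference):
    """Return (MMD, COV) for two lists of point clouds.

    MMD: mean over reference shapes of the distance to the closest
         generated shape (lower is better).
    COV: fraction of reference shapes that are the nearest neighbor of
         at least one generated shape (higher is better).
    """
    # dist[g, r] is the distance from generated shape g to reference shape r.
    dist = np.array([[chamfer_distance(g, r) for r in reference]
                     for g in generated])
    mmd = dist.min(axis=0).mean()
    cov = len(np.unique(dist.argmin(axis=1))) / len(reference)
    return mmd, cov

# Toy check: two generated shapes that both sit near the same reference shape.
rng = np.random.default_rng(0)
shape_a = rng.normal(size=(64, 3))
shape_b = shape_a + 10.0
mmd, cov = mmd_and_cov([shape_a, shape_a + 0.01], [shape_a, shape_b])
print(cov)  # 0.5: only one of the two reference shapes is covered
```

The toy case shows why the two metrics are reported together: a generator that collapses onto one reference shape can match that shape closely while still covering only part of the reference set.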

This review was generated by AI and reviewed by human editors.