QUICK REVIEW

[Paper Review] The Virtual Tailor: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style

Chaitanya Patel, Zhouyingcheng Liao|arXiv (Cornell University)|Mar 10, 2020

3D Shape Modeling and Analysis | References: 48 | Cited by: 13

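One-Sentence Summary

TailorNet is a neural model that predicts detailed 3D clothing deformation by jointly modeling pose, shape, and garment style, decomposing the deformation into low-frequency and high-frequency components. Its results retain wrinkles while running more than 1000 times faster than physics-based simulation, and the model remains differentiable and temporally coherent on motion sequences.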

ABSTRACT

In this paper, we present TailorNet, a neural model which predicts clothing deformation in 3D as a function of three factors: pose, shape and style (garment geometry), while retaining wrinkle detail. This goes beyond prior models, which are either specific to one style and shape, or generalize to different shapes producing smooth results, despite being style specific. Our hypothesis is that (even non-linear) combinations of examples smooth out high frequency components such as fine-wrinkles, which makes learning the three factors jointly hard. At the heart of our technique is a decomposition of deformation into a high frequency and a low frequency component. While the low-frequency component is predicted from pose, shape and style parameters with an MLP, the high-frequency component is predicted with a mixture of shape-style specific pose models. The weights of the mixture are computed with a narrow bandwidth kernel to guarantee that only predictions with similar high-frequency patterns are combined. The style variation is obtained by computing, in a canonical pose, a subspace of deformation, which satisfies physical constraints such as inter-penetration, and draping on the body. TailorNet delivers 3D garments which retain the wrinkles from the physics based simulations (PBS) it is learned from, while running more than 1000 times faster. In contrast to PBS, TailorNet is easy to use and fully differentiable, which is crucial for computer vision algorithms. Several experiments demonstrate TailorNet produces more realistic results than prior work, and even generates temporally coherent deformations on sequences of the AMASS dataset, despite being trained on static poses from a different dataset. To stimulate further research in this direction, we will make a dataset consisting of 55800 frames, as well as our model publicly available at this https URL.
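The frequency decomposition at the heart of the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that a few rounds of neighbor averaging act as the low-pass filter, and the function name `split_frequencies`, its parameters, and the toy ring mesh are all hypothetical. The point is only that the low-frequency part keeps the overall shape, and the residual carries the wrinkle-like detail.

```python
import numpy as np

def split_frequencies(displacements, neighbors, iterations=20, step=0.5):
    """Split per-vertex displacements into low- and high-frequency parts.

    displacements: (V, 3) array of per-vertex garment displacements.
    neighbors: list of index lists; neighbors[i] holds the vertices adjacent to i.
    Assumption: repeated neighbor averaging is used as the low-pass filter.
    """
    low = displacements.astype(float).copy()
    for _ in range(iterations):
        # Average each vertex's neighbors, then blend toward that average.
        avg = np.stack([low[n].mean(axis=0) for n in neighbors])
        low = (1.0 - step) * low + step * avg
    high = displacements - low  # residual = fine detail removed by smoothing
    return low, high

# Toy example: a closed ring of 8 vertices (hypothetical stand-in for a mesh).
V = 8
neighbors = [[(i - 1) % V, (i + 1) % V] for i in range(V)]
smooth = np.zeros((V, 3))
smooth[:, 2] = 1.0                                  # constant offset, no detail
wrinkle = np.zeros((V, 3))
wrinkle[:, 2] = 0.1 * (-1.0) ** np.arange(V)        # alternating zig-zag detail
low, high = split_frequencies(smooth + wrinkle, neighbors)
```

By construction the two parts always sum back to the input, so each component can be learned by a separate predictor and the results added afterwards.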

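Motivation and Goals

Address the challenge of learning high-fidelity 3D clothing deformation that generalizes across pose, shape, and garment style while retaining fine wrinkles.
Overcome the limitations of prior models, which either lack detail or fail to generalize across styles and shapes.
Provide fast, differentiable, and temporally coherent 3D clothing prediction for computer vision and animation pipelines.
Develop a method that preserves high-frequency wrinkle detail by separating and modeling the low-frequency and high-frequency deformation components.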

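Proposed Method

Decompose 3D clothing deformation into low-frequency and high-frequency components so that fine wrinkles can be isolated and modeled separately.
Predict the low-frequency component from pose, shape, and style parameters with a multilayer perceptron (MLP).
Predict the high-frequency component with a mixture of shape-style-specific pose models, using narrow-bandwidth kernel weighting so that only similar high-frequency patterns are combined.
Construct a canonical-pose deformation subspace that satisfies physical constraints such as non-penetration and correct draping.
Train the model end to end on static 3D garment poses from a diverse dataset, enabling generalization to unseen combinations of pose, shape, and style.
Exploit differentiability to integrate with optimization-based computer vision pipelines, and achieve temporal coherence on motion sequences despite training only on static data.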
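The kernel-weighted mixture for the high-frequency component can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the Gaussian kernel over shape-style parameter vectors, the bandwidth value, the names `mixture_weights` and `predict_displacement`, and the random arrays standing in for network outputs are all hypothetical. It shows how a narrow bandwidth makes the nearest shape-style prototype dominate, so dissimilar wrinkle patterns are not averaged together.

```python
import numpy as np

def mixture_weights(query, prototypes, bandwidth=0.05):
    """Normalized Gaussian kernel weights over prototype shape-style vectors.

    query: (D,) shape-style parameters of the target garment and body.
    prototypes: (K, D) parameters of the K shape-style-specific models.
    Assumption: distance is measured directly in parameter space.
    """
    sq_dist = np.sum((prototypes - query) ** 2, axis=1)
    # Shift by the minimum distance for numerical stability before exponentiating.
    logits = -(sq_dist - sq_dist.min()) / (2.0 * bandwidth ** 2)
    w = np.exp(logits)
    return w / w.sum()

def predict_displacement(low_freq, high_freq_preds, weights):
    """Add the low-frequency part (V, 3) to the weighted sum of K high-frequency parts (K, V, 3)."""
    return low_freq + np.tensordot(weights, high_freq_preds, axes=1)

# Hypothetical stand-ins for network outputs on a 4-vertex garment.
rng = np.random.default_rng(0)
low_freq = rng.normal(size=(4, 3))            # would come from the MLP
high_freq_preds = rng.normal(size=(3, 4, 3))  # one prediction per prototype model
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
query = np.array([0.02, 0.0])                 # very close to the first prototype

weights = mixture_weights(query, prototypes)
pred = predict_displacement(low_freq, high_freq_preds, weights)
```

With a wide bandwidth the weights would approach uniform and the mixture would blend all prototype predictions, which is the smoothing effect the method is designed to avoid.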

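Experimental Results

Research Questions

RQ1: Can a deep learning model jointly predict 3D clothing deformation across diverse poses, shapes, and garment styles while retaining fine wrinkle detail?
RQ2: Does decomposing deformation into low-frequency and high-frequency components model high-frequency features such as wrinkles better than end-to-end learning?
RQ3: Can a model trained on static poses generalize to motion sequences and produce temporally coherent deformations?
RQ4: How does the proposed method compare with physics-based simulation in realism, speed, and differentiability?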

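Key Findings

TailorNet produces 3D garments that retain the fine wrinkle detail of the physics-based simulations it was trained on.
The model runs more than 1000 times faster than physics-based simulation while maintaining high visual fidelity.
Although trained only on static poses, TailorNet generates temporally coherent deformations on motion sequences from the AMASS dataset.
A kernel-weighted mixture of shape-style-specific high-frequency models yields accurate, localized wrinkle prediction.
The model is fully differentiable and suitable for integration into optimization-based computer vision pipelines.
The method generalizes well to unseen combinations of pose, shape, and style, with greater realism than prior style-specific or shape-generalizing models.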


This review was generated by AI and reviewed by a human editor.