QUICK REVIEW

[Paper Review] SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation

Mu Huang, Hui Wang|arXiv (Cornell University)|Feb 2, 2026

Robot Manipulation and Learning|Cited by 0

One-Sentence Summary

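SoMA is a real-to-sim neural simulator that models deformable-object dynamics on Gaussian splats, supporting stable, long-horizon, action-conditioned soft-body manipulation with improved resimulation and generalization. It reports state-of-the-art RGB/depth fidelity and supports complex tasks such as T-shirt folding.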

ABSTRACT

Simulating deformable objects under rich interactions remains a fundamental challenge for real-to-sim robot manipulation, with dynamics jointly driven by environmental effects and robot actions. Existing simulators rely on predefined physics or data-driven dynamics without robot-conditioned control, limiting accuracy, stability, and generalization. This paper presents SoMA, a 3D Gaussian Splat simulator for soft-body manipulation. SoMA couples deformable dynamics, environmental forces, and robot joint actions in a unified latent neural space for end-to-end real-to-sim simulation. Modeling interactions over learned Gaussian splats enables controllable, stable long-horizon manipulation and generalization beyond observed trajectories without predefined physical models. SoMA improves resimulation accuracy and generalization on real-world robot manipulation by 20%, enabling stable simulation of complex tasks such as long-horizon cloth folding.

Motivation and Goals

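Bridge physics-based and data-driven approaches in real-to-sim soft-body manipulation.
Represent deformable objects, the environment, and robot actions in a unified learned space.
Achieve action-conditioned, stable long-horizon simulation that can handle occlusion.
Provide a scalable training strategy that generalizes beyond the observed trajectories.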

Proposed Method

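Represent deformable objects as a hierarchical graph of Gaussian splats with learned dynamics.
Establish a robot-conditioned real-to-sim mapping that anchors the dynamics to robot joint actions.
Model force-driven interactions: environmental and robot forces are applied to the splats and propagated through the hierarchy.
Use two-stage multi-resolution training, going from coarse to fine in both temporal and image resolution, to stabilize long-horizon dynamics.
Apply occlusion-aware image supervision, combining a mask loss with momentum-consistency regularization to cover unobserved regions.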
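
The occlusion-aware supervision in the last item can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the L1 form of the image loss, and the impulse-based form of the momentum penalty are all assumptions made for illustration, since the summary does not state the exact losses. The idea shown is that occluded pixels are dropped from the image loss, while a physics-style penalty still constrains splats the camera cannot see.

```python
from typing import List, Sequence

Vec3 = Sequence[float]


def masked_l1_loss(pred: Sequence[float],
                   target: Sequence[float],
                   visible: Sequence[bool]) -> float:
    """Mean absolute error over visible pixels only (assumed loss form).

    Pixels flagged as occluded contribute nothing, so the robot arm
    covering the object does not pull the prediction toward wrong colors.
    """
    total = 0.0
    count = 0
    for p, t, v in zip(pred, target, visible):
        if v:
            total += abs(p - t)
            count += 1
    return total / count if count else 0.0


def total_momentum(masses: Sequence[float],
                   velocities: Sequence[Vec3]) -> List[float]:
    """Sum of m_i * v_i over all splats, per axis."""
    return [sum(m * v[k] for m, v in zip(masses, velocities))
            for k in range(3)]


def momentum_consistency_penalty(masses: Sequence[float],
                                 vel_before: Sequence[Vec3],
                                 vel_after: Sequence[Vec3],
                                 impulse: Vec3) -> float:
    """Squared mismatch between the momentum change of all splats and the
    external impulse applied during the step (hypothetical regularizer)."""
    p0 = total_momentum(masses, vel_before)
    p1 = total_momentum(masses, vel_after)
    return sum((p1[k] - p0[k] - impulse[k]) ** 2 for k in range(3))
```

In a training loop the two terms would typically be combined as a weighted sum; the weighting is a tuning choice that this summary does not specify.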

Figure 2 : Framework of SoMA. SoMA takes RGB observations and robot joint-space actions collected from real-world manipulation as input (Left). It reconstructs deformable objects as hierarchical Gaussian splats, and propagates them through a neural simulator with supervision from rendering and dynam
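
To make the idea of propagating forces through a splat hierarchy concrete, here is a toy two-level sketch. It is an illustration under strong assumptions, not SoMA's simulator: SoMA learns its update in a latent neural space, whereas this sketch uses fixed rules (sum forces bottom-up, copy displacements top-down). The names `aggregate_forces` and `propagate_displacements` and the cluster layout are hypothetical.

```python
from typing import Dict, List, Sequence

Vec3 = Sequence[float]


def aggregate_forces(children: Dict[int, List[int]],
                     forces: Sequence[Vec3]) -> Dict[int, List[float]]:
    """Bottom-up pass: each cluster's force is the sum of the forces
    acting on its child splats."""
    return {
        cid: [sum(forces[i][k] for i in idxs) for k in range(3)]
        for cid, idxs in children.items()
    }


def propagate_displacements(children: Dict[int, List[int]],
                            cluster_disp: Dict[int, Vec3],
                            positions: Sequence[Vec3]) -> List[List[float]]:
    """Top-down pass: every splat is moved by its cluster's displacement.

    A learned simulator would predict per-splat corrections on top of
    this; here the coarse motion is simply copied to the children.
    """
    new_pos = [list(p) for p in positions]
    for cid, idxs in children.items():
        for i in idxs:
            for k in range(3):
                new_pos[i][k] += cluster_disp[cid][k]
    return new_pos


# Usage: three splats grouped into two clusters.
children = {0: [0, 1], 1: [2]}
forces = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, -9.8]]
cluster_forces = aggregate_forces(children, forces)
```

The hierarchy lets a local contact (for example a gripper pushing one splat) influence distant splats in a few coarse steps instead of many fine-grained ones, which is one plausible reason a hierarchical representation helps long-horizon stability.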

Experimental Results

Research Questions

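RQ1: Can SoMA accurately resimulate deformable-object dynamics under diverse robot actions?
RQ2: Does robot conditioning improve generalization to unseen actions and contact configurations?
RQ3: Do multi-resolution training and hybrid supervision stabilize long-horizon soft-body manipulation?
RQ4: How does SoMA compare with physics-based and neural baselines on RGB/depth fidelity in real-to-sim manipulation?
RQ5: Can SoMA handle complex tasks under occlusion, such as long-horizon T-shirt folding?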

Main Findings

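SoMA reaches state-of-the-art RGB and depth fidelity, outperforming the baselines on both resimulation and generalization tasks.
It supports long-horizon, interaction-consistent simulation of deformable objects under robot manipulation.
In the reported experiments, SoMA generalizes better than PhysTwin and GausSim to unseen actions and contact configurations.
On the T-shirt folding task, SoMA produces more consistent geometry and dynamics than the baselines, with fewer artifacts.
Ablation studies show that multi-resolution training and hybrid supervision are important for stability and generalization.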
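
For readers who want to quantify "RGB fidelity" themselves, peak signal-to-noise ratio (PSNR) is one common choice. The sketch below is an assumption for illustration: this summary does not say which image metrics the paper reports, and the helper names `mse` and `psnr` are hypothetical. Pixel values are assumed to lie in `[0, max_value]`.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred: Sequence[float], target: Sequence[float],
         max_value: float = 1.0) -> float:
    """PSNR in decibels; higher means the rendering is closer to the
    reference. Returns infinity when the images are identical."""
    err = mse(pred, target)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)
```

Because PSNR is logarithmic in the mean squared error, a gain of a few decibels corresponds to a large relative reduction in pixel error, which is worth keeping in mind when comparing simulators.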

Figure 3 : Qualitative resimulation and generalization under robot manipulation. Left: resimulation on training trajectories. Right: generalization to unseen robot actions and contact configurations. Across diverse soft-body objects, including near-linear (rope), near-planar (cloth), and volumetric

This review was generated by AI and checked by a human editor.