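[Paper Digest] Rapid Adaptation of Particle Dynamics for Generalized Deformable Object Mobile Manipulation
RAPiD uses a two-phase sim-to-real learning method that infers shape and dynamics embeddings from privileged simulation data and real-world visual observations, enabling rapid adaptation to unknown deformable object dynamics and achieving success rates above 80% on two real-world tasks.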
We address the challenge of learning to manipulate deformable objects with unknown dynamics. In non-rigid objects, the dynamics parameters define how they react to interactions -- how they stretch, bend, compress, and move -- and they are critical to determining the optimal actions to perform a manipulation task successfully. In other robotic domains, such as legged locomotion and in-hand rigid object manipulation, state-of-the-art approaches can handle unknown dynamics using Rapid Motor Adaptation (RMA). Through a supervised procedure in simulation that encodes each rigid object's dynamics, such as mass and position, these approaches learn a policy that conditions actions on a vector of latent dynamic parameters inferred from sequences of state-actions. However, in deformable object manipulation, the object's dynamics include not only its mass and position but also how the shape of the object changes. Our key insight is that the recent ground-truth particle positions of a deformable object in simulation capture changes in the object's shape, making it possible to extend RMA to deformable object manipulation. This key insight allows us to develop RAPiD, a two-phase method that learns to perform real-robot deformable object mobile manipulation by: 1) learning a visuomotor policy conditioned on the object's dynamics embedding, which is encoded from the object's privileged information in simulation, such as its mass and ground-truth particle positions, and 2) learning to infer this embedding using non-privileged information instead, such as robot visual observations and actions, so that the learned policy can transfer to the real world. On a mobile manipulator with 22 degrees of freedom, RAPiD enables success rates above 80% across two vision-based deformable object mobile manipulation tasks in the real world, under various object dynamics, categories, and instances.
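Research Motivation and Objectives
- Advance the manipulation of deformable objects with unknown dynamics in real-world settings.
- Develop a two-phase learning framework that uses privileged simulation data and non-privileged real-world observations to infer object dynamics.
- Transfer directly from simulation to the real robot in a single step, using only onboard sensors.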
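Proposed Method
- Train a visuomotor policy conditioned on a dynamics embedding and a shape embedding, using privileged simulation data.
- Replace the encoders with shape adaptation and dynamics adaptation modules, trained with an L1 loss to infer the embeddings from depth images and actions.
- Train in simulation with reinforcement learning, then fine-tune the policy on non-privileged inputs for real-world transfer.
- Deploy the policy with onboard depth images and robot actions as input, updating the embeddings periodically (every 5 time steps).
- Split training into Phase I (encoders) and Phase II (adapters) to keep privileged and non-privileged inputs separate.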
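The Phase II regression and the periodic embedding update described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear adapter, the random features standing in for depth-image and action histories, the dimensions, the learning rate, and the toy policy are all assumptions made for illustration. Only the L1 loss, the use of embeddings from the privileged encoders as regression targets, and the 5-step update interval come from the summary.

```python
import numpy as np

UPDATE_EVERY = 5   # embedding refresh interval, taken from the summary above
FEATURE_DIM = 16   # hypothetical size of the depth-and-action history features
EMBED_DIM = 8      # hypothetical embedding size


def l1_loss(predicted, target):
    """Mean absolute error between adapter output and the Phase I embedding."""
    return float(np.mean(np.abs(predicted - target)))


class LinearAdapter:
    """Toy stand-in for an adaptation module: history features -> embedding."""

    def __init__(self, rng):
        self.weights = rng.normal(scale=0.1, size=(FEATURE_DIM, EMBED_DIM))

    def __call__(self, features):
        return features @ self.weights

    def l1_step(self, features, target, lr=0.5):
        """Take one subgradient step on the L1 loss; return the loss before it."""
        predicted = self(features)
        loss = l1_loss(predicted, target)
        # Subgradient of mean|predicted - target| with respect to the weights.
        grad = features.T @ np.sign(predicted - target) / predicted.size
        self.weights -= lr * grad
        return loss


def train_adapter(adapter, features, target_embeddings, steps=500):
    """Phase II sketch: regress the frozen Phase I embeddings with an L1 loss."""
    return [adapter.l1_step(features, target_embeddings) for _ in range(steps)]


def deploy(adapter, policy, get_features, num_steps):
    """Roll out the policy, re-inferring the embedding every UPDATE_EVERY steps."""
    embedding, update_steps, actions = None, [], []
    for t in range(num_steps):
        if t % UPDATE_EVERY == 0:
            embedding = adapter(get_features(t))
            update_steps.append(t)
        actions.append(policy(embedding))
    return update_steps, actions


rng = np.random.default_rng(0)
features = rng.normal(size=(64, FEATURE_DIM))
# Stand-in for embeddings produced by the privileged encoder in simulation.
privileged_encoder = rng.normal(size=(FEATURE_DIM, EMBED_DIM))
targets = features @ privileged_encoder

adapter = LinearAdapter(rng)
losses = train_adapter(adapter, features, targets)

update_steps, actions = deploy(
    adapter,
    policy=lambda emb: np.tanh(emb[:2]),  # hypothetical 2-D action head
    get_features=lambda t: features[t % len(features)],
    num_steps=12,
)
print(update_steps)  # [0, 5, 10]
```

In this sketch the policy reuses the most recent embedding between refreshes, so the adapter runs on only one step in five. A real system would replace the linear map with learned networks over depth images and action histories.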
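Experimental Results
Research Questions
- RQ1: Can RAPiD generalize deformable object manipulation to unseen dynamics, categories, and instances in the real world?
- RQ2: How much do the shape adaptation and dynamics adaptation modules contribute to performance on deformable object tasks?
- RQ3: Is inferring changes in the object's shape critical to manipulating deformable objects successfully?
- RQ4: Can end-to-end reinforcement learning without the two adaptation phases also converge, or does it fall short of two-phase training?
- RQ5: How does RAPiD compare with sim-to-real baselines on real-robot tasks?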
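Key Findings
- RAPiD significantly outperforms the DMfD and DDOD baselines on both tasks.
- Under unseen dynamics, RAPiD reaches success rates of 85% on 1D_Inserting and 80% on 2D_Covering.
- Ablations show that performance drops by 52.5% without the adaptation modules and by 42.5% without the shape adaptation module.
- End-to-end (E2E) training lowers the success rate by about 60% and converges unstably.
- The two-phase method achieves robust zero-shot transfer to real-world objects across different dynamics, categories, and instances.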
This digest was generated by AI and reviewed by a human editor.