QUICK REVIEW

[Paper Review] Rapid Adaptation of Particle Dynamics for Generalized Deformable Object Mobile Manipulation

Bohan Wu, Roberto Martín-Martín|arXiv (Cornell University)|2026. 03. 18.

Robot Manipulation and Learning | Citations: 0

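One-Line Summary

RAPiD adapts quickly to the dynamics of unseen deformable objects through two-phase sim-to-real learning: it first encodes a shape embedding and a dynamics embedding from privileged simulation data, then learns to infer those embeddings from non-privileged inputs such as robot visual observations and actions. It achieves success rates above 80% on two real-world tasks.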

ABSTRACT

We address the challenge of learning to manipulate deformable objects with unknown dynamics. In non-rigid objects, the dynamics parameters define how they react to interactions -- how they stretch, bend, compress, and move -- and they are critical to determining the optimal actions to perform a manipulation task successfully. In other robotic domains, such as legged locomotion and in-hand rigid object manipulation, state-of-the-art approaches can handle unknown dynamics using Rapid Motor Adaptation (RMA). Through a supervised procedure in simulation that encodes each rigid object's dynamics, such as mass and position, these approaches learn a policy that conditions actions on a vector of latent dynamic parameters inferred from sequences of state-actions. However, in deformable object manipulation, the object's dynamics not only includes its mass and position, but also how the shape of the object changes. Our key insight is that the recent ground-truth particle positions of a deformable object in simulation capture changes in the object's shape, making it possible to extend RMA to deformable object manipulation. This key insight allows us to develop RAPiD, a two-phase method that learns to perform real-robot deformable object mobile manipulation by: 1) learning a visuomotor policy conditioned on the object's dynamics embedding, which is encoded from the object's privileged information in simulation, such as its mass and ground-truth particle positions, and 2) learning to infer this embedding using non-privileged information instead, such as robot visual observations and actions, so that the learned policy can transfer to the real world. On a mobile manipulator with 22 degrees of freedom, RAPiD enables over 80%+ success rates across two vision-based deformable object mobile manipulation tasks in the real world, under various object dynamics, categories, and instances.

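Motivation and Objectives

Manipulate deformable objects whose dynamics are unknown, in real-world settings.
Develop a two-phase learning framework that infers object dynamics, using privileged simulation data in the first phase and non-privileged observations in the second.
Enable zero-shot transfer from simulation to a real robot using only onboard sensors.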

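Proposed Method

Phase I trains a visuomotor policy conditioned on a Dynamics Embedding and a Shape Embedding, both encoded from privileged simulation data.
Phase II replaces the encoders with Shape Adaptation and Dynamics Adaptation modules, which estimate the embeddings from depth images and actions and are trained with an L1 loss.
Training uses RL in simulation; fine-tuning on non-privileged inputs is what makes transfer to the real world possible.
At deployment, the embeddings are re-estimated periodically (every 5 timesteps) from onboard depth images and robot actions.
Splitting training into Phase I (encoders) and Phase II (adaptation modules) keeps privileged and non-privileged inputs separate.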
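The two mechanisms named in the list above, an L1 loss between the adaptation modules' estimates and the encoders' embeddings, and a periodic embedding refresh during rollout, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `l1_loss`, `run_episode`, `adapt`, `policy`, and `UPDATE_EVERY` are hypothetical names, the mean reduction in the loss is an assumption, and so is the choice to refresh at timestep 0.

```python
from typing import Any, Callable, List, Sequence, Tuple

UPDATE_EVERY = 5  # the review states that embeddings are updated every 5 timesteps


def l1_loss(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between an estimated and a target embedding.

    Assumption: mean reduction; the review only says "L1 loss".
    """
    if len(predicted) != len(target):
        raise ValueError("embeddings must have the same length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def run_episode(
    observations: Sequence[Any],
    adapt: Callable[[List[Tuple[Any, Any]]], Any],
    policy: Callable[[Any, Any], Any],
    update_every: int = UPDATE_EVERY,
) -> List[Any]:
    """Roll out `policy`, re-estimating the embedding every `update_every` steps.

    `adapt(history)` stands in for the adaptation modules: it maps the
    (observation, action) pairs seen so far to an embedding.
    `policy(observation, embedding)` stands in for the conditioned policy.
    Both callables are placeholders supplied by the caller.
    """
    history: List[Tuple[Any, Any]] = []
    actions: List[Any] = []
    embedding = None
    for t, obs in enumerate(observations):
        if t % update_every == 0:  # refresh at t = 0, 5, 10, ...
            embedding = adapt(history)
        action = policy(obs, embedding)
        history.append((obs, action))
        actions.append(action)
    return actions
```

Between refreshes the policy keeps acting on the most recent embedding, so the adaptation modules run once per five policy steps in this sketch.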

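Experimental Results

Research Questions

RQ1: Can RAPiD generalize deformable object manipulation to dynamics, categories, and instances unseen in the real world?
RQ2: How important are the Shape Adaptation and Dynamics Adaptation modules to performance on deformable object tasks?
RQ3: Is estimating the object's shape change essential to manipulating deformable objects successfully?
RQ4: Does end-to-end RL, without the two-phase split, converge as well as two-phase training?
RQ5: How does RAPiD perform on real-robot tasks compared with sim-to-real baselines?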

Key Results

Method	1D_Inserting successes (of 20)	2D_Covering successes (of 20)	Total successes (of 40)	Overall success rate
RAPiD	17	16	33 / 40	82.5%
DMfD	3	1	4 / 40	10%
DDOD	2	5	7 / 40	17.5%
RAPiD-No-Adapt	7	5	12 / 40	30%
RAPiD-No-Shape	7	9	16 / 40	40%
RAPiD-E2E	5	4	9 / 40	22.5%
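The overall success rates in the table follow directly from the per-task counts. The short check below recomputes them; the counts are copied from the table, while `TRIALS_PER_TASK`, `successes`, and `overall_rate` are names introduced here for illustration.

```python
TRIALS_PER_TASK = 20  # each task was evaluated over 20 trials per method

# (1D_Inserting successes, 2D_Covering successes), copied from the table
successes = {
    "RAPiD": (17, 16),
    "DMfD": (3, 1),
    "DDOD": (2, 5),
    "RAPiD-No-Adapt": (7, 5),
    "RAPiD-No-Shape": (7, 9),
    "RAPiD-E2E": (5, 4),
}


def overall_rate(inserting: int, covering: int,
                 trials_per_task: int = TRIALS_PER_TASK) -> float:
    """Overall success rate in percent across both tasks."""
    return 100.0 * (inserting + covering) / (2 * trials_per_task)


rates = {method: overall_rate(*counts) for method, counts in successes.items()}

for method, rate in rates.items():
    print(f"{method:<15} {rate:5.1f}%")
```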

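RAPiD outperforms the DMfD and DDOD baselines by a wide margin on both tasks (82.5% overall versus 10% and 17.5%).
Under unseen dynamics, RAPiD succeeds in 85% of 1D_Inserting trials (17 of 20) and 80% of 2D_Covering trials (16 of 20).
In the ablations, removing the adaptation modules (RAPiD-No-Adapt) lowers overall success by 52.5 percentage points, and removing only the Shape Adaptation module (RAPiD-No-Shape) lowers it by 42.5 percentage points.
End-to-end training (RAPiD-E2E) lowers overall success by 60 percentage points and does not converge stably.
The two-phase approach enables robust zero-shot transfer to real-world objects across a range of dynamics, categories, and instances.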
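The drops of 52.5, 42.5, and 60 quoted for the ablations are differences in overall success rate against RAPiD's 82.5%, that is, percentage points. The sketch below separates that absolute drop from the relative reduction, which is a different number; `success_drop` is a helper name introduced here, and the rates are the overall figures from the results table.

```python
from typing import Tuple

RAPID_RATE = 82.5  # overall success rate of full RAPiD, in percent

# overall success rates of the ablated variants, in percent
variant_rates = {
    "RAPiD-No-Adapt": 30.0,
    "RAPiD-No-Shape": 40.0,
    "RAPiD-E2E": 22.5,
}


def success_drop(baseline: float, variant: float) -> Tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = baseline - variant
    relative = 100.0 * absolute / baseline
    return absolute, relative


for name, rate in variant_rates.items():
    points, relative = success_drop(RAPID_RATE, rate)
    print(f"{name:<15} -{points:.1f} points ({relative:.1f}% relative)")
```

Removing the adaptation modules, for example, costs 52.5 percentage points, which is about 64% of RAPiD's overall success rate.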

This review was generated by AI and reviewed by a human editor.