QUICK REVIEW

[Paper Review] UltraDexGrasp: Learning Universal Dexterous Grasping for Bimanual Robots with Synthetic Data

Sizhe Yang, Yiman Xie | arXiv (Cornell University) | 2026-03-05

Robot Manipulation and Learning | Citations: 0

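One-Line Summary

The paper presents a data-generation pipeline that produces UltraDexGrasp-20M for universal dexterous bimanual grasping, together with a point-cloud-based policy that transfers robustly from simulation to the real world and reaches an average success rate of 81.2% in real-world tests.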

ABSTRACT

Grasping is a fundamental capability for robots to interact with the physical world. Humans, equipped with two hands, autonomously select appropriate grasp strategies based on the shape, size, and weight of objects, enabling robust grasping and subsequent manipulation. In contrast, current robotic grasping remains limited, particularly in multi-strategy settings. Although substantial efforts have targeted parallel-gripper and single-hand grasping, dexterous grasping for bimanual robots remains underexplored, with data being a primary bottleneck. Achieving physically plausible and geometrically conforming grasps that can withstand external wrenches poses significant challenges. To address these issues, we introduce UltraDexGrasp, a framework for universal dexterous grasping with bimanual robots. The proposed data-generation pipeline integrates optimization-based grasp synthesis with planning-based demonstration generation, yielding high-quality and diverse trajectories across multiple grasp strategies. With this framework, we curate UltraDexGrasp-20M, a large-scale, multi-strategy grasp dataset comprising 20 million frames across 1,000 objects. Based on UltraDexGrasp-20M, we further develop a simple yet effective grasp policy that takes point clouds as input, aggregates scene features via unidirectional attention, and predicts control commands. Trained exclusively on synthetic data, the policy achieves robust zero-shot sim-to-real transfer and consistently succeeds on novel objects with varied shapes, sizes, and weights, attaining an average success rate of 81.2% in real-world universal dexterous grasping. To facilitate future research on grasping with bimanual robots, we open-source the data generation pipeline at https://github.com/InternRobotics/UltraDexGrasp.

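Motivation and Objectives

Motivate the need for universal dexterous grasping with bimanual robots across objects of varied sizes and shapes.
Generate a large-scale, multi-strategy dataset by integrating optimization-based grasp synthesis with planning-based demonstration generation.
Develop a simple, robust policy that generalizes to unseen objects using synthetic data alone.
Demonstrate strong sim-to-real transfer and real-world robustness across diverse grasp strategies.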

Proposed Method

Integrates optimization-based grasp synthesis with planning-based demonstration generation to produce high-quality, diverse, universal bimanual grasps.
UltraDexGrasp-20M: 20 million frames over 1,000 objects across multiple grasp strategies (two-handed, whole-hand, two-finger pinch, three-finger tripod).
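Proposes a point-cloud-based universal grasp policy that encodes the scene with PointNet++-style features and uses a decoder-only transformer with unidirectional attention to predict a bounded Gaussian action distribution.
Treats grasp synthesis as a linear-nonlinear bilevel problem: the lower-level subproblem solves for contact forces, while the upper level updates the hand pose with gradient-based steps, using cuRobo and GPU-accelerated solvers.
Uses a four-stage demonstration-generation pipeline (pregrasp, grasp, squeeze, lift) to produce coordinated, collision-free trajectories for dual-arm manipulation.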
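
To make the "bounded Gaussian action distribution" above more concrete, here is a minimal sketch of one common way to keep Gaussian samples inside fixed limits: squash the sample with tanh and rescale it to the action range. The review does not say how the paper enforces the bound, so the tanh squashing, the function names, and the parameters `mean`, `log_std`, `low`, and `high` are all assumptions made for illustration.

```python
import math
import random


def sample_bounded_gaussian(mean, log_std, low, high, rng=random):
    """Draw one scalar action inside [low, high].

    Hypothetical helper: tanh squashing is an assumed bounding scheme,
    not necessarily the one used in the paper.
    """
    std = math.exp(log_std)            # keep the standard deviation positive
    raw = rng.gauss(mean, std)         # unbounded Gaussian sample
    squashed = math.tanh(raw)          # squashed into [-1, 1]
    # Affine map from [-1, 1] to [low, high].
    return low + 0.5 * (squashed + 1.0) * (high - low)


def deterministic_action(mean, low, high):
    """Squash the Gaussian mean itself (the median of the squashed distribution)."""
    return low + 0.5 * (math.tanh(mean) + 1.0) * (high - low)


if __name__ == "__main__":
    random.seed(0)
    # Example: a joint command limited to [-0.5, 0.5] (illustrative limits only).
    actions = [sample_bounded_gaussian(0.2, -1.0, -0.5, 0.5) for _ in range(5)]
    print(actions)
    print(deterministic_action(0.2, -0.5, 0.5))
```

Whatever the unbounded sample is, the resulting command stays within the limits, which is the property a bounded action head is meant to provide. A truncated or clipped Gaussian would be an alternative way to get the same guarantee.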

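Experimental Results

Research Questions

RQ1: Can UltraDexGrasp-20M enable a universal dexterous grasping policy that generalizes across diverse object shapes, sizes, and weights?
RQ2: How does the proposed policy compare with the baselines DP3 and DexGraspNet in simulation and in the real world?
RQ3: How does the amount of training data affect policy performance and generalization to unseen objects?
RQ4: Do the policy's key design choices (bounded Gaussian action distribution, unidirectional attention) improve grasp success?
RQ5: How well does a policy trained on synthetic data transfer to real-world scenarios without task-specific fine-tuning?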

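Key Results

The policy trained on UltraDexGrasp-20M achieves an average success rate of 84.0% in simulation across seen and unseen objects (600 objects in total).
On unseen objects the average success rate is 83.4%, suggesting strong generalization to new shapes and weights.
In simulation, the proposed policy outperforms DP3 by 37.3 percentage points on average (84.0% vs. 46.7%) and also outperforms DexGraspNet on average.
Real-world deployment achieves an average success rate of 81.2% across diverse objects, demonstrating robust zero-shot sim-to-real transfer.
The ablation study shows that bounded Gaussian action prediction and unidirectional attention each yield a substantial performance gain (more than 10% in relative terms).
Performance improves with more data; beyond 1M frames, the learned policy surpasses the data-generation baseline.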
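
The gap over DP3 above is stated in percentage points, while the ablation gains are stated in relative terms, and the two are easy to confuse. The short sketch below computes both from the 84.0% and 46.7% figures quoted in this review; the helper names are hypothetical and no other numbers are assumed.

```python
def percentage_point_gap(a_pct, b_pct):
    """Absolute difference between two rates that are already in percent."""
    return round(a_pct - b_pct, 1)


def relative_gain_pct(a_pct, b_pct):
    """Gain of a over b, expressed as a percentage of b."""
    return round((a_pct - b_pct) / b_pct * 100.0, 1)


ours, dp3 = 84.0, 46.7  # average simulation success rates quoted in the review
print(percentage_point_gap(ours, dp3))  # 37.3
print(relative_gain_pct(ours, dp3))     # 79.9
```

The same 37.3-point gap therefore corresponds to a relative improvement of roughly 80% over DP3.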

This review was generated by AI and reviewed by a human editor.