QUICK REVIEW

[Paper Review] Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning

Viktor Makoviychuk, Lukasz Wawrzyniak | arXiv (Cornell University) | 2021-08-24

Parallel Computing and Optimization Techniques | References: 26 | Citations: 55

One-Line Summary

Isaac Gym provides end-to-end GPU-accelerated physics simulation and PPO-based policy training on a single GPU, running tens to thousands of parallel environments and speeding up RL training for robotics tasks by 2-3 orders of magnitude.

ABSTRACT

Isaac Gym offers a high performance learning platform to train policies for a wide variety of robotics tasks directly on GPU. Both physics simulation and the neural network policy training reside on GPU and communicate by directly passing data from physics buffers to PyTorch tensors without ever going through any CPU bottlenecks. This leads to blazing fast training times for complex robotics tasks on a single GPU with 2-3 orders of magnitude improvements compared to conventional RL training that uses a CPU based simulator and GPU for neural networks. We host the results and videos at https://sites.google.com/view/isaacgym-nvidia and Isaac Gym can be downloaded at https://developer.nvidia.com/isaac-gym.

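Research Motivation and Objectives

Presents the need for high-throughput, end-to-end GPU robot simulation to accelerate RL training.
Introduces Isaac Gym as a GPU-native platform that keeps both simulation and training on the GPU to minimize CPU bottlenecks.
Describes the Tensor API and data workflow, which wrap physics buffers as PyTorch tensors to form a seamless training loop.
Demonstrates performance gains across a range of robot environments and tasks.
Showcases sim-to-real transfer on selected robots.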

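Proposed Method

Uses NVIDIA PhysX as the GPU-accelerated physics backend for simulating parallel environments.
Provides a data-oriented Tensor API that exposes physics state and control tensors to PyTorch without CPU data transfers.
Packs thousands of environment instances into a single scene to exploit fine-grained GPU parallelism.
Provides a Python interface that wraps physics buffers as PyTorch tensors and supports TorchScript for faster training scripts.
Implements a PPO-based training pipeline with vectorized observations and actions on the GPU, using rl_games for optimization.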
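The data flow described above (one shared state buffer for all environments, exposed as tensors and updated in place on the same device as the policy) can be illustrated with a minimal PyTorch sketch. This is not the Isaac Gym API: `BatchedPointMassEnv`, its point-mass dynamics, the reward, and the linear policy are all hypothetical stand-ins chosen only to show the pattern, under the assumption that every environment shares the same state layout.

```python
import torch


class BatchedPointMassEnv:
    """Hypothetical toy simulator: N independent 1-D point masses in one buffer."""

    def __init__(self, num_envs, device="cpu", dt=0.01):
        self.num_envs = num_envs
        self.dt = dt
        # Single state buffer for all environments: columns are [position, velocity].
        self.state = torch.zeros(num_envs, 2, device=torch.device(device))
        # Column views share storage with the buffer, so no copies are made.
        self.pos = self.state[:, 0]
        self.vel = self.state[:, 1]

    def reset(self):
        # Randomize positions in place and zero the velocities.
        self.pos.uniform_(-1.0, 1.0)
        self.vel.zero_()
        return self.state

    def step(self, actions):
        # actions: tensor of shape (num_envs,), one force per environment.
        # Semi-implicit Euler update applied to all environments at once, in place.
        self.vel.add_(actions * self.dt)
        self.pos.add_(self.vel * self.dt)
        # Toy reward: stay close to the origin.
        reward = -self.pos.abs()
        return self.state, reward


# Use the GPU when one is available; the same code runs on CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"
env = BatchedPointMassEnv(num_envs=4096, device=device)
policy = torch.nn.Linear(2, 1).to(device)  # stand-in for a trained policy network

obs = env.reset()
with torch.no_grad():
    for _ in range(10):
        # Observations feed the policy directly; nothing is moved to the host.
        actions = policy(obs).squeeze(-1)
        obs, reward = env.step(actions)
```

Because `step` returns the buffer itself rather than a copy, the observations the policy sees are always the simulator's current state. In a real setup the buffer would be owned by the physics engine instead of being allocated in Python.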

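Experimental Results

Research Questions

RQ1: How quickly can policies be trained when both simulation and training run entirely on the GPU?
RQ2: What are the scaling limits and performance characteristics as the number of parallel environments increases?
RQ3: How does end-to-end GPU-based RL performance compare with a CPU-based simulator paired with GPU policy training?
RQ4: Can Isaac Gym support complex robotic manipulation environments with realistic contact and domain randomization?
RQ5: What sim-to-real transfer capability does the platform demonstrate on robots such as ANYmal and TriFinger?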

Key Results

Ant locomotion trains in 20 seconds and Humanoid locomotion in 4 minutes on a single A100 GPU.
ANYmal locomotion can be trained in under 2 minutes on a single GPU.
Humanoid character animation with AMP trains in 6 minutes, and Shadow Hand cube rotation in 35 minutes, on a single GPU.
The OpenAI Shadow Hand cube results are reproduced on a single GPU with comparable success rates (e.g., 20 successes with a feed-forward policy), using asymmetric actor-critic and domain randomization.
Sim-to-real transfer demonstrations are shown for ANYmal and TriFinger, indicating high-fidelity contact-rich manipulation capabilities.


This review was generated by AI and reviewed by a human editor.