QUICK REVIEW

[Paper Review] RoboTurk: A Crowdsourcing Platform for Robotic Skill Learning through Imitation

Ajay Mandlekar, Yuke Zhu|arXiv (Cornell University)|Nov 7, 2018

Mobile Crowdsensing and Crowdsourcing|Citations: 81

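One-Line Summary

RoboTurk crowdsources 6-DoF robot demonstrations through mobile devices, enabling collection of imitation learning data, robustness to network conditions, and effective policy learning from large demonstration datasets.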

ABSTRACT

Imitation Learning has empowered recent advances in learning robotic manipulation tasks by addressing shortcomings of Reinforcement Learning such as exploration and reward specification. However, research in this area has been limited to modest-sized datasets due to the difficulty of collecting large quantities of task demonstrations through existing mechanisms. This work introduces RoboTurk to address this challenge. RoboTurk is a crowdsourcing platform for high quality 6-DoF trajectory based teleoperation through the use of widely available mobile devices (e.g. iPhone). We evaluate RoboTurk on three manipulation tasks of varying timescales (15-120s) and observe that our user interface is statistically similar to special purpose hardware such as virtual reality controllers in terms of task completion times. Furthermore, we observe that poor network conditions, such as low bandwidth and high delay links, do not substantially affect the remote users' ability to perform task demonstrations successfully on RoboTurk. Lastly, we demonstrate the efficacy of RoboTurk through the collection of a pilot dataset; using RoboTurk, we collected 137.5 hours of manipulation data from remote workers, amounting to over 2200 successful task demonstrations in 22 hours of total system usage. We show that the data obtained through RoboTurk enables policy learning on multi-step manipulation tasks with sparse rewards and that using larger quantities of demonstrations during policy learning provides benefits in terms of both learning consistency and final performance. For additional results, videos, and to download our pilot dataset, visit http://roboturk.stanford.edu/

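Motivation and Objectives

Address the data bottleneck in imitation learning by collecting high-quality robot demonstrations at scale.
Design a crowdsourcing platform that teleoperates a virtual robot in real time using ubiquitous devices (iPhone).
Evaluate the user interface and network robustness to show that RoboTurk matches the performance of VR hardware and tolerates poor connections.
Build a pilot dataset of demonstrations and demonstrate policy learning with sparse rewards using this data.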

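Proposed Method

Implement a cloud-based platform that streams video and teleoperation commands over WebRTC for low-latency control.
Use an iPhone running ARKit as the motion controller, mapping its pose to robot end-effector movement.
Provide a coordination server that creates a dedicated teleoperation session for each user, enabling scalable multi-user operation.
Adopt a modular architecture that extends easily to new tasks, simulators, and robots.
Conduct a user study that compares interfaces (Keyboard, 3D Mouse, VR Controller, Phone) and evaluates performance under different network conditions.
Collect and release a pilot dataset of demonstrations (over 2200 demonstrations, 137.5 hours) to enable demonstration-guided reinforcement learning.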
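
To make the pose-to-end-effector mapping concrete, the following is a minimal sketch, not the RoboTurk implementation. It assumes the phone reports a position vector and a 3x3 rotation matrix per frame, and that the frame-to-frame change in phone pose is applied as a relative displacement of the end-effector target. The function names (`pose_delta`, `apply_delta`, `rot_z`) and the `scale` parameter are hypothetical and exist only for this illustration.

```python
import numpy as np


def rot_z(theta):
    """Rotation matrix for a rotation of `theta` radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])


def pose_delta(prev_pos, prev_rot, cur_pos, cur_rot):
    """Relative phone motion between two frames, expressed in the world frame.

    Returns the translation difference and the rotation d_rot that satisfies
    cur_rot = d_rot @ prev_rot.
    """
    d_pos = np.asarray(cur_pos, dtype=float) - np.asarray(prev_pos, dtype=float)
    d_rot = np.asarray(cur_rot, dtype=float) @ np.asarray(prev_rot, dtype=float).T
    return d_pos, d_rot


def apply_delta(ee_pos, ee_rot, d_pos, d_rot, scale=1.0):
    """Apply a phone pose delta to the current end-effector target pose.

    `scale` (an assumption of this sketch) shrinks or enlarges the translation
    so that hand motion can be mapped to a smaller or larger workspace.
    """
    new_pos = np.asarray(ee_pos, dtype=float) + scale * d_pos
    new_rot = d_rot @ np.asarray(ee_rot, dtype=float)
    return new_pos, new_rot


# Example: the phone moves 10 cm along x and turns 90 degrees about z.
d_pos, d_rot = pose_delta(np.zeros(3), np.eye(3),
                          np.array([0.1, 0.0, 0.0]), rot_z(np.pi / 2))
target_pos, target_rot = apply_delta(np.array([0.5, 0.0, 0.3]), np.eye(3),
                                     d_pos, d_rot)
```

Using relative deltas rather than absolute phone poses means the user does not need to hold the phone at any particular location. A real system would also need to align the phone and robot coordinate frames, which this sketch leaves out.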

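Experimental Results

Research Questions

RQ1: Can RoboTurk collect large-scale, high-quality teleoperation demonstrations through crowdsourcing with commonly available devices?
RQ2: How does the iPhone-based interface affect task completion time compared with VR and other input devices?
RQ3: Are RoboTurk demonstrations robust to network delay and bandwidth variation during remote teleoperation?
RQ4: Do larger demonstration datasets improve policy learning on sparse-reward manipulation tasks?

Key Findings

On the picking task, the phone interface yields completion times statistically similar to the VR controller and significantly faster than the keyboard and 3D mouse.
From the baseline through low-bandwidth and high-delay network conditions, the completion-time distributions remain similar, indicating robustness.
The pilot dataset contains 137.5 hours of data, including over 2200 successful demonstrations, collected in 22 hours of total system usage.
Policy learning benefits from more demonstrations: the best average performance on both can-picking and round-assembly is obtained with 1000 demonstrations.
Demonstrations can be used to initialize RL episodes from demonstration states while training with PPO, showing that sparse-reward manipulation learning from crowdsourced data is feasible.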
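
The idea of starting RL episodes from demonstration states can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's training code: the class name `DemoResetSampler`, the `demo_reset_prob` parameter, and the uniform choice of demonstration and state are all hypothetical. It also assumes that each demonstration is stored as a sequence of simulator states the environment can be reset to.

```python
import random


class DemoResetSampler:
    """Choose the start state of an RL episode, optionally from a demonstration.

    With probability `demo_reset_prob` the episode starts from a state drawn
    uniformly from a uniformly chosen demonstration; otherwise it starts from
    the environment's default initial state.
    """

    def __init__(self, demos, demo_reset_prob=0.5, seed=None):
        if not demos or any(len(demo) == 0 for demo in demos):
            raise ValueError("demos must be a non-empty list of non-empty state sequences")
        if not 0.0 <= demo_reset_prob <= 1.0:
            raise ValueError("demo_reset_prob must be in [0, 1]")
        self.demos = demos
        self.demo_reset_prob = demo_reset_prob
        self.rng = random.Random(seed)

    def sample(self, default_state):
        """Return the state the next episode should start from."""
        if self.rng.random() < self.demo_reset_prob:
            demo = self.rng.choice(self.demos)
            return self.rng.choice(demo)
        return default_state


# Example with placeholder states; real states would be simulator snapshots.
demos = [["a0", "a1", "a2"], ["b0", "b1"]]
sampler = DemoResetSampler(demos, demo_reset_prob=0.5, seed=0)
start_state = sampler.sample(default_state="initial")
```

In a PPO training loop, `sample` would be called at every environment reset. Starting some episodes partway through a successful demonstration places the agent closer to the sparse reward, which is the intuition behind this kind of initialization.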


This review was generated by AI and checked by a human editor.