QUICK REVIEW

[Paper Review] Transferring End-to-End Visuomotor Control from Simulation to Real World for a Multi-Stage Task

Stephen James, Andrew J. Davison|arXiv (Cornell University)|Jul 7, 2017

Robot Manipulation and Learning | References: 34 | Cited by: 132

ONE-SENTENCE SUMMARY

This paper shows how domain randomisation and a CNN can be used to transfer end-to-end visuomotor control learned in simulation to the real world and accomplish a multi-stage task.

ABSTRACT

End-to-end control for robot manipulation and grasping is emerging as an attractive alternative to traditional pipelined approaches. However, end-to-end methods tend to either be slow to train, exhibit little or no generalisability, or lack the ability to accomplish long-horizon or multi-stage tasks. In this paper, we show how two simple techniques can lead to end-to-end (image to velocity) execution of a multi-stage task, which is analogous to a simple tidying routine, without having seen a single real image. This involves locating, reaching for, and grasping a cube, then locating a basket and dropping the cube inside. To achieve this, robot trajectories are computed in a simulator, to collect a series of control velocities which accomplish the task. Then, a CNN is trained to map observed images to velocities, using domain randomisation to enable generalisation to real world images. Results show that we are able to successfully accomplish the task in the real world with the ability to generalise to novel environments, including those with dynamic lighting conditions, distractor objects, and moving objects, including the basket itself. We believe our approach to be simple, highly scalable, and capable of learning long-horizon tasks that have until now not been shown with the state-of-the-art in end-to-end robot control.

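MOTIVATION AND OBJECTIVES

Demonstrate that end-to-end visuomotor control trained purely in simulation can operate in the real world without any real images.
Learn a long-horizon, multi-stage task (locate, reach, grasp, locate basket, drop cube) from simulator-generated trajectories.
Improve generalisation to real-world variation (lighting, distractors, moving objects) through domain randomisation.
Assess the effect of auxiliary outputs and network inputs on transfer performance.
Evaluate robustness to environmental changes and, through ablations, identify the key factors for transfer.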

PROPOSED METHOD

Generate a large dataset of simulated trajectories, computed with inverse kinematics, that execute the five-stage task.
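Train a reactive CNN that maps a sequence of images and joint angles to motor velocities, and use its output for control within a PID loop.
Add auxiliary outputs (cube position and gripper position) to aid learning.
Apply domain randomisation to appearance, textures, lighting, object colours, positions, distractors, and camera height to bridge the sim-to-real gap.
Use a recurrent network (LSTM) to capture the state of the multi-stage task, and include joint angles as part of the input.
Evaluate with grid-based real-world tests, comparing performance across varied training dataset sizes and environmental conditions.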
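
The domain randomisation step listed above amounts to drawing a fresh scene configuration before each simulated episode. The following is a minimal sketch of that idea, not the authors' implementation: the `SceneConfig` fields, the `sample_scene` helper, and every numeric range are hypothetical placeholders chosen for illustration, and a real pipeline would pass these values to the simulator.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneConfig:
    """One randomised simulator scene (all fields are illustrative)."""
    cube_rgb: tuple
    basket_rgb: tuple
    table_texture_id: int
    light_intensity: float
    cube_xy: tuple
    basket_xy: tuple
    num_distractors: int
    camera_height_m: float


def random_rgb(rng):
    """Return a random colour with each channel in [0, 1)."""
    return tuple(rng.random() for _ in range(3))


def sample_scene(rng):
    """Draw a scene configuration for one simulated episode.

    Every range below is an assumed placeholder, not a value from the paper.
    """
    return SceneConfig(
        cube_rgb=random_rgb(rng),
        basket_rgb=random_rgb(rng),
        table_texture_id=rng.randrange(100),      # index into a texture bank
        light_intensity=rng.uniform(0.3, 1.0),
        cube_xy=(rng.uniform(-0.2, 0.2), rng.uniform(0.3, 0.6)),
        basket_xy=(rng.uniform(-0.2, 0.2), rng.uniform(0.3, 0.6)),
        num_distractors=rng.randint(0, 5),        # inclusive on both ends
        camera_height_m=rng.uniform(0.9, 1.1),
    )


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the sampled scenes are reproducible
    for _ in range(3):
        print(sample_scene(rng))
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which makes it easier to regenerate the exact scenes behind a given training set.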

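EXPERIMENTAL RESULTS

Research Questions

RQ1: How does controller performance in simulation and in the real world vary with the size of the training dataset?
RQ2: How robust is the transferred controller to novel real-world environments (distractors, moving objects, lighting changes, camera movement)?
RQ3: Which elements of domain randomisation (textures, lighting, distractors, geometry, camera height) matter most for successful transfer?
RQ4: Do the auxiliary outputs and the joint-angle inputs improve transfer performance?
RQ5: Is the LSTM component essential for this multi-stage task?

Key Findings

Simulation training with domain randomisation transfers to real-world execution of the multi-stage task (locate, reach, grasp, place) without any real images.
Real-world performance improves as the dataset grows; with roughly 1 million simulated images, the baseline reaches 100% success in both simulation and the real world when no distractors are present.
Auxiliary outputs and joint-angle inputs improve performance, and removing the LSTM lowers the success rate on the multi-stage task.
The controller is robust to several real-world perturbations (distractors, moving objects, lighting changes, small camera movements), but performance degrades with strong distractors or large changes in object appearance.
The ablation study highlights the LSTM's key role in maintaining stage context and the importance of joint-angle inputs for stabilising the pose during grasping.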
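
One way to read the auxiliary-output finding is as extra regression targets added to the training loss. The sketch below assumes mean-squared-error terms and a hypothetical `aux_weight` parameter; the review does not state the paper's actual loss, so the dictionary keys and the weighting scheme are illustrative only.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(pred, target, aux_weight=1.0):
    """Main velocity loss plus weighted auxiliary position losses.

    Keys and weighting are assumptions made for this sketch.
    Setting aux_weight=0.0 mimics an ablation without auxiliary outputs.
    """
    main = mse(pred["joint_velocities"], target["joint_velocities"])
    aux = (
        mse(pred["cube_position"], target["cube_position"])
        + mse(pred["gripper_position"], target["gripper_position"])
    )
    return main + aux_weight * aux


if __name__ == "__main__":
    target = {
        "joint_velocities": [0.0, 0.0],
        "cube_position": [0.0, 0.0, 0.0],
        "gripper_position": [0.0, 0.0, 0.0],
    }
    pred = {
        "joint_velocities": [1.0, 0.0],
        "cube_position": [1.0, 1.0, 1.0],
        "gripper_position": [0.0, 0.0, 0.0],
    }
    print(combined_loss(pred, target, aux_weight=0.5))
```

In a real training setup these terms would be computed on network outputs by the deep learning framework; plain Python is used here only to keep the arithmetic visible.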


This review was generated by AI and checked by a human editor.