QUICK REVIEW

[Paper Review] Fully Distributed Multi-Robot Collision Avoidance via Deep Reinforcement Learning for Safe and Efficient Navigation in Complex Scenarios

Tingxiang Fan, Pinxin Long|arXiv (Cornell University)|Aug 11, 2018

Reinforcement Learning in Robotics|References 11|Citations 69

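One-Sentence Summary

This paper trains a fully distributed, sensor-level collision avoidance policy for multi-robot systems using multi-scenario multi-stage deep reinforcement learning, integrates it into a hybrid control framework, and validates it in simulated and real-world scenarios that include dense crowds and large robot teams.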

ABSTRACT

In this paper, we present a decentralized sensor-level collision avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent's steering commands in terms of the movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy gradient based reinforcement learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy's robustness and effectiveness. We validate the learned sensor-level collision avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots. Although the policy is trained using simulation data only, we have successfully deployed it on physical robots with shapes and dynamics characteristics that are different from the simulated agents, in order to demonstrate the controller's robustness against the sim-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution to the safe and effective autonomous navigation for a single robot working in a dense real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. Videos are available at https://sites.google.com/view/hybridmrca

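Motivation and Objectives

Address the challenge of safe and efficient collision avoidance under partial observability in decentralized multi-robot systems.
Develop a policy that maps raw sensor data to velocity commands without inter-robot communication.
Improve the robustness of the learned policy and its transferability to real robots and complex scenarios.
Narrow the performance gap between decentralized and centralized navigation.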

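Proposed Method

Proposes a fully distributed policy, shared by all robots, that maps onboard sensor measurements to velocity commands.
Trains the policy in simulation using a multi-scenario multi-stage reinforcement learning framework with policy gradient updates.
Introduces a hybrid control architecture that combines the learned policy with conventional controllers to handle simple or emergent scenarios.
Uses a neural network that takes 2D laser scans, the relative goal position, and the current velocity as input and outputs the mean velocity from which actions are sampled.
Trains the policy on simulation data only and validates it in both simulated and real-world experiments, including heterogeneous robots and large-scale scenarios with up to 100 robots.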
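
The input/output interface described above can be sketched as follows. This is a minimal illustration, not the authors' network: `Observation`, `mean_velocity`, and `sample_action` are hypothetical names, the hand-written rule in `mean_velocity` merely stands in for the trained neural network, and the standard deviation and velocity limits are assumed values.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Observation:
    laser_scan: List[float]             # 2D laser range readings (assumed metres)
    relative_goal: Tuple[float, float]  # goal in the robot frame: (distance, bearing in rad)
    velocity: Tuple[float, float]       # current (linear, angular) velocity


def mean_velocity(obs: Observation) -> Tuple[float, float]:
    """Hypothetical stand-in for the trained network: returns a mean (v, w).

    The real policy is a learned neural network that also uses obs.velocity;
    this hand-written rule ignores it and only illustrates the interface.
    """
    distance, bearing = obs.relative_goal
    clearance = min(obs.laser_scan)
    # Slow down near obstacles and near the goal (assumed limit: 1.0).
    v_mean = min(1.0, 0.5 * distance, clearance)
    # Turn toward the goal (assumed limits: -1.0 to 1.0).
    w_mean = max(-1.0, min(1.0, bearing))
    return v_mean, w_mean


def sample_action(
    obs: Observation,
    std: float = 0.1,  # assumed exploration noise
    rng: Optional[random.Random] = None,
) -> Tuple[float, float]:
    """Sample a velocity command from a Gaussian centred on the policy mean."""
    if rng is None:
        rng = random.Random()
    v_mean, w_mean = mean_velocity(obs)
    v = rng.gauss(v_mean, std)
    w = rng.gauss(w_mean, std)
    # Clip to the assumed velocity limits before sending to the robot.
    return max(0.0, min(1.0, v)), max(-1.0, min(1.0, w))


if __name__ == "__main__":
    obs = Observation(laser_scan=[2.0] * 8, relative_goal=(4.0, 0.3), velocity=(0.0, 0.0))
    print(sample_action(obs, rng=random.Random(0)))
```

Sampling around the mean is what a stochastic policy needs for exploration during policy gradient training; at deployment one could plausibly use the mean directly, though the review does not state which is done.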

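Experimental Results

Research Questions

RQ1: Can a fully distributed sensor-level policy achieve safe and efficient navigation without inter-robot communication?
RQ2: How well does a policy trained in rich simulation environments generalize to unseen real-world and large-scale scenarios?
RQ3: Does hybrid control, which integrates the learned policy with conventional control, improve safety and robustness?
RQ4: How do partial observability and sensor noise affect collision avoidance performance in multi-robot systems?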

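Key Findings

The decentralized sensor-level collision avoidance policy maps raw sensor data directly to steering commands and operates without inter-robot communication.
The multi-scenario multi-stage training framework yields a policy that generalizes to unseen scenarios, heterogeneous robots, and large robot groups.
Hybrid control, which combines the learned policy with conventional controllers, improves robustness and safety in complex tasks.
The learned policy can be deployed on physical robots without extensive tuning and transfers to navigation in dense crowds.
The experiments show scalability to large robot groups and effectiveness in warehouse-like environments without pre-built infrastructure.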
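
The hybrid control idea in the findings above can be sketched as a simple switch between controllers. This is only an illustration under stated assumptions: the review says the learned policy is combined with conventional controllers for simple or emergent scenarios, but the distance thresholds, the proportional goal-seeking rule, and the slow-down factor below are all hypothetical, as are the names `classify`, `goal_seeking_controller`, and `hybrid_control`.

```python
from enum import Enum
from typing import Callable, List, Tuple

Velocity = Tuple[float, float]  # (linear, angular)
Policy = Callable[[List[float], Tuple[float, float]], Velocity]


class Scenario(Enum):
    SIMPLE = "simple"      # nothing nearby: a conventional controller suffices
    COMPLEX = "complex"    # obstacles nearby: use the learned policy
    EMERGENT = "emergent"  # obstacle very close: act conservatively


def classify(
    laser_scan: List[float],
    safe_radius: float = 2.0,  # assumed threshold
    risk_radius: float = 0.5,  # assumed threshold
) -> Scenario:
    """Classify the situation from the nearest laser reading (assumed rule)."""
    nearest = min(laser_scan)
    if nearest < risk_radius:
        return Scenario.EMERGENT
    if nearest > safe_radius:
        return Scenario.SIMPLE
    return Scenario.COMPLEX


def goal_seeking_controller(
    relative_goal: Tuple[float, float], gain: float = 0.5, v_max: float = 1.0
) -> Velocity:
    """Hypothetical conventional controller: proportional drive toward the goal."""
    distance, bearing = relative_goal
    return min(v_max, gain * distance), max(-1.0, min(1.0, bearing))


def hybrid_control(
    laser_scan: List[float],
    relative_goal: Tuple[float, float],
    learned_policy: Policy,
    cautious_scale: float = 0.3,  # assumed slow-down factor
) -> Velocity:
    """Pick a velocity command according to the classified scenario."""
    scenario = classify(laser_scan)
    if scenario is Scenario.SIMPLE:
        return goal_seeking_controller(relative_goal)
    v, w = learned_policy(laser_scan, relative_goal)
    if scenario is Scenario.EMERGENT:
        # Assumption: reduce forward speed when an obstacle is very close.
        return cautious_scale * v, w
    return v, w


if __name__ == "__main__":
    def dummy_policy(scan: List[float], goal: Tuple[float, float]) -> Velocity:
        return 0.8, 0.2  # placeholder for a trained policy

    print(hybrid_control([1.0, 3.0], (1.0, 0.1), dummy_policy))
```

The design intent of such a switch is that the learned policy is only relied on where it is needed, while simpler rules cover the cases in which a learned policy may behave suboptimally or unsafely.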

This review was generated by AI and checked by a human editor.