QUICK REVIEW

[Paper Review] SayTap: Language to Quadrupedal Locomotion

Yujin Tang, Wenhao Yu|arXiv (Cornell University)|Jun 13, 2023

Human Pose and Action Recognition | Cited by 11

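One-Line Summary

This paper introduces foot contact patterns as an interface between natural language commands and a DRL-based quadrupedal locomotion controller, enabling flexible language-driven locomotion that transfers to real hardware.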

ABSTRACT

Large language models (LLMs) have demonstrated the potential to perform high-level planning. Yet, it remains a challenge for LLMs to comprehend low-level commands, such as joint angle targets or motor torques. This paper proposes an approach to use foot contact patterns as an interface that bridges human commands in natural language and a locomotion controller that outputs these low-level commands. This results in an interactive system for quadrupedal robots that allows the users to craft diverse locomotion behaviors flexibly. We contribute an LLM prompt design, a reward function, and a method to expose the controller to the feasible distribution of contact patterns. The results are a controller capable of achieving diverse locomotion patterns that can be transferred to real robot hardware. Compared with other design choices, the proposed approach enjoys more than 50% success rate in predicting the correct contact patterns and can solve 10 more tasks out of a total of 30 tasks. Our project site is: https://saytap.github.io.

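Motivation and Objectives

Enable intuitive human-robot interaction with quadrupedal robots by bridging natural language and low-level locomotion control.
Propose foot contact patterns as a compact interface between natural language and the locomotion controller.
Build an LLM-to-pattern module (through prompt design) and train a DRL-based controller to achieve diverse, real-time locomotion.
Demonstrate that the controller trained in simulation applies to a real quadruped (Unitree A1).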

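Proposed Method

Design an LLM prompting strategy that translates arbitrary natural language commands into a 4 x L_w foot contact pattern template of 0/1 flags.
During training, use a random pattern generator to create diverse contact pattern templates across the gait types (BOUND, TROT, PACE, STAND_STILL, STAND_3LEGS).
Train a DRL policy (PPO in IsaacGym) that takes proprioception, a velocity command, and the desired contact pattern as input and outputs joint positions.
Incorporate a two-pass symmetry technique into the policy output to make the gait more natural and narrow the sim-to-real gap.
Expose the controller to the distribution of contact patterns, and tie the reward to contact timing rather than to explicit trajectories.
Translate LLM-generated contact patterns into low-level commands on the real robot without extensive fine-tuning.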
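
To make the contact pattern interface concrete, the following is a minimal sketch of how a 0/1 template for each listed gait type might be built and how a cyclic sliding window could be read from it. It is not the authors' generator: the foot order, the `period` and `duty` parameters, the phase offsets (taken from the usual textbook definitions of trot, pace, and bound), and the function names `make_template` and `window` are all assumptions made for illustration.

```python
import random

# Assumed foot order (front-left, front-right, rear-left, rear-right); not from the source.
FEET = ("FL", "FR", "RL", "RR")


def make_template(gait, period=20, duty=0.5):
    """Build a 4 x period template of 0/1 contact flags for one gait cycle.

    `period` (steps per cycle) and `duty` (fraction of the cycle spent in
    stance) are hypothetical parameters chosen for this sketch.
    """
    if gait == "STAND_STILL":
        return [[1] * period for _ in FEET]
    if gait == "STAND_3LEGS":
        lifted = random.randrange(4)  # one randomly chosen foot stays in the air
        return [[0 if i == lifted else 1] * period for i in range(4)]

    stance = int(period * duty)
    half = period // 2
    # Phase offset (in steps) per foot; feet sharing an offset touch down together.
    offsets = {
        "TROT": (0, half, half, 0),   # diagonal pairs alternate
        "PACE": (0, half, 0, half),   # left and right sides alternate
        "BOUND": (0, 0, half, half),  # front and rear pairs alternate
    }
    return [
        [1 if (t - off) % period < stance else 0 for t in range(period)]
        for off in offsets[gait]
    ]


def window(template, t, length):
    """Cyclic sliding window: `length` columns of the template starting at step t."""
    period = len(template[0])
    return [[row[(t + k) % period] for k in range(length)] for row in template]


if __name__ == "__main__":
    trot = make_template("TROT", period=8)
    for name, row in zip(FEET, window(trot, 0, 8)):
        print(name, "".join(str(flag) for flag in row))
```

In a setup like this, the controller would receive `window(template, t, L_w)` at each control step, so the same policy can follow any template the LLM or the random generator produces.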

Figure 1: Illustration of the results on a physical quadrupedal robot. We show two test commands at the top, and the snapshots of the robot in the top row of the figure. The row in the middle shows the desired contact patterns translated from the commands by an LLM (the pattern in between the comman

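Experimental Results

Research Questions

RQ1: Can foot contact patterns serve as an effective interface between natural language and low-level quadrupedal locomotion control?
RQ2: How accurately can an LLM map unstructured commands to feasible contact pattern templates covering diverse gaits?
RQ3: Can the DRL controller accomplish both the primary locomotion task and the specified contact pattern, and transfer from simulation to the real robot?
RQ4: Does the language-driven interface support unstructured and ambiguous natural language commands in the real world?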

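Key Findings

Against two baselines on 30 tasks, the LLM-based interface improves the success rate of predicting the correct contact pattern by more than 50% and, per the abstract, solves 10 more tasks.
The trained controller tracks the commanded linear velocity in simulation while producing contact patterns close to the desired ones, and transfers to a real Unitree A1 without fine-tuning.
The foot contact pattern interface outperforms the discrete-gait and sinusoidal-parameter baselines in flexibility and accuracy.
The system responds to both explicit instructions and unstructured natural language, enabling expressive human-robot interaction.
The method was deployed successfully on real hardware, with supporting footage of the physical robot (snapshots appear in Figure 1).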
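
One simple way to quantify "contact patterns close to the desired ones" is the fraction of (foot, time step) entries on which the realized contact flags agree with the desired ones. The sketch below illustrates that idea only; the function name `contact_agreement` and the metric itself are assumptions for this example and are not claimed to be the evaluation used in the paper.

```python
def contact_agreement(desired, realized):
    """Fraction of (foot, time step) entries where realized contact equals desired.

    Both arguments are lists of per-foot rows holding 0/1 contact flags.
    This is a hypothetical metric for illustration, not the paper's.
    """
    total = 0
    matches = 0
    for d_row, r_row in zip(desired, realized):
        if len(d_row) != len(r_row):
            raise ValueError("rows must cover the same number of time steps")
        for d, r in zip(d_row, r_row):
            total += 1
            matches += int(d == r)
    return matches / total if total else 0.0


if __name__ == "__main__":
    desired = [[1, 1, 0, 0], [0, 0, 1, 1]]
    realized = [[1, 0, 0, 0], [0, 0, 1, 1]]
    print(contact_agreement(desired, realized))  # 7 of 8 entries agree: 0.875
```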

Figure 2: Overview of the proposed approach. In addition to the robot’s proprioceptive sensory data and task commands (e.g., following a desired linear velocity $\hat{v}_{x}$ ), the locomotion controller accepts desired foot contact patterns as input, and outputs desired joint positions. The foot co

This review was generated by AI and checked by a human editor.