QUICK REVIEW

[Paper Review] Robots that redesign themselves through kinematic self-destruction

Cheng Yu, Sam Kriegman|arXiv (Cornell University)|Mar 12, 2026

Modular Robots and Swarm Intelligence|Cited by 0

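One-Line Summary

The paper proposes a transformer-based universal controller that redesigns a robot's body by destroying redundant modules in order to improve locomotion. It demonstrates sim-to-real transfer and generalization to unseen morphologies.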

ABSTRACT

Every robot built to date was predesigned by an external process, prior to deployment. Here we show a robot that actively participates in its own design during its lifetime. Starting from a randomly assembled body, and using only proprioceptive feedback, the robot dynamically ``sculpts'' itself into a new design through kinematic self-destruction: identifying redundant links within its body that inhibit its locomotion, and then thrashing those links against the surface until they break at the joint and fall off the body. It does so using a single autoregressive sequence model, a universal controller that learns in simulation when and how to simplify a robot's body through self-destruction and then adaptively controls the reduced morphology. The optimized policy successfully transfers to reality and generalizes to previously unseen kinematic trees, generating forward locomotion that is more effective than otherwise equivalent policies that randomly remove links or cannot remove any. This suggests that self-designing robots may be more successful than predesigned robots in some cases, and that kinematic self-destruction, though reductive and irreversible, could provide a general adaptive strategy for a wide range of robots.

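Motivation and Objectives

Enable robots to take part in their own design by removing obsolete or redundant body parts during deployment.
Develop a universal controller that works across different morphologies using only proprioceptive feedback.
Demonstrate end-to-end sim-to-real transfer and generalization to out-of-distribution body designs.
Evaluate performance gains against no-destruction and random-destruction baselines.
Show that controlled kinematic self-destruction benefits robot adaptation and longevity.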

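Proposed Method

Formulate self-destruction and locomotion as a sequence modeling problem, and train expert agents with reinforcement learning on hand-designed morphologies.
Distill the expert trajectories into a causal transformer that outputs both module detachment and locomotion actions.
Guide learning with a per-timestep reward that balances displacement, trajectory efficiency, and retention of active connections.
Introduce Prompt Reset, which prevents degenerate loops when the controller encounters out-of-distribution states.
Incorporate real-world rollouts into training (injecting real open-loop trajectories) to narrow the sim-to-real gap.
Model detachment in MuJoCo as torque-based module removal, and randomize the detachment torque for domain randomization.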
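As a rough illustration of two of the ingredients listed above, the sketch below combines displacement, trajectory efficiency, and connection retention into a per-timestep reward, and draws a randomized breaking torque for a joint. It is a minimal sketch under stated assumptions, not the authors' implementation: the weights, the efficiency definition, the randomization range, and every function name here are hypothetical, and no MuJoCo calls are used.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the review does not report the actual values.
    displacement: float = 1.0
    efficiency: float = 0.5
    connectivity: float = 0.1


def step_reward(dx, path_len, active_links, total_links, w=RewardWeights()):
    """Per-timestep reward combining the three terms named in the review."""
    # Assumed efficiency measure: forward progress per unit of distance moved.
    efficiency = dx / path_len if path_len > 0 else 0.0
    # Fraction of links still attached to the body.
    connectivity = active_links / total_links
    return (w.displacement * dx
            + w.efficiency * efficiency
            + w.connectivity * connectivity)


def sample_detach_torque(rng, low=0.8, high=1.2, nominal=1.0):
    """Draw a randomized breaking torque (assumed uniform scaling range)."""
    return nominal * rng.uniform(low, high)


def should_detach(joint_torque, threshold):
    """A link breaks off once the joint torque magnitude reaches the threshold."""
    return abs(joint_torque) >= threshold


if __name__ == "__main__":
    rng = random.Random(0)
    threshold = sample_detach_torque(rng)
    print("breaking torque:", threshold)
    print("detaches at 1.5:", should_detach(1.5, threshold))
    print("reward:", step_reward(0.02, 0.025, active_links=5, total_links=7))
```

Resampling the breaking torque per episode is one common way to apply domain randomization, so that a policy does not overfit to a single joint strength; whether the paper samples per episode or per joint is not stated in this review.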

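Experimental Results

Research Questions

RQ1: Can a single universal transformer controller learn both module self-destruction and the subsequent locomotion across a variety of morphologies?
RQ2: Does kinematic self-destruction improve locomotion on unfamiliar (out-of-distribution) morphologies compared with no destruction or random destruction?
RQ3: How well does the learned policy transfer from simulation to real hardware, including out-of-distribution designs?
RQ4: Does the proposed Prompt Reset mitigate degenerate behavior when the controller faces a novel robot body?
RQ5: How does incorporating real-world trajectories into training affect sim-to-real performance?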

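Key Findings

The transformer controller autonomously selects which modules to detach and achieves forward locomotion after destruction.
In distribution, self-destruction improves locomotion more than random destruction (p = 0.033).
Across 100 out-of-distribution morphologies in simulation, self-destruction achieves a higher mean speed (μ = 0.168 m/s, σ = 0.105) than the baseline (μ = 0.080 m/s, σ = 0.058; p < 0.001).
Prompt Reset reduces degenerate loops and improves adaptability (in the ablation, speed is lower without Prompt Reset, p < 0.01).
Sim-to-real transfer: redesign and locomotion succeeded 100% of the time on two in-distribution physical robots. They also succeeded on out-of-distribution real morphologies, where self-destruction produced more directed and sometimes faster locomotion than the baseline.
Real-world results showed that, on unseen morphologies, self-destructed designs yield more reliable locomotion trajectories.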
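To give a sense of scale for the out-of-distribution simulation result, the reported means and standard deviations can be turned into a standardized effect size. The sketch below computes Cohen's d and a Welch t statistic from those summary numbers only. It assumes 100 independent samples per condition and an equally weighted pooled standard deviation; the review does not say which statistical test the authors used, so this is an illustrative calculation and not a reproduction of their analysis.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference, pooling the two SDs with equal weight."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Mean and standard deviation of speed (m/s) as reported in the findings.
SELF_DESTRUCTION = (0.168, 0.105)
BASELINE = (0.080, 0.058)
# Assumption: 100 morphologies per condition, treated as independent samples.
N = 100

d = cohens_d(*SELF_DESTRUCTION, *BASELINE)
t = welch_t(*SELF_DESTRUCTION, N, *BASELINE, N)
print(f"Cohen's d = {d:.2f}, Welch t = {t:.2f}")
```

Under these assumptions the difference is about one pooled standard deviation, which is in line with the reported p < 0.001. If the two conditions were evaluated on the same 100 morphologies, a paired test would be the more appropriate choice, and it cannot be computed from summary statistics alone.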

This review was generated by AI and checked by a human editor.