QUICK REVIEW

[Paper Review] Robots that redesign themselves through kinematic self-destruction

Cheng Yu, Sam Kriegman|arXiv (Cornell University)|Mar 12, 2026

Modular Robots and Swarm Intelligence | Cited by 0

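One-sentence summary

This paper proposes a transformer-based universal controller that learns to break redundant modules off the robot's own body, redesigning itself to improve locomotion. The controller transfers from simulation to real robots and generalizes to unseen morphologies.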

ABSTRACT

Every robot built to date was predesigned by an external process, prior to deployment. Here we show a robot that actively participates in its own design during its lifetime. Starting from a randomly assembled body, and using only proprioceptive feedback, the robot dynamically "sculpts" itself into a new design through kinematic self-destruction: identifying redundant links within its body that inhibit its locomotion, and then thrashing those links against the surface until they break at the joint and fall off the body. It does so using a single autoregressive sequence model, a universal controller that learns in simulation when and how to simplify a robot's body through self-destruction and then adaptively controls the reduced morphology. The optimized policy successfully transfers to reality and generalizes to previously unseen kinematic trees, generating forward locomotion that is more effective than otherwise equivalent policies that randomly remove links or cannot remove any. This suggests that self-designing robots may be more successful than predesigned robots in some cases, and that kinematic self-destruction, though reductive and irreversible, could provide a general adaptive strategy for a wide range of robots.

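Research motivation and goals

Enable robots to take part in their own design during deployment by removing obsolete or redundant body parts.
Develop a controller that generalizes across morphologies using only proprioceptive feedback.
Demonstrate end-to-end sim-to-real transfer and generalization to out-of-distribution body plans.
Evaluate performance gains against baselines that remove no links or remove links at random.
Show that controlled kinematic self-destruction has potential benefits for robot adaptability and longevity.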

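Proposed method

Formulate self-destruction and locomotion as a sequence modeling problem, and train expert policies with reinforcement learning on hand-designed morphologies.
Distill the expert trajectories into a causal transformer whose output actions both detach modules and propel the robot.
Define a per-timestep reward that trades off displacement, trajectory efficiency, and retention of active joints to guide learning.
Introduce Prompt Reset to prevent degenerate loops when the controller encounters out-of-distribution states.
Narrow the sim-to-real gap by adding real-world rollouts (open-loop real trajectories are injected into training).
Model detachment in MuJoCo as torque-based module removal, and randomize the detachment torque for domain randomization.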
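
The torque-based detachment and the three-term reward described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Joint` class, the threshold range of 0.8 to 1.2, the reward weights, and all function names are hypothetical and chosen only for illustration, and the sketch runs in plain Python rather than MuJoCo.

```python
import random
from dataclasses import dataclass


@dataclass
class Joint:
    """One breakable joint; break_torque is a hypothetical threshold (N*m)."""
    break_torque: float
    attached: bool = True


def sample_joints(n_joints, low=0.8, high=1.2, rng=None):
    """Domain randomization: draw each break threshold from an assumed range."""
    rng = rng or random.Random()
    return [Joint(break_torque=rng.uniform(low, high)) for _ in range(n_joints)]


def apply_detachment(joints, torques):
    """Detach every attached joint whose torque magnitude exceeds its threshold.

    Detachment is irreversible: a detached joint is never reattached.
    Returns the indices of the joints that broke on this step.
    """
    broken = []
    for i, (joint, tau) in enumerate(zip(joints, torques)):
        if joint.attached and abs(tau) > joint.break_torque:
            joint.attached = False
            broken.append(i)
    return broken


def step_reward(dx, path_len, joints, w_disp=1.0, w_eff=0.5, w_keep=0.1):
    """Illustrative per-step reward with assumed weights.

    dx        forward displacement this step (m)
    path_len  distance actually travelled this step (m)
    The three terms mirror displacement, trajectory efficiency, and
    retention of active joints; the exact form in the paper may differ.
    """
    efficiency = abs(dx) / path_len if path_len > 0 else 0.0
    kept = sum(j.attached for j in joints) / len(joints)
    return w_disp * dx + w_eff * efficiency + w_keep * kept
```

Randomizing the threshold per joint and per episode means the policy cannot rely on one exact breaking torque, which is the usual motivation for domain randomization when the real joint strength is uncertain.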

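Experimental results

Research questions

RQ1: Can a single universal transformer controller learn both module self-destruction and the subsequent locomotion across many morphologies?
RQ2: Compared with baselines that remove no modules or remove modules at random, does kinematic self-destruction improve locomotion on unseen (out-of-distribution) morphologies?
RQ3: How well does the learned policy transfer from simulation to real robots, including out-of-distribution designs?
RQ4: Does the proposed Prompt Reset mechanism mitigate degenerate behavior on novel morphologies?
RQ5: How does adding real-world trajectories to training affect sim-to-real performance?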

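Key findings

The transformer controller selects which modules to detach on its own and achieves forward locomotion after removing them.
In in-distribution settings, self-destruction improves locomotion performance over random removal (p = 0.033).
Evaluated in simulation on 100 out-of-distribution morphologies, self-destruction reached a higher mean speed (μ = 0.168 m/s, σ = 0.105) than the baseline (μ = 0.080 m/s, σ = 0.058; p < 0.001).
Prompt Reset reduces degenerate loops and improves adaptability (an ablation shows slower locomotion without Prompt Reset, p < 0.01).
Sim-to-real transfer: two in-distribution physical robots achieved 100% success at redesign and locomotion. Out-of-distribution real morphologies also succeeded, with self-destruction outperforming the baseline in directedness and, in some cases, in speed.
Real-world results indicate that self-destructed designs yield more reliable locomotion trajectories on unseen morphologies.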
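
As a rough sanity check on the reported out-of-distribution speeds, the summary statistics can be turned into an effect size and a test statistic. The sketch below assumes two independent groups of n = 100 each (the review reports 100 morphologies but does not state the sample size per condition or which test the authors used), so the numbers are illustrative only.

```python
import math


def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (m1 - m2) / standard_error


def cohens_d(m1, s1, m2, s2):
    """Cohen's d using the pooled standard deviation for equal group sizes."""
    pooled_sd = math.sqrt((s1**2 + s2**2) / 2)
    return (m1 - m2) / pooled_sd


# Reported means and standard deviations (m/s); n = 100 per group is an assumption.
t = welch_t(0.168, 0.105, 100, 0.080, 0.058, 100)
d = cohens_d(0.168, 0.105, 0.080, 0.058)
print(f"t = {t:.2f}, d = {d:.2f}")
```

Under these assumptions the statistic comes out at roughly t ≈ 7.3 with d ≈ 1.0, a large effect that is consistent with the reported p < 0.001.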


This review was generated by AI and checked by a human editor.