QUICK REVIEW

[Paper Review] Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity

Deepak Pathak, Chris Xiaoxuan Lu | arXiv (Cornell University) | Feb 14, 2019

Modular Robots and Swarm Intelligence | 35 references | 45 citations

TL;DR

The paper trains primitive robotic limbs to self-assemble into composite morphologies while learning a modular controller, implemented as a Dynamic Graph Network (DGN), and reports better generalization to unseen morphologies and environments than static, monolithic baselines.

ABSTRACT

Contemporary sensorimotor learning approaches typically start with an existing complex agent (e.g., a robotic arm), which they learn to control. In contrast, this paper investigates a modular co-evolution strategy: a collection of primitive agents learns to dynamically self-assemble into composite bodies while also learning to coordinate their behavior to control these bodies. Each primitive agent consists of a limb with a motor attached at one end. Limbs may choose to link up to form collectives. When a limb initiates a link-up action, and there is another limb nearby, the latter is magnetically connected to the 'parent' limb's motor. This forms a new single agent, which may further link with other agents. In this way, complex morphologies can emerge, controlled by a policy whose architecture is in explicit correspondence with the morphology. We evaluate the performance of these dynamic and modular agents in simulated environments. We demonstrate better generalization to test-time changes both in the environment, as well as in the structure of the agent, compared to static and monolithic baselines. Project video and code are available at https://pathak22.github.io/modular-assemblies/

Motivation & Objective

Motivate modular self-assembly as a route to adaptable, generalizable agents inspired by multicellular organization.
Co-evolve control policies and morphology by treating linking/unlinking as actions within an RL framework.
Develop a modular policy that aligns with the evolving morphology via dynamic graph networks (DGN).
Demonstrate improved zero-shot generalization to novel morphologies and environments vs monolithic baselines.

Proposed method

Represent the self-assembled agent as a graph of limbs connected by magnetic joints.
Each limb runs a shared policy that outputs torques plus linking/unlinking actions.
The graph topology changes over time as limbs link and unlink according to the policy's outputs, hence the name Dynamic Graph Network (DGN).
Connected limbs coordinate by passing messages along graph edges; each limb's input is limited to its local sensory data.
Optimize with PPO to maximize the sum of limb rewards across the evolving graph.
Evaluate on standing up and locomotion tasks with varied terrains and limb counts.
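
The method steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Limb`, `Morphology`, `limb_policy`, and `bottom_up_pass`, the scalar observations, and the constant weights are all assumptions made for illustration, and only a single bottom-up message pass is shown. It shows the two ideas the review describes: link and unlink actions that change the graph, and one shared per-limb policy evaluated over whatever graph currently exists.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Limb:
    """One primitive agent, stored as a node in the morphology graph."""
    idx: int
    parent: Optional[int] = None                 # None means this limb is a root
    children: List[int] = field(default_factory=list)


class Morphology:
    """A forest of limbs whose edges change as limbs link and unlink."""

    def __init__(self, n_limbs: int) -> None:
        self.limbs: Dict[int, Limb] = {i: Limb(i) for i in range(n_limbs)}

    def root_of(self, idx: int) -> int:
        while self.limbs[idx].parent is not None:
            idx = self.limbs[idx].parent
        return idx

    def link(self, parent: int, child: int) -> bool:
        """Attach `child` under `parent`; refuse if it would break the tree."""
        if self.limbs[child].parent is not None:
            return False                         # child is already attached
        if self.root_of(parent) == child:
            return False                         # would create a cycle
        self.limbs[child].parent = parent
        self.limbs[parent].children.append(child)
        return True

    def unlink(self, child: int) -> bool:
        """Detach `child` (and its subtree) from its parent."""
        parent = self.limbs[child].parent
        if parent is None:
            return False
        self.limbs[parent].children.remove(child)
        self.limbs[child].parent = None
        return True


def limb_policy(obs: float, child_msg: float) -> Tuple[float, float]:
    """Toy shared policy: every limb uses the same made-up weights."""
    hidden = math.tanh(0.5 * obs + 0.25 * child_msg)
    return hidden, hidden                        # (torque, message to parent)


def bottom_up_pass(morph: Morphology, observations: Dict[int, float]) -> Dict[int, float]:
    """Compute one torque per limb, passing messages from children to parents."""
    torques: Dict[int, float] = {}

    def visit(idx: int) -> float:
        child_msg = sum(visit(c) for c in morph.limbs[idx].children)
        torque, message = limb_policy(observations[idx], child_msg)
        torques[idx] = torque
        return message

    for limb in morph.limbs.values():
        if limb.parent is None:                  # start from the root of each assembly
            visit(limb.idx)
    return torques
```

In this sketch, calling `link` or `unlink` between two calls to `bottom_up_pass` changes which messages reach which limb, while `limb_policy` itself stays the same. In an actual training setup the toy policy would be a learned network and the link and unlink decisions would come from the policy's own outputs.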

Experimental results

Research questions

RQ1: Can a jointly learned control-and-morphology policy generalize to unseen morphologies and environments?
RQ2: Does a modular, graph-structured policy transfer better to changes in the number of limbs than monolithic policies?
RQ3: What is the impact of message passing in coordinating control across dynamically assembled morphologies?

Key findings

Dynamic Graph Network policies outperform monolithic baselines on standing and locomotion tasks.
DGN policies show strong zero-shot generalization to different limb counts (e.g., from 6 to 4 or 12 limbs).
Modularity in software (shared limb policies) and hardware (self-assembly) both contribute to better training and generalization than either alone.
DGN with message passing helps in long-horizon coordination (standing up) more than in locomotion where various morphologies can succeed.
Policies trained under a morphological curriculum generalize better to novel terrains and disturbances (wind, water, obstacles).
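
One architectural reason a shared per-limb policy can be applied to a different limb count at all is that its parameter shapes do not depend on the number of limbs, whereas a monolithic policy over concatenated observations is tied to the count it was trained with. The Python sketch below illustrates only that structural point, not the reported performance. `OBS_DIM`, `shared_limb_policy`, `modular_act`, `monolithic_act`, and the linear form of both policies are hypothetical choices for illustration.

```python
from typing import List, Sequence

OBS_DIM = 2  # hypothetical size of each limb's local observation


def shared_limb_policy(obs: Sequence[float], weights: Sequence[float]) -> float:
    """Torque for one limb from its own observation, using shared weights."""
    return sum(w * o for w, o in zip(weights, obs))


def modular_act(all_obs: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Apply the same per-limb policy to every limb, however many there are."""
    return [shared_limb_policy(obs, weights) for obs in all_obs]


def monolithic_act(all_obs: Sequence[Sequence[float]],
                   weight_matrix: Sequence[Sequence[float]]) -> List[float]:
    """Single linear policy over the concatenated observations of all limbs."""
    flat = [x for obs in all_obs for x in obs]
    if any(len(row) != len(flat) for row in weight_matrix):
        raise ValueError("input size does not match the limb count used in training")
    return [sum(w * x for w, x in zip(row, flat)) for row in weight_matrix]
```

With weights sized for 6 limbs, `modular_act` accepts 4, 6, or 12 limbs unchanged, while `monolithic_act` raises `ValueError` on anything other than 6. Whether the modular policy then performs well at the new limb count is an empirical question that the paper's experiments address.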


This review was created by AI and reviewed by human editors.