[Paper Review] Data-Driven Physics Embedded Dynamics with Predictive Control and Reinforcement Learning for Quadrupeds
This paper integrates Lagrangian Neural Networks into an RL–MPC framework for quadruped locomotion, using inverse-dynamics MPC to achieve real-time planning with physical consistency and improved sample efficiency.
State-of-the-art quadrupedal locomotion approaches integrate Model Predictive Control (MPC) with Reinforcement Learning (RL), enabling complex motion capabilities with planning and terrain-adaptive behaviors. However, they often face compounding errors over long horizons and have limited interpretability due to the absence of physical inductive biases. We address these issues by integrating Lagrangian Neural Networks (LNNs) into an RL–MPC framework, enabling physically consistent dynamics learning. At deployment, our inverse-dynamics infinite-horizon MPC scheme avoids costly matrix inversions, improving computational efficiency by up to 4× with minimal loss of task performance. We validate our framework through multiple ablations of the proposed LNN and its variants. We show improved sample efficiency, reduced long-horizon error, and faster real-time planning compared to unstructured neural dynamics. Lastly, we also test our framework on the Unitree Go1 robot to show real-world viability.
Research Motivation and Objectives
- Motivate combining physics-based inductive biases with data-driven learning to improve interpretability and long-horizon planning in quadruped locomotion.
- Develop a Lagrangian Neural Network (LNN) that yields physically consistent dynamics for model-based planning.
- Create an RL–MPC training framework with a physics-informed Dreamer module to improve sample efficiency and robustness.
- Use an inverse-dynamics MPC planner at deployment to reduce computation while maintaining performance on real hardware.
Proposed Method
- Parameterize the mass matrix to be symmetric positive-definite using a learnable lower-triangular factor.
- Learn LNN-based dynamics and use a Dreamer module to generate physics-informed imagined trajectories for policy training.
- Use an encoder to map proprioceptive history to a full state estimate, enabling the Dreamer module to roll out future states with the LNN dynamics.
- Train with an asymmetric actor-critic setup that pairs a privileged critic with physics-informed Dreamer targets, while the expert actor interacts with the environment via PPO.
- Deploy an inverse-dynamics MPC solver that optimizes over joint trajectories to avoid mass matrix inversions during real-time planning.
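The first and last items above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the softplus on the diagonal, the `eps` offset, the 12-joint dimension, and the random stand-ins for network outputs and bias terms are all assumptions made for illustration, and contact forces are omitted. It shows that building the mass matrix as `L @ L.T` from a lower-triangular factor yields a symmetric positive-definite matrix, and that the inverse-dynamics direction (acceleration to torque) needs only a matrix-vector product, whereas the forward direction (torque to acceleration) requires a linear solve.

```python
import numpy as np


def spd_mass_matrix(raw_params, n, eps=1e-3):
    """Build a symmetric positive-definite matrix M = L L^T + eps * I.

    raw_params: flat array of n * (n + 1) / 2 unconstrained values, standing
    in for the output of a learned network evaluated at a configuration q.
    """
    L = np.zeros((n, n))
    L[np.tril_indices(n)] = raw_params
    # Softplus keeps the diagonal of L strictly positive (an assumed choice).
    diag = np.diag_indices(n)
    L[diag] = np.logaddexp(0.0, L[diag])
    return L @ L.T + eps * np.eye(n)


def inverse_dynamics(M, bias, qdd):
    """Torque for a desired acceleration: tau = M qdd + bias (no solve)."""
    return M @ qdd + bias


def forward_dynamics(M, bias, tau):
    """Acceleration for a given torque: solves M qdd = tau - bias."""
    return np.linalg.solve(M, tau - bias)


n = 12  # hypothetical: number of actuated joints
rng = np.random.default_rng(0)
M = spd_mass_matrix(rng.normal(size=n * (n + 1) // 2), n)
bias = rng.normal(size=n)     # stand-in for Coriolis and gravity terms
qdd_des = rng.normal(size=n)  # stand-in for a planned joint acceleration

tau = inverse_dynamics(M, bias, qdd_des)
qdd_check = forward_dynamics(M, bias, tau)  # round trip recovers qdd_des
```

In a planner that optimizes over joint trajectories, the accelerations are decision variables, so only `inverse_dynamics` would sit in the inner loop; `forward_dynamics` is included here purely as a consistency check.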
Experimental Results
Research Questions
- RQ1: Can Lagrangian-based dynamics with inductive physics priors improve long-horizon planning for quadruped locomotion?
- RQ2: Does integrating LNNs with an RL–MPC framework enhance sample efficiency and real-time planning compared to forward-dynamics or unstructured models?
- RQ3: Is inverse-dynamics-based MPC feasible for real-time deployment on quadrupeds across diverse terrains?
- RQ4: How does the proposed architecture perform on real hardware (Unitree Go1) across multiple terrains?
- RQ5: What are the trade-offs between inference speed and planning performance in high-dimensional legged systems?
Key Findings
- The framework achieves better sample efficiency and reduced long-horizon error compared to unstructured NN dynamics.
- Inverse-dynamics MPC reduces deployment latency by up to 4× relative to forward-dynamics LNN planners.
- The method maintains competitive returns across horizons and terrains, approaching DeLaN performance with significantly lower latency.
- Hardware experiments on Unitree Go1 demonstrate real-world viability across six terrain types.
- The approach yields stable multi-terrain performance and improved planning robustness over ONN baselines.
This review was generated by AI and checked by a human editor.