[Paper Review] Sim-to-Real Robot Learning from Pixels with Progressive Nets
This paper shows that progressive networks can transfer end-to-end, pixel-to-action policies from simulation to a real robot, enabling fast learning on the physical robot with sparse rewards.
Applying end-to-end learning to solve complex, interactive, pixel-driven control tasks on a robot is an unsolved problem. Deep Reinforcement Learning algorithms are too slow to achieve performance on a real robot, but their potential has been demonstrated in simulated environments. We propose using progressive networks to bridge the reality gap and transfer learned policies from simulation to the real world. The progressive net approach is a general framework that enables reuse of everything from low-level visual features to high-level policies for transfer to new tasks, enabling a compositional, yet simple, approach to building complex skills. We present an early demonstration of this approach with a number of experiments in the domain of robot manipulation that focus on bridging the reality gap. Unlike other proposed approaches, our real-world experiments demonstrate successful task learning from raw visual input on a fully actuated robot manipulator. Moreover, rather than relying on model-based trajectory optimisation, the task learning is accomplished using only deep reinforcement learning and sparse rewards.
Research Motivation and Objectives
- Motivate and address the reality gap in end-to-end pixel-to-action robot control learned via deep reinforcement learning.
- Propose progressive networks as a transfer-learning framework to reuse learned features and policies across tasks and domains.
- Show, through real-robot experiments, that progressive nets accelerate learning on a fully actuated robot manipulator with sparse rewards.
Proposed Method
- Use an actor-critic network trained in simulation with RGB inputs and joint velocity outputs.
- Instantiate a new column (network) for the real-robot task with lateral connections from the simulation column.
- Initialise the real-robot output layer to mirror the simulation column to bias exploration.
- Allow columns to have differing capacities to accommodate sim-to-real differences.
- Evaluate across tasks and perturbations to compare progressive transfer with finetuning and from-scratch learning.
- Demonstrate ability to extend to proprioceptive inputs by adding a column that uses proprioception while reusing visual features via lateral connections.
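The column-and-lateral-connection idea in the list above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single hidden layer per column, uses a short feature vector in place of RGB images, and omits the actor-critic training loop entirely. The class names `SimColumn` and `RealColumn`, the layer sizes, and the specific initialisation (zero weights on the new column's own output path, lateral output weights copied from the simulation column) are illustrative assumptions chosen so that the new column's initial policy matches the simulation column's.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def rand_matrix(rows, cols, rng, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

class SimColumn:
    """Column 1: stands in for the policy trained in simulation, then frozen."""
    def __init__(self, n_in, n_hidden, n_out, rng):
        # Random weights stand in for trained ones in this sketch.
        self.W_hidden = rand_matrix(n_hidden, n_in, rng)
        self.W_out = rand_matrix(n_out, n_hidden, rng)

    def forward(self, x):
        h = relu(matvec(self.W_hidden, x))
        return h, matvec(self.W_out, h)

class RealColumn:
    """Column 2: new column for the real robot, fed by column 1 laterally."""
    def __init__(self, sim, n_in, n_hidden, rng):
        n_out = len(sim.W_out)
        self.sim = sim  # frozen: its weights are never updated here
        # The new column may have a different (here smaller) capacity.
        self.W_hidden = rand_matrix(n_hidden, n_in, rng)
        # Assumption: own output weights start at zero ...
        self.W_out = [[0.0] * n_hidden for _ in range(n_out)]
        # ... and lateral output weights start as a copy of column 1's
        # output layer, so the initial policy mirrors the simulation policy.
        self.U_out = [row[:] for row in sim.W_out]

    def forward(self, x):
        h_sim, _ = self.sim.forward(x)          # features reused from column 1
        h = relu(matvec(self.W_hidden, x))      # column 2's own features
        own = matvec(self.W_out, h)
        lateral = matvec(self.U_out, h_sim)     # lateral connection
        return [a + b for a, b in zip(own, lateral)]

rng = random.Random(0)
sim = SimColumn(n_in=4, n_hidden=8, n_out=3, rng=rng)
real = RealColumn(sim, n_in=4, n_hidden=5, rng=rng)

x = [0.2, -0.1, 0.7, 0.4]  # placeholder observation, not real pixels
_, sim_action = sim.forward(x)
real_action = real.forward(x)
```

Before any real-robot training, `real_action` equals `sim_action`, which is one way to bias early exploration toward the behaviour learned in simulation. Training would then update only `W_hidden`, `W_out`, and `U_out` of the new column while the simulation column stays fixed.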
Experimental Results
Research Questions
- RQ1: Can progressive networks transfer learned policies from simulation to a real robot when trained with pixel inputs and sparse rewards?
- RQ2: Do progressive networks enable faster and more stable real-robot learning compared to finetuning or learning from scratch?
- RQ3: How does adding or changing input modalities (e.g., proprioception) affect transfer performance within a progressive network framework?
- RQ4: Is the approach robust to environment perturbations and curriculum-like task variations?
Key Findings
- The progressive second column achieves higher real-robot performance (34 points) than a finetuned column or from-scratch baselines.
- Randomly initialized columns fail to learn on the real robot, showing the need for transfer scaffolding.
- Progressive networks demonstrate greater stability and higher final performance than finetuning under environment changes.
- Adding proprioceptive inputs can be integrated via a new column while reusing visual features through lateral connections, enabling improved transfer to dynamic tasks.
- Transfer via progressive nets reduces the real-robot training time required, compared with learning from scratch, by leveraging simulation-trained features.
This review was generated by AI and checked by a human editor.