QUICK REVIEW

[Paper Review] Learning to Explore using Active Neural SLAM

Devendra Singh Chaplot, Dhiraj Gandhi | arXiv (Cornell University) | Apr 10, 2020

Robot Manipulation and Learning | Cited by 220

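ONE-SENTENCE SUMMARY

Active Neural SLAM builds a modular, hierarchical navigation system comprising a learned Neural SLAM module, a Global policy, and a Local policy; it achieves state-of-the-art exploration performance and transfers successfully to the PointGoal task.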

ABSTRACT

This work presents a modular and hierarchical approach to learn policies for exploring 3D environments, called 'Active Neural SLAM'. Our approach leverages the strengths of both classical and learning-based methods, by using analytical path planners with learned SLAM module, and global and local policies. The use of learning provides flexibility with respect to input modalities (in the SLAM module), leverages structural regularities of the world (in global policies), and provides robustness to errors in state estimation (in local policies). Such use of learning within each module retains its benefits, while at the same time, hierarchical decomposition and modular training allow us to sidestep the high sample complexities associated with training end-to-end policies. Our experiments in visually and physically realistic simulated 3D environments demonstrate the effectiveness of our approach over past learning and geometry-based approaches. The proposed model can also be easily transferred to the PointGoal task and was the winning entry of the CVPR 2019 Habitat PointGoal Navigation Challenge.

RESEARCH MOTIVATION AND OBJECTIVES

Improve exploration efficiency in unknown 3D environments while remaining robust to state-estimation errors.
Propose a modular architecture that combines a learned SLAM module with classical planning.
Leverage hierarchical decision making to reduce sample complexity compared to end-to-end learning.
Demonstrate transfer to PointGoal navigation and real-world applicability.

PROPOSED METHOD

Introduce a three-component architecture: Neural SLAM module, Global policy, and Local policy, interfaced via a map and an analytic planner.
Neural SLAM comprises a Mapper and Pose Estimator that predict an egocentric map and pose from RGB and sensor data.
Global policy consumes the map and pose to output long-term goals, which an analytic planner based on the Fast Marching Method converts into short-term goals.
Local policy is a learned policy (with a ResNet18 encoder) that maps RGB observations to actions to reach the short-term goal.
Training is modular: map/pose supervision for SLAM, RL for the Global policy, and imitation learning for the Local policy, enabling sample efficiency.
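
The step from a long-term goal to a short-term goal can be illustrated with a small sketch. This is not the authors' planner: the summary says the planner uses the Fast Marching Method, whereas the code below substitutes a plain breadth-first search on a 4-connected occupancy grid, which gives a comparable distance-to-goal field in the discrete case. The function names `geodesic_distances` and `short_term_goal`, the `horizon` parameter, and the grid layout are all hypothetical and chosen only for illustration.

```python
from collections import deque

# 4-connected neighbourhood (an assumption; a real planner may use finer motion).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def geodesic_distances(occupancy, goal):
    """Distance in cells from every free cell to `goal`, computed with BFS.

    BFS stands in for the Fast Marching Method here. `occupancy[r][c]` is
    True for obstacle cells. Cells that cannot reach the goal stay None.
    """
    rows, cols = len(occupancy), len(occupancy[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([tuple(goal)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not occupancy[nr][nc] and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def short_term_goal(occupancy, agent, long_term_goal, horizon=3):
    """Follow the distance field downhill for at most `horizon` cells.

    The cell reached is returned as the short-term goal that a local
    policy would then try to reach.
    """
    dist = geodesic_distances(occupancy, long_term_goal)
    r, c = agent
    if dist[r][c] is None:
        return (r, c)  # goal unreachable on the current map: stay in place
    for _ in range(horizon):
        if dist[r][c] == 0:
            break  # already at the long-term goal
        best = (r, c)
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < len(occupancy) and 0 <= nc < len(occupancy[0])
            if inside and dist[nr][nc] is not None:
                if dist[nr][nc] < dist[best[0]][best[1]]:
                    best = (nr, nc)
        r, c = best
    return (r, c)


if __name__ == "__main__":
    F, T = False, True
    grid = [
        [F, F, F, F],
        [F, T, T, F],
        [F, F, F, F],
    ]
    print(short_term_goal(grid, agent=(0, 0), long_term_goal=(2, 3)))
```

In a loop of this kind, the map would be updated after every step and the short-term goal recomputed, so errors in an earlier map estimate can be corrected as new observations arrive.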

EXPERIMENTAL RESULTS

Research Questions

RQ1: How can learning be integrated into a classical navigation pipeline to improve exploration efficiency?
RQ2: Does a modular, hierarchical setup with a learned SLAM and policies outperform end-to-end learning baselines in 3D exploration tasks?
RQ3: Can the approach generalize across domains (e.g., Gibson to Matterport) and transfer to PointGoal tasks without retraining?
RQ4: What is the impact of each module (SLAM, Global policy, Local policy) on performance and robustness to sensor/actuation noise?

Key Findings

The Active Neural SLAM model outperforms baselines on exploration metrics in both Gibson and MP3D domains.
Hierarchical modular design reduces search space and improves sample efficiency compared to end-to-end baselines.
The method exhibits strong domain generalization, transferring Gibson-trained policies to Matterport with improved coverage.
The approach transfers to PointGoal navigation without additional training and wins the CVPR 2019 Habitat PointGoal Navigation Challenge.
Ablation studies show the Local Policy and the pose estimation supervision contribute to robustness and long-horizon planning.
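
The findings above refer to exploration metrics and coverage without defining them. As a rough sketch, assuming coverage is measured as the portion of traversable map cells the agent has observed, it could be computed as below. The function name `coverage`, the boolean-grid representation, and the `cell_size_m` default are hypothetical and are not taken from this summary.

```python
def coverage(explored, traversable, cell_size_m=0.05):
    """Return (explored traversable area in m^2, fraction of traversable area explored).

    `explored` and `traversable` are equally sized 2D grids of booleans.
    `cell_size_m` is the side length of one grid cell (an assumed value).
    """
    total = 0  # number of traversable cells in the ground-truth map
    seen = 0   # traversable cells that the agent has explored
    for explored_row, traversable_row in zip(explored, traversable):
        for is_explored, is_traversable in zip(explored_row, traversable_row):
            if is_traversable:
                total += 1
                if is_explored:
                    seen += 1
    area_m2 = seen * cell_size_m ** 2
    ratio = seen / total if total else 0.0
    return area_m2, ratio


if __name__ == "__main__":
    traversable = [[True, True], [True, False]]
    explored = [[True, False], [True, True]]
    print(coverage(explored, traversable, cell_size_m=1.0))
```

Reporting both an absolute area and a ratio is useful because scenes differ in size: a ratio is comparable across scenes, while the absolute area shows how much ground was actually covered.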

This review was generated by AI and reviewed by a human editor.