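[Paper Summary] SmoothTurn: Learning to Turn Smoothly for Agile Navigation with Quadrupedal Robots
SmoothTurn combines a sequential goal-reaching reward, lookahead observations, and automatic curriculum learning to achieve faster traversal and smoother turning than a single-goal baseline, both in simulation and on a real quadrupedal robot.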
Quadrupedal robots show great potential for valuable real-world applications such as fire rescue and industrial inspection. Such applications often require urgency and the ability to navigate agilely, which in turn demands the capability to change directions smoothly while running at high speed. Existing approaches for agile navigation typically learn a single-goal reaching policy by encouraging the robot to stay at the target position after reaching it. As a result, when the policy is used to reach sequential goals that require changing directions, it cannot anticipate upcoming maneuvers or maintain momentum across the switch of goals, thereby preventing the robot from fully exploiting its agility potential. In this work, we formulate the task as sequential local navigation, extending the single-goal-conditioned local navigation formulation in prior work. We then introduce SmoothTurn, a learning-based control framework that learns to turn smoothly while running rapidly for agile sequential local navigation. The framework adopts a novel sequential goal-reaching reward, an expanded observation space with a lookahead window for future goals, and an automatic goal curriculum that progressively expands the difficulty of sampled goal sequences based on the goal-reaching performance. The trained policy can be directly deployed on real quadrupedal robots with onboard sensors and computation. Both simulation and real-world empirical results show that SmoothTurn learns an agile locomotion policy that performs smooth turning across goals, with emergent behaviors such as controlling momentum when switching goals, facing towards the future goal in advance, and planning efficient paths. We have provided video demos of the learned motions in the supplementary materials. The source code and trained policies will be made available upon acceptance.
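Motivation and Objectives
- Improve agile navigation for quadrupedal robots by enabling smooth turns across local goals in environments with obstacles.
- Formulate sequential local navigation to address momentum and direction changes between consecutive goals.
- Develop a reinforcement learning framework with a sequential reward, lookahead observations, and an automatic curriculum to train smooth-turning locomotion.
- Validate the method against a single-goal baseline in simulation and real-world experiments on a Unitree Go2.
- Offer insight into emergent behaviors during goal transitions, such as momentum control and facing the next goal in advance.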
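Proposed Method
- Formulate sequential local navigation with an ordered sequence of local goals and relaxed, multi-threshold goal-reaching conditions, so that motion continues across goals.
- Introduce a novel sequential goal-reaching reward that credits incremental progress along the whole goal sequence and discourages stop-and-go behavior.
- Add the future goals in a lookahead window to the observation, enabling trajectory-aware control and momentum management.
- Implement an automatic goal curriculum that widens goal distance and turning difficulty based on a rolling success rate, which stabilizes training.
- Feed a 47-dimensional proprioceptive input plus an n-goal lookahead (n=2 in the main setting) to a reinforcement learning policy trained with PPO in Isaac Gym; a PD controller executes the policy's actions.
- Evaluate against a single-goal baseline on four sequential turning tasks, in simulation and on a real Unitree Go2.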
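
The lookahead observation and the relaxed goal switching listed above can be illustrated with a small sketch. It is not the authors' implementation. It assumes each goal is a 2D position expressed in the robot's body frame, that the window is padded with the final goal near the end of a sequence, and that a single distance threshold triggers the switch; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Goal = Tuple[float, float]  # (x, y) in the world frame, metres (assumed encoding)


def lookahead_window(goals: Sequence[Goal], current_idx: int, n: int = 2) -> List[Goal]:
    """Return the next n goals starting at current_idx.

    Padding with the final goal is an assumption made for this sketch.
    """
    window = list(goals[current_idx:current_idx + n])
    while len(window) < n:
        window.append(goals[-1])
    return window


def to_body_frame(pos: Goal, yaw: float, goals: Sequence[Goal]) -> List[Goal]:
    """Rotate world-frame goal offsets by -yaw so they are relative to the robot."""
    c, s = math.cos(yaw), math.sin(yaw)
    rel = []
    for gx, gy in goals:
        dx, dy = gx - pos[0], gy - pos[1]
        rel.append((c * dx + s * dy, -s * dx + c * dy))
    return rel


def build_observation(proprio: Sequence[float], pos: Goal, yaw: float,
                      goals: Sequence[Goal], current_idx: int, n: int = 2) -> List[float]:
    """Concatenate proprioception with the flattened n-goal lookahead."""
    rel = to_body_frame(pos, yaw, lookahead_window(goals, current_idx, n))
    return list(proprio) + [v for goal in rel for v in goal]


def advance_goal(pos: Goal, goals: Sequence[Goal], current_idx: int, threshold: float) -> int:
    """Switch to the next goal once within `threshold`, without requiring a stop."""
    gx, gy = goals[current_idx]
    reached = math.hypot(gx - pos[0], gy - pos[1]) <= threshold
    if reached and current_idx < len(goals) - 1:
        return current_idx + 1
    return current_idx
```

Under these assumptions, a 47-dimensional proprioceptive vector with n=2 yields a 51-dimensional policy input. Because the switch depends only on distance and not on the robot coming to rest, the policy is free to carry speed through the transition.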

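Experimental Results
Research Questions
- RQ1: How can sequential local navigation be formulated so that the robot turns smoothly at high speed across a series of local goals?
- RQ2: Does a sequential goal-reaching reward combined with lookahead observations yield smoother turning and faster traversal than a single-goal policy?
- RQ3: How do the automatic curriculum and the lookahead window affect learning efficiency and emergent turning behaviors?
- RQ4: Does the trained policy transfer from simulation to a real quadrupedal robot and outperform the baseline on real navigation tasks?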
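
To make the curriculum in RQ3 concrete, here is a minimal sketch of a success-rate-driven goal curriculum. It is an illustration under stated assumptions rather than the paper's procedure: the class name, the promotion threshold, the step sizes, the caps, and the sampling scheme are all hypothetical values chosen for the example.

```python
import math
import random
from collections import deque
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GoalCurriculum:
    # All numeric defaults below are illustrative assumptions, not values from the paper.
    min_distance: float = 1.0      # shortest gap between consecutive goals (m)
    max_distance: float = 2.0      # current upper bound on the gap (m)
    max_turn_angle: float = 0.5    # current bound on heading change per goal (rad)
    distance_cap: float = 6.0
    turn_cap: float = math.pi
    distance_step: float = 0.5
    turn_step: float = 0.25
    promote_at: float = 0.8        # rolling success rate needed to raise difficulty
    window: int = 100              # number of recent episodes in the rolling average

    def __post_init__(self) -> None:
        self._results = deque(maxlen=self.window)

    def success_rate(self) -> float:
        """Rolling success rate over the recorded episodes (0.0 if none)."""
        if not self._results:
            return 0.0
        return sum(self._results) / len(self._results)

    def record(self, success: bool) -> None:
        """Log one episode and widen the sampling ranges when the policy is doing well."""
        self._results.append(success)
        if len(self._results) == self.window and self.success_rate() >= self.promote_at:
            self.max_distance = min(self.max_distance + self.distance_step, self.distance_cap)
            self.max_turn_angle = min(self.max_turn_angle + self.turn_step, self.turn_cap)
            self._results.clear()  # re-measure performance at the new difficulty

    def sample_sequence(self, num_goals: int = 2,
                        rng: Optional[random.Random] = None) -> List[Tuple[float, float]]:
        """Sample a goal sequence whose gaps and turns respect the current bounds."""
        rng = rng or random.Random()
        x = y = heading = 0.0
        goals = []
        for _ in range(num_goals):
            heading += rng.uniform(-self.max_turn_angle, self.max_turn_angle)
            dist = rng.uniform(self.min_distance, self.max_distance)
            x += dist * math.cos(heading)
            y += dist * math.sin(heading)
            goals.append((x, y))
        return goals
```

Clearing the window after each promotion is one possible design choice: it forces the success rate to be re-estimated at the new difficulty before the ranges grow again, which keeps the expansion gradual.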
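Key Findings
- In simulation, SmoothTurn outperforms the single-goal baseline on multiple turning sequences, with a lower fall rate and a higher success rate while maintaining speed.
- With suitable threshold settings, SmoothTurn preserves momentum and completes goal sequences faster than the baseline, especially on narrow or sharp turns.
- SmoothTurn variants with relaxed goal-reaching conditions still achieve high success rates, suggesting that facing the upcoming goal in advance can further reduce completion time.
- A small lookahead window (n=2) and training with n=2 goals per episode already give near-optimal performance; larger lookahead windows or more training goals show diminishing returns.
- Real-world experiments on a Unitree Go2 confirm the simulation results: SmoothTurn achieves shorter traversal times than the baseline on all four turning tasks.
- The core emergent behaviors are maintaining momentum through turns, facing the upcoming goal in advance, and planning efficient paths that exploit the goal tolerance for smoother transitions.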

This summary was generated by AI and reviewed by a human editor.