QUICK REVIEW

[Paper Walkthrough] Integrated Exploration and Sequential Manipulation on Scene Graph with LLM-based Situated Replanning

Heqing Yang, Ziyuan Jiao|arXiv (Cornell University)|Feb 4, 2026

Multimodal Machine Learning Applications | Cited by 0

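ONE-SENTENCE SUMMARY

EPoG unifies exploration and sequential manipulation on scene graphs: it couples a graph-based global planner with an LLM-based situated local replanner, and it updates a belief graph from observations and LLM predictions in order to complete long-horizon tasks.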

ABSTRACT

In partially known environments, robots must combine exploration to gather information with task planning for efficient execution. To address this challenge, we propose EPoG, an Exploration-based sequential manipulation Planning framework on Scene Graphs. EPoG integrates a graph-based global planner with a Large Language Model (LLM)-based situated local planner, continuously updating a belief graph using observations and LLM predictions to represent known and unknown objects. Action sequences are generated by computing graph edit operations between the goal and belief graphs, ordered by temporal dependencies and movement costs. This approach seamlessly combines exploration and sequential manipulation planning. In ablation studies across 46 realistic household scenes and 5 long-horizon daily object transportation tasks, EPoG achieved a success rate of 91.3%, reducing travel distance by 36.1% on average. Furthermore, a physical mobile manipulator successfully executed complex tasks in unknown and dynamic environments, demonstrating EPoG's potential for real-world applications.
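The graph-edit step mentioned in the abstract can be pictured with a small Python sketch. It is an illustration under stated assumptions, not the authors' implementation: each scene graph is reduced to a hypothetical child-to-parent map (object to the thing it rests on or in), and the only edit operation modeled is moving an object to a new parent. The names `Move` and `edit_operations` are invented for this example.

```python
from typing import Dict, List, NamedTuple, Optional


class Move(NamedTuple):
    """One edit operation: re-attach `obj` from `src` to `dst` (hypothetical type)."""
    obj: str
    src: Optional[str]  # parent in the belief graph, None if the object is unknown
    dst: str            # parent required by the goal graph


def edit_operations(belief: Dict[str, str], goal: Dict[str, str]) -> List[Move]:
    """Return one Move per object whose parent differs between belief and goal.

    Assumption: both graphs are simplified to child -> parent maps.
    """
    ops = []
    for obj, dst in sorted(goal.items()):  # sorted only for a deterministic order
        src = belief.get(obj)              # None means "not yet observed"
        if src != dst:
            ops.append(Move(obj, src, dst))
    return ops


belief = {"cup": "table", "plate": "cabinet", "apple": "fridge"}
goal = {"cup": "sink", "plate": "cabinet", "apple": "plate"}
ops = edit_operations(belief, goal)
for op in ops:
    print(op)
```

Here the plate already matches the goal, so only the apple and the cup produce edit operations. An object missing from the belief map yields a `Move` with `src=None`, which is the kind of gap that exploration or an LLM prediction would have to fill before the move can be executed.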

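MOTIVATION AND GOALS

Address planning under partial observability by unifying exploration and manipulation on a scene graph representation.
Use an LLM to guide exploration, predict the locations of unknown objects, and provide situated replanning when exceptions occur.
Reduce overall execution cost by optimizing graph edits and action ordering under temporal constraints.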

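PROPOSED METHOD

Two-level planning: a global planner generates candidate action sequences on the belief graph using graph edit distance (GED) and topological sorting.
An LLM-informed EstimateBeliefGraph step fills in missing nodes by predicting likely locations of task-relevant objects.
The graph-based planner computes a minimum-cost sequence from the GED between the belief graph and the goal graph, followed by a constrained topological sort.
A local LLM-based planner (LLMPlanner) handles runtime exceptions by producing situated action sequences.
The belief graph is updated after every observation, so exploration and manipulation are interleaved in a closed loop.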
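The ordering step above (temporal dependencies plus movement cost) can be sketched as a Kahn-style topological sort with a greedy tie-break. This is a minimal illustration, not the paper's optimizer: the nearest-ready-action heuristic is swapped in for whatever cost minimization EPoG actually performs, and `order_actions`, `prereqs`, and the 2D coordinates are hypothetical names and data introduced only for the example.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def order_actions(
    locations: Dict[str, Point],
    prereqs: Dict[str, Set[str]],
    start: Point,
) -> List[str]:
    """Order actions so every prerequisite comes first, preferring nearby actions.

    locations: action name -> where the robot must be to perform it (assumed 2D).
    prereqs:   action name -> actions that must be completed beforehand.
    Greedy heuristic: among ready actions, pick the closest to the current position.
    """
    remaining = set(locations)
    done: Set[str] = set()
    order: List[str] = []
    pos = start
    while remaining:
        # An action is ready once all of its prerequisites are done.
        ready = [a for a in remaining if prereqs.get(a, set()) <= done]
        if not ready:
            raise ValueError("temporal dependencies cannot be satisfied")
        # Break distance ties by name so the result is deterministic.
        nxt = min(ready, key=lambda a: (math.dist(pos, locations[a]), a))
        order.append(nxt)
        done.add(nxt)
        remaining.remove(nxt)
        pos = locations[nxt]
    return order


locations = {
    "pick_cup": (1.0, 0.0),
    "open_cabinet": (5.0, 0.0),
    "take_plate": (5.0, 0.0),
    "place_cup": (6.0, 0.0),
}
prereqs = {"take_plate": {"open_cabinet"}, "place_cup": {"pick_cup"}}
plan = order_actions(locations, prereqs, start=(0.0, 0.0))
print(plan)
```

The greedy choice keeps the sketch short but is not guaranteed to minimize total travel distance; it only shows how dependency constraints and movement cost can be combined in one ordering pass.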

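EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Under partial observability, how can exploration and sequential manipulation be integrated effectively on a graph-based scene representation?
RQ2: Can LLMs improve exploration efficiency and provide robust situated replanning for long-horizon manipulation tasks?
RQ3: What are the trade-offs in success rate and execution cost among pure LLM-driven planning, exploration-first planning, and the integrated EPoG planning?
RQ4: How does the system handle motion planning exceptions (such as blocked, unreachable, collision, and unstable cases) during task execution?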

MAIN FINDINGS

EPoG achieved a 91.3% success rate on five long-horizon object transportation tasks across 46 household scenes.
Compared with the Exploration+PoG baseline, EPoG reduced the number of explored nodes by about 40.0% and the travel distance by about 36.1%.
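Pure LLM planning performs poorly on long-horizon tasks because it must reason over large scene graphs and is limited in spatial and temporal grounding.
The integrated framework is robust to unknown object states and dynamic environments, and it was validated on a real-world mobile manipulator, which indicates practical applicability.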

This walkthrough was generated by AI and reviewed by a human editor.