QUICK REVIEW

[Paper Review] RePLan: Robotic Replanning with Perception and Language Models

Marta Skreta, Zihan Zhou | arXiv (Cornell University) | 2024-01-08

Robot Manipulation and Learning | Citations: 5

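One-Line Summary

RePLan combines a hierarchical LLM-based planner, a vision-language model (VLM) perceiver, and a verifier to enable online replanning and reward generation for long-horizon robot tasks, and it performs strongly across multiple environments.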

ABSTRACT

Advancements in large language models (LLMs) have demonstrated their potential in facilitating high-level reasoning, logical reasoning and robotics planning. Recently, LLMs have also been able to generate reward functions for low-level robot actions, effectively bridging the interface between high-level planning and low-level robot control. However, the challenge remains that even with syntactically correct plans, robots can still fail to achieve their intended goals due to imperfect plans or unexpected environmental issues. To overcome this, Vision Language Models (VLMs) have shown remarkable success in tasks such as visual question answering. Leveraging the capabilities of VLMs, we present a novel framework called Robotic Replanning with Perception and Language Models (RePLan) that enables online replanning capabilities for long-horizon tasks. This framework utilizes the physical grounding provided by a VLM's understanding of the world's state to adapt robot actions when the initial plan fails to achieve the desired goal. We developed a Reasoning and Control (RC) benchmark with eight long-horizon tasks to test our approach. We find that RePLan enables a robot to successfully adapt to unforeseen obstacles while accomplishing open-ended, long-horizon goals, where baseline models cannot, and can be readily applied to real robots. Find more information at https://replan-lm.github.io/replan.github.io/

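Motivation and Goals

Enable autonomous execution of long-horizon robot tasks with minimal human intervention.
Connect high-level planning and low-level control through language models and visual grounding.
Introduce perception feedback and verification to reduce plan failures and hallucinations.
Demonstrate open-ended, multi-step task solving with a reward-generation pipeline that requires no reinforcement learning.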

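Proposed Method

A High-Level LLM Planner generates abstract subtasks from the user's goal.
A VLM Perceiver provides grounded state observations and object-state feedback.
A Low-Level LLM Planner converts high-level subtasks into low-level reward functions.
A motion controller (MuJoCo MPC) executes actions using the generated rewards.
An LLM Verifier checks and corrects planner outputs, ensuring that actions stay consistent with the goal.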
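The way these five modules interact can be sketched as a simple control loop. The code below is a minimal illustration under stated assumptions, not the authors' implementation: every name (`Modules`, `replan_loop`, the `toy_*` functions, the blocked-door scenario) is hypothetical, the LLM and VLM calls are replaced by hard-coded stubs, and the "MPC" is a one-step greedy search over three toy actions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, bool]
Reward = Callable[[State], float]


@dataclass
class Modules:
    """Hypothetical stand-ins for the five modules described above."""
    plan: Callable[[str, List[str]], List[str]]    # goal, feedback -> subtasks
    verify: Callable[[str, List[str]], List[str]]  # goal, subtasks -> checked subtasks
    to_reward: Callable[[str], Reward]             # subtask -> reward function
    run_mpc: Callable[[Reward], bool]              # reward -> did the motion succeed?
    perceive: Callable[[str], str]                 # question -> grounded observation


def replan_loop(goal: str, m: Modules, max_replans: int = 2) -> bool:
    """Plan, execute, and replan with perceiver feedback when a subtask fails."""
    feedback: List[str] = []
    for _ in range(max_replans + 1):
        subtasks = m.verify(goal, m.plan(goal, feedback))
        failed = None
        for subtask in subtasks:
            if not m.run_mpc(m.to_reward(subtask)):
                failed = subtask
                break
        if failed is None:
            return True  # every subtask in the current plan succeeded
        # Ask the perceiver why the subtask failed and feed that into the next plan.
        feedback.append(f"{failed!r} failed: {m.perceive('Why did it fail?')}")
    return False


# --- Toy world (invented for illustration): a door blocked by an obstacle ---
state: State = {"obstacle": True, "door_open": False}
plans_made: List[List[str]] = []


def step(s: State, action: str) -> State:
    """Toy dynamics: the door only opens once the obstacle is gone."""
    nxt = dict(s)
    if action == "push_obstacle":
        nxt["obstacle"] = False
    elif action == "pull_door" and not s["obstacle"]:
        nxt["door_open"] = True
    return nxt


def toy_plan(goal: str, feedback: List[str]) -> List[str]:
    blocked = any("blocked" in f for f in feedback)
    subtasks = ["remove obstacle", "open door"] if blocked else ["open door"]
    plans_made.append(subtasks)
    return subtasks


def toy_verify(goal: str, subtasks: List[str]) -> List[str]:
    known = {"remove obstacle", "open door"}
    return [s for s in subtasks if s in known]  # drop subtasks the robot cannot do


def toy_to_reward(subtask: str) -> Reward:
    if subtask == "open door":
        return lambda s: 1.0 if s["door_open"] else 0.0
    return lambda s: 0.0 if s["obstacle"] else 1.0


def toy_run_mpc(reward: Reward) -> bool:
    # One-step greedy search: pick the action whose next state scores highest.
    actions = ["noop", "push_obstacle", "pull_door"]
    best = max(actions, key=lambda a: reward(step(state, a)))
    state.update(step(state, best))
    return reward(state) >= 1.0


def toy_perceive(question: str) -> str:
    if state["obstacle"]:
        return "the door is blocked by an obstacle"
    return "nothing blocks the door"


succeeded = replan_loop(
    "open the door",
    Modules(toy_plan, toy_verify, toy_to_reward, toy_run_mpc, toy_perceive),
)
print(succeeded, plans_made)
```

In this toy run the first plan fails because the door is blocked, the perceiver reports the obstacle, and the second plan removes it before opening the door. A real system would replace each stub with a model call or a controller, and would likely need richer feedback than a single string.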

Figure 1: RePLan overview. It consists of five modules: a High-Level LLM Planner, a VLM Perceiver, a Low-Level LLM Planner for low-level reward generation, a motion controller with motor-control feedback, and an LLM Verifier. The robot’s task is to locate a green apple hidden in the microwave. Initi

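Experimental Results

Research Questions

RQ1: Can a multi-stage planning system that uses LLMs and VLMs carry out long-horizon, open-ended robot tasks with real-time replanning?
RQ2: Does adding perception-based feedback and verification improve task success rate and robustness compared with baseline LLM-based or non-grounded approaches?
RQ3: How effective is the LLM reward-generation pipeline for control in MPC-based robot manipulation?
RQ4: How does removing individual modules (Verifier, Perceiver, replanning) affect long-horizon task performance?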

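Key Results

RePLan achieves an average success rate of 88.6% across seven tasks, substantially higher than the baselines.
Compared with Language to Rewards, RePLan shows a 3.6x improvement in overall task completion.
Removing the Verifier, Perceiver, or Replan module degrades performance, with the largest drops occurring when the Perceiver or Replan module is removed.
Performance varies by task (for example, Task 3 is the hardest), reflecting challenges such as obstacles and setting the correct reward priorities.
RePLan demonstrates open-ended problem solving, including replanning after failures and handling unseen obstacles.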

Figure 2: Number of actions the robot executed in each task averaged over ten runs. Actions requiring the Perceiver are shown in pink while those executed using MPC are shown in purple. Standard deviations are shown using gray bars while the minimum and maximum number of actions are shown using gray


This review was generated by AI and reviewed by a human editor.