QUICK REVIEW

[Paper Review] AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers

Yongchao Chen, Jacob Arkin|arXiv (Cornell University)|2023. 06. 10.

Natural Language Processing Techniques|Citations: 9

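One-Line Summary

AutoTAMP translates natural language tasks into Signal Temporal Logic (STL) specifications, then uses a formal STL planner to solve the task and motion plan jointly, and applies autoregressive semantic error checking to improve plan validity.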
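
To make the role of an STL specification concrete, here is a minimal Python sketch that scores a timed-waypoint trajectory against one illustrative formula, "eventually reach the goal and always stay out of the obstacle". The formula, the `Box` regions, and the choice to evaluate only at the waypoints (no interpolation between them) are assumptions made for illustration; this is not the planner used in the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Waypoint = Tuple[float, float, float]  # (t, x, y): a timed 2D waypoint


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangular region (hypothetical helper type)."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float

    def margin(self, x: float, y: float) -> float:
        # Signed margin: positive inside the box, negative outside.
        return min(x - self.xmin, self.xmax - x, y - self.ymin, self.ymax - y)


def eventually_in(traj: List[Waypoint], box: Box, t0: float, t1: float) -> float:
    # F_[t0,t1](in box): the best margin over waypoints in the time window.
    vals = [box.margin(x, y) for t, x, y in traj if t0 <= t <= t1]
    return max(vals) if vals else float("-inf")


def always_outside(traj: List[Waypoint], box: Box, t0: float, t1: float) -> float:
    # G_[t0,t1](not in box): the worst negated margin over the time window.
    vals = [-box.margin(x, y) for t, x, y in traj if t0 <= t <= t1]
    return min(vals) if vals else float("inf")


def robustness(traj: List[Waypoint], goal: Box, obstacle: Box, horizon: float) -> float:
    # Conjunction of the two sub-formulas: take the minimum of their scores.
    return min(
        eventually_in(traj, goal, 0.0, horizon),
        always_outside(traj, obstacle, 0.0, horizon),
    )
```

A positive score means the sampled trajectory satisfies the formula with some slack, and a negative score means it violates it. A trajectory that cuts through the obstacle scores negative even if it reaches the goal, which is the kind of joint temporal and geometric constraint that a sub-goal decomposition can miss.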

ABSTRACT

For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks described by natural language. Recent advances in large language models (LLMs) have shown promise for translating natural language into robot action sequences for complex tasks. However, existing approaches either translate the natural language directly into robot trajectories or factor the inference process by decomposing language into task sub-goals and relying on a motion planner to execute each sub-goal. When complex environmental and temporal constraints are involved, inference over planning tasks must be performed jointly with motion plans using traditional task-and-motion planning (TAMP) algorithms, making factorization into subgoals untenable. Rather than using LLMs to directly plan task sub-goals, we instead perform few-shot translation from natural language task descriptions to an intermediate task representation that can then be consumed by a TAMP algorithm to jointly solve the task and motion plan. To improve translation, we automatically detect and correct both syntactic and semantic errors via autoregressive re-prompting, resulting in significant improvements in task completion. We show that our approach outperforms several methods using LLMs as planners in complex task domains. See our project website https://yongchao98.github.io/MIT-REALM-AutoTAMP/ for prompts, videos, and code.

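Research Motivation and Goals

Translate natural language task descriptions into formal task specifications that a TAMP solver can execute.
Solve the task and motion plan jointly rather than decomposing the task into sub-goals.
Improve translation quality through autoregressive semantic checking and syntax correction.
Evaluate robustness in complex 2D task domains with hard geometric and temporal constraints.
Provide datasets and code to support reproducibility and future research.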

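Proposed Method

Use few-shot in-context learning to translate natural language task descriptions into STL.
Use an STL-based multi-agent trajectory planner to generate feasible trajectories of timed waypoints.
Apply two re-prompting techniques: syntactic error correction driven by a syntax checker, and autoregressive semantic checking against the original instruction.
Iteratively refine the STL translation until it is syntactically and semantically consistent or a fixed iteration limit is reached.
Compare AutoTAMP against end-to-end LLM planning and LLM-based task planning baselines across several 2D domains.
Optionally compare against a fine-tuned NL2TL translation pipeline to assess data efficiency and performance.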
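
The translate, check, and re-prompt cycle listed above can be sketched as a small loop. This is a hedged illustration rather than the authors' implementation: `llm`, `syntax_error`, and `semantic_error` are hypothetical callables supplied by the caller, the prompt wording is invented, and how the semantic check is actually carried out is deliberately left abstract.

```python
from typing import Callable, Optional, Tuple


def translate_with_reprompting(
    instruction: str,
    llm: Callable[[str], str],                            # hypothetical: prompt -> STL string
    syntax_error: Callable[[str], Optional[str]],         # hypothetical: STL -> error or None
    semantic_error: Callable[[str, str], Optional[str]],  # hypothetical: (instruction, STL) -> error or None
    max_iters: int = 5,                                   # assumed iteration limit
) -> Tuple[Optional[str], int]:
    """Return (accepted STL or None, number of LLM calls made)."""
    base_prompt = f"Translate to STL: {instruction}"
    prompt = base_prompt
    for attempt in range(1, max_iters + 1):
        stl = llm(prompt)
        # Check syntax first; only a well-formed formula gets a semantic check.
        error = syntax_error(stl)
        if error is None:
            error = semantic_error(instruction, stl)
        if error is None:
            return stl, attempt
        # Feed the failed attempt and the detected problem back to the model.
        prompt = (
            f"{base_prompt}\n"
            f"Previous attempt: {stl}\n"
            f"Detected problem: {error}\n"
            "Please correct the translation."
        )
    return None, max_iters
```

Passing the checkers in as callables keeps the loop independent of any particular LLM client or STL parser, so a stubbed model is enough to exercise the control flow. Returning `None` at the iteration limit makes a failed translation explicit instead of silently handing an unchecked formula to the planner.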

Figure 1: Illustration of different approaches applying LLMs for task and motion planning; our work contributes the LLM-As-Translator & Checker approach. Each approach accepts a natural language instruction and environment state as input and outputs a robot trajectory.

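Experimental Results

Research Questions

RQ1: Does translating NL tasks into STL and solving them with an STL planner outperform end-to-end LLM planning on complex TAMP tasks?
RQ2: How do syntactic and semantic re-prompting affect translation quality and task success rate?
RQ3: Does the AutoTAMP approach generalize to single-agent and multi-agent tasks with temporal and geometric constraints?
RQ4: How does AutoTAMP compare with NL-to-logic translators such as NL2TL in performance and data efficiency?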

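Key Results

AutoTAMP with syntactic and semantic re-prompting substantially improves task success rate over translation without correction.
On single-agent 2D tasks with hard temporal or geometric constraints, AutoTAMP outperforms end-to-end LLM planning and naive task planning in many scenarios.
GPT-4-based translation generally outperforms GPT-3-based translation.
Autoregressive semantic-check re-prompting provides a sizable benefit over the syntax-correction-only and no-correction baselines.
Ablation experiments show that NL-to-STL translation with re-prompting can approach the performance of the fine-tuned NL2TL pipeline without additional training data.
The experiments include 2D and 3D simulations and a physical demonstration on a real robot, supporting practical applicability.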

Figure 2: GPT-4 failure case for direct end-to-end trajectory planning. The orange line shows the correct path obeying the instruction. The purple and gray dashed lines show the trajectories from GPT-4 after first and second prompts, respectively. GPT-4 generates a list of $(x,y)$ locations with ass

This review was generated by AI and reviewed by a human editor.