QUICK REVIEW

[Paper Review] From Scanning Guidelines to Action: A Robotic Ultrasound Agent with LLM-Based Reasoning

Yuan Bi, Yiping Zhou|arXiv (Cornell University)|2026. 03. 15.

Soft Robotics and Applications | Citations: 0

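One-Line Summary

An RL-fine-tuned LLM acts as a guideline-driven high-level planner that autonomously controls robotic ultrasound scanning through a set of tools. It is demonstrated in verbal guideline-execution tests and in real-world gallbladder, spine, and kidney scans.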

ABSTRACT

Robotic ultrasound offers advantages over free-hand scanning, including improved reproducibility and reduced operator dependency. In clinical practice, US acquisition relies heavily on the sonographer's experience and situational judgment. When transferring this process to robotic systems, such expertise is often encoded explicitly through fixed procedures and task-specific models, yielding pipelines that can be difficult to adapt to new scanning tasks. In this work, we propose a unified framework for autonomous robotic US scanning that leverages an LLM-based agent to interpret US scanning guidelines and execute scans by dynamically invoking a set of provided software tools. Instead of encoding fixed scanning procedures, the LLM agent retrieves and reasons over guideline steps from scanning handbooks and adapts its planning decisions based on observations and the current scanning state. This enables the system to handle variable and decision-dependent workflows, such as adjusting scanning strategies, repeating steps, or selecting the appropriate next tool call in response to image quality or anatomical findings. Because the reasoning underlying tool selection is also critical for transparent and trustworthy planning, we further fine-tune the LLM agent using an RL-based strategy to improve both its reasoning quality and the correctness of tool selection and parameterization, while maintaining robust generalization to unseen guidelines and related tasks. We first validate the approach via verbal execution on 10 US scanning guidelines, assessing reasoning as well as tool selection and parameterization, and showing the benefit of RL fine-tuning. We then demonstrate real-world feasibility on robotic scanning of the gallbladder, spine, and kidney. Overall, the framework follows diverse guidelines and enables reliable autonomous scanning across multiple anatomical targets within a unified system.

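Motivation and Objectives

Advance autonomous robotic ultrasound that uses guideline-based planning to reduce operator dependency and improve reproducibility.
Build an agent that retrieves scanning steps from handbooks and decides which tool to use based on current observations.
Enable flexible, non-fixed workflows that repeat steps, adjust strategies, and adapt probe motion.
Strengthen the reliability and generalization of a small model by fine-tuning it with RL for better reasoning and tool selection.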

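Proposed Method

Use the LLM as a high-level planner that retrieves guideline steps through retrieval-augmented generation (RAG).
Express each decision as a <think> reasoning trace followed by a <tool> JSON call, which connects the planner to a toolkit of perception and robot-control utilities.
Implement the tools in three categories: Trajectory Planning, Robot Execution, and Quality/Patient Interaction (contact adjustment and voice guidance).
Fine-tune a small model with supervised learning on reference traces, then apply Proximal Policy Optimization (PPO) with a custom reward to improve tool-call accuracy and reasoning.
Validate on held-out guidelines, measuring tool-use accuracy and task success, then perform end-to-end validation in real robotic scans of the gallbladder, spine, and kidney.
Introduce a reward structure that penalizes incorrect tool names and rewards correct tools and parameters, supporting robust generalization to unseen guidelines.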
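The <think>/<tool> output format and the tool-call reward described above can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the closing tags, the JSON fields `name` and `arguments`, the tool names in `KNOWN_TOOLS`, and the reward values (-1.0, 0.5, 1.0). The paper's actual reward also targets reasoning quality, which this sketch does not model.

```python
import json
import re

# Hypothetical tool names, chosen only for illustration.
KNOWN_TOOLS = {"plan_trajectory", "execute_scan", "adjust_contact"}

# Assumed layout: <think>reasoning</think> followed by <tool>{JSON}</tool>.
_PATTERN = re.compile(r"<think>(.*?)</think>\s*<tool>(.*?)</tool>", re.DOTALL)


def parse_agent_output(text):
    """Return (reasoning, tool_call) or None if the reply is malformed."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    try:
        call = json.loads(match.group(2))
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict) or "name" not in call:
        return None
    return match.group(1).strip(), call


def tool_call_reward(text, reference):
    """Toy reward: penalize malformed output or a wrong tool name,
    give partial credit for the right tool, full credit if the
    arguments also match the reference call."""
    parsed = parse_agent_output(text)
    if parsed is None:
        return -1.0
    _, call = parsed
    if call["name"] not in KNOWN_TOOLS or call["name"] != reference["name"]:
        return -1.0
    reward = 0.5  # correct tool selected
    if call.get("arguments", {}) == reference.get("arguments", {}):
        reward += 0.5  # correct parameterization
    return reward


example_reply = (
    "<think>The gallbladder is not yet visible, so a new sweep is needed.</think>"
    '<tool>{"name": "plan_trajectory", "arguments": {"target": "gallbladder"}}</tool>'
)
example_reference = {"name": "plan_trajectory", "arguments": {"target": "gallbladder"}}
print(tool_call_reward(example_reply, example_reference))  # prints 1.0
```

Separating parsing from scoring keeps format errors and wrong tool choices distinguishable, which is one plausible way to shape such a reward; the paper may weight these cases differently.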

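Experimental Results

Research Questions

RQ1: Can an LLM-driven agent interpret diverse ultrasound scanning guidelines and generate executable tool calls for autonomous robotic US?
RQ2: Does RL fine-tuning improve the consistency and accuracy of tool selection and parameterization relative to the base model?
RQ3: Is end-to-end autonomous scanning robustly feasible across multiple anatomical targets (gallbladder, spine, kidney)?
RQ4: How well does guideline-based planning transfer to real robotic ultrasound, as reflected in success rate and safety?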

Key Results

Model	Step-wise accuracy	Overall success rate
Base Model	0.6512	0.5384
After Fine Tuning	0.8973	0.9230

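RL fine-tuning raises step-wise tool-use accuracy from 0.6512 to 0.8973.
RL fine-tuning raises the overall scanning-task success rate from 0.5384 to 0.9230.
In the verbal-execution experiments, RL fine-tuning yields more consistent and task-relevant reasoning traces.
In the real-robot experiments on volunteers, the spine and kidney scans were completed successfully, while the gallbladder scan failed once owing to the complexity of the procedure.
The system demonstrates reliable autonomous scanning of the gallbladder, spine, and kidney within a single unified pipeline.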
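To put the two reported improvements on a common footing, the short Python sketch below computes the absolute and relative gains from the table values. The helper name `gains` and the dictionary keys are illustrative; only the four numbers come from the review.

```python
# Values copied from the results table: (base model, after fine-tuning).
results = {
    "step_accuracy": (0.6512, 0.8973),
    "overall_success": (0.5384, 0.9230),
}


def gains(before, after):
    """Return (absolute gain, relative gain with respect to the base value)."""
    absolute = after - before
    relative = absolute / before
    return absolute, relative


for metric, (before, after) in results.items():
    absolute, relative = gains(before, after)
    print(f"{metric}: +{absolute:.4f} absolute, +{relative:.1%} relative")
```

With these inputs, step-wise accuracy rises by about 0.25 (roughly 38% relative) and overall success by about 0.38 (roughly 71% relative), so the gain is noticeably larger at the whole-task level than at the single-step level.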

This review was generated by AI and reviewed by a human editor.