QUICK REVIEW

[Paper Review] UPath: Universal Planner Across Topological Heterogeneity For Grid-Based Pathfinding

Aleksandr Ananikian, Daniil Drozdov | arXiv (Cornell University) | 2026-02-27

Robotic Path Planning Algorithms | Citations: 0

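One-line summary

UPath learns a universal correction-factor heuristic for A* that generalizes across diverse grid topologies, achieving up to 2.2x fewer expansions on unseen task distributions while keeping solution cost within about 3% of optimal.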

ABSTRACT

The performance of search algorithms for grid-based pathfinding, e.g. A*, critically depends on the heuristic function that is used to focus the search. Recent studies have shown that informed heuristics that take the positions/shapes of the obstacles into account can be approximated with deep neural networks. Unfortunately, the existing learning-based approaches mostly rely on the assumption that training and test grid maps are drawn from the same distribution (e.g., city maps, indoor maps, etc.) and perform poorly on out-of-distribution tasks. This naturally limits their application in practice, where a universal solver capable of efficiently handling any problem instance is often needed. In this work, we close this gap by designing a universal heuristic predictor: a model trained once, but capable of generalizing across a full spectrum of unseen tasks. Our extensive empirical evaluation shows that the suggested approach reduces the computational effort of A* by up to a factor of 2.2, while still providing solutions within 3% of the optimal cost on average, on tasks that are completely different from the ones used for training, a milestone reached for the first time by a learnable solver.

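Research motivation and goals

Argue for instance-aware heuristics as a way to reduce search expansions in grid-based pathfinding.
Propose a universal heuristic predictor that is trained once and generalizes to out-of-distribution maps.
Integrate the correction-factor heuristic into a standard A* planner without changing the search procedure.
Evaluate generalization on a diverse, topologically rich benchmark (UPF) and compare against baselines.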
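To make the "search procedure unchanged" goal concrete, here is a minimal sketch of a textbook A* on an 8-connected grid in which the heuristic is just a function argument. It is not the authors' implementation: the move set, the step costs (1 and sqrt(2)), the no-reopening closed set, and the names `astar`, `octile`, and `MOVES` are all assumptions made for illustration.

```python
import heapq
import math

SQRT2 = math.sqrt(2.0)
# Assumed 8-connected move set: (d_row, d_col, step_cost).
MOVES = [(-1, 0, 1.0), (1, 0, 1.0), (0, -1, 1.0), (0, 1, 1.0),
         (-1, -1, SQRT2), (-1, 1, SQRT2), (1, -1, SQRT2), (1, 1, SQRT2)]


def octile(cell, goal):
    """Octile distance: obstacle-free path cost on an 8-connected grid."""
    dr, dc = abs(cell[0] - goal[0]), abs(cell[1] - goal[1])
    return max(dr, dc) + (SQRT2 - 1.0) * min(dr, dc)


def astar(grid, start, goal, heuristic):
    """Textbook A* (0 = free, 1 = obstacle). Returns (path_cost, expansions).

    The heuristic is a plain callable, so a learned heuristic can replace
    octile without touching the loop. Nodes are never reopened, so an
    inconsistent heuristic may yield a suboptimal path.
    """
    rows, cols = len(grid), len(grid[0])
    g = {start: 0.0}
    open_heap = [(heuristic(start, goal), start)]
    closed = set()
    expansions = 0
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry
        if node == goal:
            return g[node], expansions
        closed.add(node)
        expansions += 1
        for dr, dc, w in MOVES:
            nxt = (node[0] + dr, node[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] != 0 or nxt in closed:
                continue
            new_g = g[node] + w
            if new_g < g.get(nxt, math.inf):
                g[nxt] = new_g
                heapq.heappush(open_heap, (new_g + heuristic(nxt, goal), nxt))
    return math.inf, expansions


# Usage: same planner, two heuristics. The second is a stand-in for a learned
# one (octile divided by a constant 0.8), not a trained model.
grid = [[0, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
baseline = astar(grid, (0, 0), (0, 2), octile)
stand_in = astar(grid, (0, 0), (0, 2), lambda n, t: octile(n, t) / 0.8)
```

Because only the callable changes, any comparison of expansion counts between the two runs isolates the effect of the heuristic itself.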

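Proposed method

Define the correction factor cf*(n) = h_oct(n) / h*(n), where h_oct is the octile heuristic and h* is the true cost-to-go obtained from a goal-rooted Dijkstra pass.
Predict a dense cf(n) map with an encoder-transformer-decoder network with long skip connections; train it to regress cf*(n) on non-obstacle, non-goal cells using a masked regression loss.
Convert the predicted cf into a usable heuristic for guiding A*: h_hat(n) = h_oct(n) / max(cf_hat(n), epsilon).
Train three models on simple procedural priors (Uniform, Beta, Beta-Figures) to encourage generalization and avoid overfitting to a single topology.
Generate UPF, an evaluation set of 20,000 tasks spanning 10 topology types, to test cross-domain generalization and robustness.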
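The two formulas above can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the paper's code: the grid is taken to be 8-connected with step costs 1 and sqrt(2), diagonal moves past obstacle corners are allowed, and the function names (`true_cost_to_go`, `correction_factor_targets`, `corrected_heuristic`) and the default `eps` value are hypothetical.

```python
import heapq
import math

SQRT2 = math.sqrt(2.0)
# Assumed 8-connected move set: (d_row, d_col, step_cost).
MOVES = [(-1, 0, 1.0), (1, 0, 1.0), (0, -1, 1.0), (0, 1, 1.0),
         (-1, -1, SQRT2), (-1, 1, SQRT2), (1, -1, SQRT2), (1, 1, SQRT2)]


def octile(cell, goal):
    """h_oct: obstacle-free path cost on an 8-connected grid."""
    dr, dc = abs(cell[0] - goal[0]), abs(cell[1] - goal[1])
    return max(dr, dc) + (SQRT2 - 1.0) * min(dr, dc)


def true_cost_to_go(grid, goal):
    """Dijkstra pass rooted at the goal; h*(n) for every reachable free cell."""
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0.0}
    heap = [(0.0, goal)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale heap entry
        for dr, dc, w in MOVES:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + w
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist


def correction_factor_targets(grid, goal):
    """cf*(n) = h_oct(n) / h*(n) on free, reachable, non-goal cells.

    Obstacles, unreachable cells, and the goal are simply absent from the
    result, which plays the role of the loss mask described in the review.
    """
    h_star = true_cost_to_go(grid, goal)
    return {n: octile(n, goal) / h for n, h in h_star.items() if n != goal}


def corrected_heuristic(cell, goal, cf_hat, eps=1e-3):
    """h_hat(n) = h_oct(n) / max(cf_hat(n), eps); eps guards against division by ~0."""
    return octile(cell, goal) / max(cf_hat, eps)


# Usage on a small map (0 = free, 1 = obstacle) with the goal in the top-right.
grid = [[0, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
goal = (0, 2)
targets = correction_factor_targets(grid, goal)
h_hat = corrected_heuristic((0, 0), goal, targets[(0, 0)])
```

Since the octile distance never exceeds the true cost on such a grid, every target lies in (0, 1], and feeding the exact cf*(n) back into `corrected_heuristic` recovers h*(n). A predicted cf_hat that is too small inflates h_hat above h*, which is one way the resulting paths can end up slightly suboptimal.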

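Experimental results

Research questions

RQ1: Can a single neural heuristic generalize across diverse, unseen grid topologies and distributions?
RQ2: How does the generalized predictor compare with classical A* and weighted A*, and with state-of-the-art learned planners under distribution shift?
RQ3: Does it remain efficient and near-optimal on larger grids (e.g., 128x128)?
RQ4: Which design choices (e.g., skip connections, loss masking) drive robust generalization?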

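Key results

UPath reduces search expansions by up to 2.2x relative to standard A* on UPF tasks.
The resulting paths stay close to optimal, with cost increasing by about 3% over the optimum on average.
The Beta+Fig variant offers the best overall trade-off: 72.63% of its solutions are optimal, with cost at 101.1% ± 4.1 and expansions at 47.4% ± 27.7.
The Beta variant achieves the lowest expansions, but its cost is higher at 105.1% ± 16.2 and only 55.24% of its solutions are optimal.
WA* with large weights reduces expansions, but at the price of degraded optimality and higher cost; TransPath is sensitive to the UPF evaluation distribution and performs worse.
On the topologically diverse UPF benchmark, the UPath models outperform the two basic WA* baselines and TransPath, showing robust generalization.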
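The review does not spell out how these percentages are defined. A plausible reading, assumed here, is that cost is reported as a percentage of the optimal cost, expansions as a percentage of plain A* expansions, and both are averaged per task with a standard deviation. The sketch below computes the three quantities under that assumption; the function name `summarize`, the record keys, and the sample numbers are made up for illustration and are not results from the paper.

```python
from statistics import mean, pstdev


def summarize(runs, tol=1e-6):
    """Aggregate per-task results into the three reported quantities.

    Each run is a dict with the planner's path cost and expansions plus the
    reference values from optimal A* on the same task (assumed layout).
    """
    cost_pct = [100.0 * r["cost"] / r["optimal_cost"] for r in runs]
    exp_pct = [100.0 * r["expansions"] / r["astar_expansions"] for r in runs]
    n_optimal = sum(abs(r["cost"] - r["optimal_cost"]) <= tol for r in runs)
    return {
        "optimal_found_pct": 100.0 * n_optimal / len(runs),
        "cost_pct_mean": mean(cost_pct),
        "cost_pct_std": pstdev(cost_pct),
        "expansions_pct_mean": mean(exp_pct),
        "expansions_pct_std": pstdev(exp_pct),
    }


# Usage with two invented tasks (not data from the paper).
runs = [
    {"cost": 10.0, "optimal_cost": 10.0, "expansions": 50, "astar_expansions": 100},
    {"cost": 11.0, "optimal_cost": 10.0, "expansions": 30, "astar_expansions": 100},
]
summary = summarize(runs)
```

Under this reading, an expansions figure of 47.4% corresponds to roughly a 2.1x average reduction, while the "up to 2.2x" headline would refer to the best case across settings.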


This review was generated by AI and reviewed by a human editor.