QUICK REVIEW

[Paper Review] Small Shifts, Large Gains: Unlocking Traditional TSP Heuristic Guided-Sampling via Unsupervised Neural Instance Modification

Wei Min Huang, Hanchen Wang|arXiv (Cornell University)|2026. 01. 31.

Vehicle Routing Optimization Methods|Citations: 0

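One-Line Summary

The paper introduces TSP-MDF, an unsupervised neural instance modification framework that gives existing deterministic TSP heuristics guided-sampling capability, reaching solution quality comparable to neural tour constructors with very little training.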

ABSTRACT

The Traveling Salesman Problem (TSP) is one of the most representative NP-hard problems in route planning and a long-standing benchmark in combinatorial optimization. Traditional heuristic tour constructors, such as Farthest or Nearest Insertion, are computationally efficient and highly practical, but their deterministic behavior limits exploration and often leads to local optima. In contrast, neural-based heuristic tour constructors alleviate this issue through guided-sampling and typically achieve superior solution quality, but at the cost of extensive training and reliance on ground-truth supervision, hindering their practical use. To bridge this gap, we propose TSP-MDF, a novel instance modification framework that equips traditional deterministic heuristic tour constructors with guided-sampling capability. Specifically, TSP-MDF introduces a neural-based instance modifier that strategically shifts node coordinates to sample multiple modified instances, on which the base traditional heuristic tour constructor constructs tours that are mapped back to the original instance, allowing traditional tour constructors to explore higher-quality tours and escape local optima. At the same time, benefiting from our instance modification formulation, the neural-based instance modifier can be trained efficiently without any ground-truth supervision, ensuring the framework maintains practicality. Extensive experiments on large-scale TSP benchmarks and real-world benchmarks demonstrate that TSP-MDF significantly improves the performance of traditional heuristic tour constructors, achieving solution quality comparable to neural-based heuristic tour constructors, but with an extremely short training time.
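The modify, construct, and map-back loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names are hypothetical, farthest insertion stands in for the base heuristic, and plain Gaussian noise (with an arbitrary `sigma`) replaces the paper's learned neural instance modifier. The point is only that each tour is built on shifted coordinates but scored on the original ones.

```python
import numpy as np

def tour_length(coords, tour):
    """Length of the closed tour `tour` measured on `coords`."""
    ordered = coords[tour]
    return float(np.linalg.norm(ordered - np.roll(ordered, -1, axis=0), axis=1).sum())

def farthest_insertion(coords):
    """Deterministic farthest-insertion constructor (assumes at least 2 nodes)."""
    n = len(coords)
    dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    tour = [0, int(dist[0].argmax())]
    in_tour = np.zeros(n, dtype=bool)
    in_tour[tour] = True
    # min_d[v] = distance from v to its closest node already in the tour
    min_d = np.minimum(dist[tour[0]], dist[tour[1]])
    while len(tour) < n:
        # pick the remaining node that is farthest from the current tour
        v = int(np.where(in_tour, -np.inf, min_d).argmax())
        a = np.array(tour)
        b = np.roll(a, -1)
        # extra length from inserting v between each consecutive pair (a[i], b[i])
        delta = dist[a, v] + dist[v, b] - dist[a, b]
        tour.insert(int(delta.argmin()) + 1, v)
        in_tour[v] = True
        min_d = np.minimum(min_d, dist[v])
    return np.array(tour)

def guided_sampling(coords, n_samples=32, sigma=0.02, seed=0):
    """Run the heuristic on shifted copies of the instance; keep the best tour.

    ASSUMPTION: random Gaussian shifts stand in for the learned modifier.
    """
    rng = np.random.default_rng(seed)
    best_tour = farthest_insertion(coords)      # unmodified instance as baseline
    best_len = tour_length(coords, best_tour)
    for _ in range(n_samples):
        modified = coords + rng.normal(0.0, sigma, size=coords.shape)
        tour = farthest_insertion(modified)     # construct on the modified instance
        length = tour_length(coords, tour)      # map back: score on the original
        if length < best_len:
            best_tour, best_len = tour, length
    return best_tour, best_len
```

Because the unmodified instance is kept as a candidate, the returned tour is never longer than the plain heuristic's tour on the same instance. A tour is just a node permutation, so mapping it back to the original instance needs no extra step beyond re-measuring its length.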

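Motivation and Objectives

Addresses the limited exploration of traditional deterministic TSP heuristics and their tendency to get stuck in local optima.
Presents a framework that augments a base heuristic with guided sampling through neural instance modification.
Enables training without ground-truth supervision by relying on unsupervised learning and self-imitation.
Shows that the approach bridges the performance gap between traditional heuristics and neural methods while remaining practical.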

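Proposed Method

TSP-MDF adds a preprocessing step: before the existing heuristic is applied, a neural instance modifier samples modified TSP instances.
Node coordinate offsets are modeled as a discretized multi-scale categorical distribution, which makes sampling computationally tractable.
The instance modifier is trained without supervision, in an autoregressive manner, using REINFORCE and optionally self-imitation to guide modifications toward shorter tours.
A greedy iterative refinement generates new modifications from the best modified instance found so far, enabling both parallel and sequential exploration.
An optional self-imitation learning component uses the best modified instance as a pseudo-expert to stabilize early training and speed up convergence.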
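To make the discretized offset distribution and the REINFORCE update more concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the bin values, the two step sizes, the names `BINS`, `SCALES`, `sample_offsets`, and `reinforce_grad`, and the mean-cost baseline are all assumptions, and the per-node draws are independent here rather than autoregressive. Each node's x and y shift is the sum of one categorical draw per scale, and the gradient is the standard score-function estimate for a softmax.

```python
import numpy as np

# ASSUMED discretization: K = 5 signed bins per scale, two step sizes.
BINS = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
SCALES = np.array([0.05, 0.01])   # coarse and fine steps (illustrative values)

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sample_offsets(logits, rng):
    """Sample coordinate offsets from per-(node, axis, scale) categoricals.

    logits: array of shape (n, 2, S, K).
    Returns bin indices (n, 2, S), offsets (n, 2), and the total log-probability.
    """
    probs = softmax(logits)
    # inverse-CDF sampling: one categorical draw per (node, axis, scale)
    u = rng.random(probs.shape[:-1] + (1,))
    idx = (probs.cumsum(axis=-1) < u).sum(axis=-1)
    idx = np.minimum(idx, probs.shape[-1] - 1)   # guard against rounding error
    # the offset is the sum of the chosen bin at each scale
    offsets = (BINS[idx] * SCALES).sum(axis=-1)
    log_prob = np.log(np.take_along_axis(probs, idx[..., None], axis=-1)).sum()
    return idx, offsets, float(log_prob)

def reinforce_grad(logits, idx_batch, costs):
    """Score-function estimate of d E[cost] / d logits with a mean baseline.

    idx_batch: sampled bin indices, one array per sample.
    costs: tour lengths of those samples, measured on the original instance.
    """
    probs = softmax(logits)
    adv = np.asarray(costs, dtype=float) - np.mean(costs)
    grad = np.zeros_like(logits)
    for idx, a in zip(idx_batch, adv):
        one_hot = np.eye(logits.shape[-1])[idx]   # shape (n, 2, S, K)
        grad += a * (one_hot - probs)             # d log p / d logits for a softmax
    return grad / len(costs)
```

A gradient-descent step `logits -= lr * reinforce_grad(...)` lowers the probability of bins that led to longer-than-average tours. In the same spirit, a self-imitation term could add `one_hot - probs` for the best sample found so far, treating it as a pseudo-expert target; that variant is likewise only a guess at how the paper's component might look.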

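Experimental Results

Research Questions

RQ1: Can traditional deterministic TSP heuristics be strengthened with guided sampling by modifying the input instance rather than redesigning the heuristic?
RQ2: Can an unsupervised neural instance modifier sample modified instances effectively enough that the base heuristic produces shorter tours on them?
RQ3: Do discretizing the coordinate offsets and applying self-imitation improve training efficiency and search quality?
RQ4: Does parallel and sequential guided sampling via instance modification reach performance comparable to neural tour constructors with a short training time?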

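Key Results

TSP-MDF significantly improves the performance of traditional deterministic heuristics on large-scale and real-world TSP benchmarks.
The framework achieves solution quality comparable to neural heuristics while requiring very little training time and no ground-truth supervision.
Discretizing the coordinate offsets and combining self-imitation with reinforcement learning stabilize sampling and accelerate convergence.
The instance modification preprocessing step enables effective guided sampling without redesigning the base heuristic.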


This review was generated by AI and checked by a human editor.