QUICK REVIEW

[Paper Review] Small Shifts, Large Gains: Unlocking Traditional TSP Heuristic Guided-Sampling via Unsupervised Neural Instance Modification

Wei Min Huang, Hanchen Wang|arXiv (Cornell University)|Jan 31, 2026

Vehicle Routing Optimization Methods | Cited by: 0

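One-Sentence Summary

The paper proposes TSP-MDF, an instance modification framework in which an unsupervised neural instance modifier gives traditional deterministic TSP heuristics a guided-sampling capability, reaching solution quality close to that of neural-based tour constructors with very little training.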

ABSTRACT

The Traveling Salesman Problem (TSP) is one of the most representative NP-hard problems in route planning and a long-standing benchmark in combinatorial optimization. Traditional heuristic tour constructors, such as Farthest or Nearest Insertion, are computationally efficient and highly practical, but their deterministic behavior limits exploration and often leads to local optima. In contrast, neural-based heuristic tour constructors alleviate this issue through guided-sampling and typically achieve superior solution quality, but at the cost of extensive training and reliance on ground-truth supervision, hindering their practical use. To bridge this gap, we propose TSP-MDF, a novel instance modification framework that equips traditional deterministic heuristic tour constructors with guided-sampling capability. Specifically, TSP-MDF introduces a neural-based instance modifier that strategically shifts node coordinates to sample multiple modified instances, on which the base traditional heuristic tour constructor constructs tours that are mapped back to the original instance, allowing traditional tour constructors to explore higher-quality tours and escape local optima. At the same time, benefiting from our instance modification formulation, the neural-based instance modifier can be trained efficiently without any ground-truth supervision, ensuring the framework maintains practicality. Extensive experiments on large-scale TSP benchmarks and real-world benchmarks demonstrate that TSP-MDF significantly improves the performance of traditional heuristic tour constructors, achieving solution quality comparable to neural-based heuristic tour constructors, but with an extremely short training time.

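Motivation and Objectives

Identify and address the limited exploration of traditional deterministic TSP heuristics and their tendency to get stuck in local optima.
Propose a framework that equips base heuristics with guided sampling through neural instance modification.
Train the modifier without ground-truth supervision, using unsupervised learning and self-imitation.
Show that the method narrows the performance gap between traditional heuristics and neural-based methods while remaining practical.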

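Proposed Method

Introduce TSP-MDF, which adds a preprocessing stage: a neural-based instance modifier samples modified TSP instances, the traditional heuristic is applied to each of them, and the resulting tours are mapped back to the original instance.
Model the coordinate shifts used to modify nodes as a discretized, multi-scale categorical distribution, which keeps sampling controllable.
Train the instance modifier in an unsupervised, autoregressive manner with REINFORCE (optionally combined with self-imitation), steering modifications toward shorter tours.
Add greedy iterative improvement, which keeps generating new modifications on top of the best modified instance found so far, so that guided sampling runs both in parallel and sequentially.
Provide an optional self-imitation learning component that treats the best modification found as a pseudo-expert, stabilizing early training and accelerating convergence.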
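The modify, solve, and map-back loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: shifts are drawn from a uniform categorical distribution that stands in for the trained neural modifier, "mapping back" is taken to mean evaluating the node order on the original coordinates, and the names `SCALES`, `DIRECTIONS`, `sample_shift`, `modify_instance`, and `guided_sampling` are hypothetical.

```python
import math
import random

SCALES = (0.0, 0.01, 0.05)   # hypothetical shift magnitudes (multi-scale)
DIRECTIONS = 8               # hypothetical number of discrete shift directions

def tour_length(coords, tour):
    """Length of the closed tour `tour` measured on `coords`."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))

def nearest_insertion(coords):
    """Deterministic nearest-insertion tour constructor (the base heuristic)."""
    n = len(coords)
    if n < 3:
        return list(range(n))
    d = lambda a, b: math.dist(coords[a], coords[b])
    tour = [0, min(range(1, n), key=lambda j: d(0, j))]
    remaining = [j for j in range(n) if j not in tour]
    while remaining:
        # Pick the unvisited node closest to any node already in the tour.
        node = min(remaining, key=lambda j: min(d(j, t) for t in tour))
        def added_cost(i):
            a, b = tour[i], tour[(i + 1) % len(tour)]
            return d(a, node) + d(node, b) - d(a, b)
        # Insert it into the edge where it lengthens the tour the least.
        pos = min(range(len(tour)), key=added_cost)
        tour.insert(pos + 1, node)
        remaining.remove(node)
    return tour

def sample_shift(rng, weights=None):
    """Draw one discrete (scale, direction) shift from a categorical distribution.
    ASSUMPTION: uniform weights stand in for a trained modifier's output."""
    options = [(s, k) for s in SCALES for k in range(DIRECTIONS)]
    scale, k = rng.choices(options, weights=weights, k=1)[0]
    angle = 2 * math.pi * k / DIRECTIONS
    return scale * math.cos(angle), scale * math.sin(angle)

def modify_instance(coords, rng):
    """Return a copy of the instance with every node shifted independently."""
    shifted = []
    for x, y in coords:
        dx, dy = sample_shift(rng)
        shifted.append((x + dx, y + dy))
    return shifted

def guided_sampling(coords, heuristic, rounds=5, samples=16, seed=0):
    """Sample modified instances, solve each, and score tours on the ORIGINAL coords."""
    rng = random.Random(seed)
    best_tour = heuristic(coords)            # baseline on the unmodified instance
    best_len = tour_length(coords, best_tour)
    base = coords                            # instance that new modifications start from
    for _ in range(rounds):                  # sequential refinement
        improved = None
        for _ in range(samples):             # independent samples (parallelizable)
            candidate = modify_instance(base, rng)
            tour = heuristic(candidate)
            length = tour_length(coords, tour)   # map back: original coordinates
            if length < best_len:
                best_tour, best_len, improved = tour, length, candidate
        if improved is not None:
            base = improved                  # greedy: continue from the best modification
    return best_tour, best_len
```

Because the search starts from the heuristic's tour on the unmodified instance and only accepts strict improvements, the returned tour is never longer than the baseline. The heuristic is passed in as a plain callable, which reflects the idea that the base constructor is used as-is rather than redesigned.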

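Experimental Results

Research Questions

RQ1: Can traditional deterministic TSP heuristics gain guided sampling by modifying the input instance rather than redesigning the heuristic?
RQ2: Can an unsupervised neural-based instance modifier effectively sample modified instances on which the base heuristic produces shorter tours?
RQ3: Do discretized coordinate shifts and self-imitation improve training efficiency and exploration quality?
RQ4: Can parallel and sequential guided sampling through instance modification match the performance of neural-based tour constructors with a short training time?

Key Findings

TSP-MDF significantly improves the performance of traditional deterministic heuristics on large-scale and real-world TSP benchmarks.
The framework reaches solution quality comparable to neural-based heuristics with an extremely short training time and no ground-truth supervision.
Discretized coordinate shifts, combined with a self-imitation-enhanced training strategy, stabilize sampling and accelerate convergence.
The instance modification preprocessing stage delivers effective guided sampling without redesigning the base heuristic.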


This review was generated by AI and reviewed by human editors.