QUICK REVIEW

[Paper Review] Efficient Active Search for Combinatorial Optimization Problems

André Hottung, Yeong-Dae Kwon | arXiv (Cornell University) | Jun 9, 2021

Metaheuristic Optimization Algorithms Research | References: 28 | Cited by: 27

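One-Sentence Summary

The paper proposes three efficient active search (EAS) strategies that update only a subset of model parameters during search. They improve the performance of ML-based construction methods on TSP, CVRP, and JSSP, often outperforming state-of-the-art ML methods and sometimes even LKH3.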

ABSTRACT

Recently numerous machine learning based methods for combinatorial optimization problems have been proposed that learn to construct solutions in a sequential decision process via reinforcement learning. While these methods can be easily combined with search strategies like sampling and beam search, it is not straightforward to integrate them into a high-level search procedure offering strong search guidance. Bello et al. (2016) propose active search, which adjusts the weights of a (trained) model with respect to a single instance at test time using reinforcement learning. While active search is simple to implement, it is not competitive with state-of-the-art methods because adjusting all model weights for each test instance is very time and memory intensive. Instead of updating all model weights, we propose and evaluate three efficient active search strategies that only update a subset of parameters during the search. The proposed methods offer a simple way to significantly improve the search performance of a given model and outperform state-of-the-art machine learning based methods on combinatorial problems, even surpassing the well-known heuristic solver LKH3 on the capacitated vehicle routing problem. Finally, we show that (efficient) active search enables learned models to effectively solve instances that are much larger than those seen during training.

Motivation and Objectives

Address the high time and memory cost of Bello et al.'s (2016) active search by proposing efficient alternatives.
Develop three strategies to update only a subset of model parameters during test-time search.
Demonstrate that EAS variants improve solution quality and generalization across multiple combinatorial optimization problems.
Show that EAS can outperform state-of-the-art ML-based methods and, on CVRP, even the strong heuristic solver LKH3.

Proposed Method

Define three EAS variants: Embedding updates (EAS-Emb), Added-layer updates (EAS-Lay), and Tabular updates (EAS-Tab).
Each variant updates a small, instance-specific component while keeping the rest of the model fixed during search.
Use RL and imitation learning losses to guide updates: L_RL based on REINFORCE and L_IL from imitation of incumbent best solutions; combine as L_RIL = L_RL + λ L_IL.
For EAS-Emb: update a subset of instance embeddings with gradients; for EAS-Lay: insert an instance-specific residual layer and train its weights; for EAS-Tab: adjust a look-up table influencing action probabilities without backpropagation.
Evaluate on TSP (POMO-based), CVRP (POMO-based), and JSSP (L2D-based); compare against Concorde, LKH3, and several ML baselines.
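
The idea of searching by updating only a small, instance-specific component can be illustrated with a toy sketch. This is not the authors' implementation: the frozen "model" is a stand-in that scores edges by negative length, the trainable component is a hypothetical per-edge `bias` table, and the names `rollout`, `efficient_active_search`, `lr`, and `lam` (for λ) are assumptions made for illustration. Because the bias is trained by gradient descent on L_RIL = L_RL + λ L_IL, the sketch is closer in spirit to EAS-Emb and EAS-Lay than to EAS-Tab, which works without backpropagation.

```python
import math
import random
from collections import defaultdict


def tour_length(tour, dist):
    """Length of the closed tour that returns to its start city."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rollout(dist, bias, rng, forced=None):
    """Build a tour starting at city 0.

    Returns (tour, grads), where grads[(i, j)] is the derivative of
    log p(tour) with respect to bias[i][j]. If `forced` is given, that
    tour is replayed instead of sampled (used for the imitation term).
    """
    tour, grads = [0], {}
    unvisited = list(range(1, len(dist)))
    while unvisited:
        cur = tour[-1]
        # Frozen stand-in "model" (prefers short edges) plus trainable bias.
        logits = [-dist[cur][j] + bias[cur][j] for j in unvisited]
        probs = softmax(logits)
        if forced is None:
            nxt = rng.choices(unvisited, weights=probs)[0]
        else:
            nxt = forced[len(tour)]
        for j, p in zip(unvisited, probs):
            grads[(cur, j)] = (1.0 if j == nxt else 0.0) - p
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour, grads


def efficient_active_search(dist, iters=100, samples=8, lr=0.1, lam=1.0, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    bias = [[0.0] * n for _ in range(n)]  # the only parameters updated
    best_tour, best_cost = None, float("inf")
    for _ in range(iters):
        batch = [rollout(dist, bias, rng) for _ in range(samples)]
        costs = [tour_length(t, dist) for t, _ in batch]
        for (t, _), c in zip(batch, costs):
            if c < best_cost:
                best_tour, best_cost = t, c  # track the incumbent
        baseline = sum(costs) / samples
        grad = defaultdict(float)
        # Gradient of L_RL (REINFORCE with a mean-cost baseline).
        for (_, g), c in zip(batch, costs):
            for edge, value in g.items():
                grad[edge] += (c - baseline) * value / samples
        # Gradient of lam * L_IL, with L_IL = -log p(best_tour).
        _, g_best = rollout(dist, bias, rng, forced=best_tour)
        for edge, value in g_best.items():
            grad[edge] -= lam * value
        for (i, j), value in grad.items():
            bias[i][j] -= lr * value
    return best_tour, best_cost


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 1.5)]
    dist = [[math.dist(a, b) for b in points] for a in points]
    print(efficient_active_search(dist))
```

The stand-in scoring function is never modified; only `bias` changes, which mirrors the split between a fixed trained model and a small per-instance component described above.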

Experimental Results

Research Questions

RQ1: Can efficient active search (updating only a subset of parameters) achieve competitive or superior solution quality to full active search?
RQ2: Which EAS variant (Emb, Lay, Tab) offers the best trade-off between runtime and solution quality across TSP, CVRP, and JSSP?
RQ3: How do EAS methods affect generalization to larger instances than those seen during training?
RQ4: Do EAS methods enable ML construction methods to outperform strong solvers such as LKH3 on CVRP and OR-Tools on JSSP?

Key Findings

Problem	Instance Set	n	Method	Objective (Avg)	Gap to Best/Opt (%)	Wall Time (S/M/H/D = seconds/minutes/hours/days)	Notes
TSP	Testing (10k inst.)	100	Concorde	7.765	0.000%	82M	Exact solver; baseline
TSP	Testing (10k inst.)	100	LKH3	7.765	0.000%	8H	Heuristic solver; baseline
TSP	Testing (10k inst.)	100	POMO-Greedy	7.776	0.146%	1M	Greedy baseline from POMO family
TSP	Testing (10k inst.)	100	POMO-Sampling	7.770	0.074%	4H	Sampling baseline from POMO family
TSP	Testing (10k inst.)	100	Active Search	7.768	0.046%	5D	Original active search; high cost
TSP	Testing (10k inst.)	100	EAS-Emb	7.769	0.063%	5H	EAS embedding updates
TSP	Testing (10k inst.)	100	EAS-Lay	7.769	0.053%	7H	EAS added-layer updates
TSP	Testing (10k inst.)	100	EAS-Tab	7.768	0.048%	5H	EAS tabular updates
CVRP	Testing (10k inst.)	100	LKH3	15.65	0.00%	6D	Baseline LKH3 on CVRP
CVRP	Testing (10k inst.)	100	POMO-Greedy	15.76	0.76%	2M	Greedy baseline
CVRP	Testing (10k inst.)	100	POMO-Sampling	15.67	0.17%	7H	Sampling baseline
CVRP	Testing (10k inst.)	100	Active Search	15.63	-0.07%	8D	Original active search; slower
CVRP	Testing (10k inst.)	100	EAS-Emb	15.63	-0.08%	9H	EAS embedding updates
CVRP	Testing (10k inst.)	100	EAS-Lay	15.61	-0.23%	12H	EAS added-layer updates
CVRP	Testing (10k inst.)	100	EAS-Tab	15.62	-0.14%	8H	EAS tabular updates
JSSP	Testing (100 inst.)	10x10	OR-Tools	807.6	0.0%	37S	Baseline OR-Tools
JSSP	Testing (100 inst.)	10x10	L2D-Greedy	988.6	22.3%	20S	Baseline L2D greedy
JSSP	Testing (100 inst.)	10x10	L2D-Sampling	871.7	8.0%	8H	Sampling baseline
JSSP	Testing (100 inst.)	10x10	Active Search	854.2	5.8%	8H	Original active search
JSSP	Testing (100 inst.)	10x10	EAS-Emb	837.0	3.7%	7H	EAS embedding updates
JSSP	Testing (100 inst.)	10x10	EAS-Lay	859.6	6.5%	7H	EAS added-layer updates
JSSP	Testing (100 inst.)	10x10	EAS-Tab	860.2	6.5%	8H	EAS tabular updates
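
As a rough cross-check of the table, the relative gap and the runtime ratio can be recomputed from the listed values. The helper names below (`gap_percent`, `to_hours`, `speedup`) are hypothetical, and reading the time suffixes S/M/H/D as seconds, minutes, hours, and days is an assumption based on the table's notation. The gap of the averaged objectives need not equal the reported gap exactly: for EAS-Lay on CVRP the recomputed value is about -0.26% against a reported -0.23%, presumably because of rounding and because the reported gaps may be averaged per instance.

```python
# Hours per unit for the table's time codes (assumed meaning of the suffixes).
UNIT_HOURS = {"S": 1 / 3600, "M": 1 / 60, "H": 1.0, "D": 24.0}


def gap_percent(objective, reference):
    """Relative gap of `objective` to `reference`, in percent."""
    return 100.0 * (objective - reference) / reference


def to_hours(time_code):
    """Convert a code such as '82M' or '5D' to hours."""
    return float(time_code[:-1]) * UNIT_HOURS[time_code[-1]]


def speedup(slow, fast):
    """How many times longer `slow` takes than `fast`."""
    return to_hours(slow) / to_hours(fast)


if __name__ == "__main__":
    print(round(gap_percent(15.61, 15.65), 2))  # EAS-Lay vs LKH3 on CVRP
    print(speedup("5D", "5H"))  # active search vs EAS-Emb on TSP: 24.0
    print(speedup("6D", "12H"))  # LKH3 vs EAS-Lay on CVRP: 12.0
```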

On TSP and CVRP, the EAS variants cut runtime from days (5D and 8D for full active search) to hours (5H to 12H) while matching or improving solution quality; on JSSP, runtimes are similar (7H to 8H versus 8H).
On TSP, EAS-Emb, EAS-Lay, and EAS-Tab reach gaps to Concorde/LKH3 of 0.063%, 0.053%, and 0.048%, comparable to active search (0.046%), at roughly 17 to 24 times less runtime (5H to 7H versus 5D).
On CVRP, EAS-Lay outperforms all baselines, including LKH3 (gap of -0.23% in 12H versus 6D for LKH3); EAS-Tab also performs strongly (-0.14%) and is the fastest variant (8H), but it is sensitive to its hyperparameter α, which needs tuning for some instances.
On JSSP, EAS-Emb performs best among the EAS variants (3.7% gap to OR-Tools versus 8.0% for sampling and 5.8% for active search); EAS-Lay is roughly competitive with active search (6.5% versus 5.8% on the 10x10 instances); EAS-Tab lags on larger instances.
Across problems, EAS improves generalization to instances larger than those seen during training by providing effective search guidance without retraining the full model.


This review was generated by AI and reviewed by human editors.