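[Paper Explained] EGAM: Extended Graph Attention Model for Solving Routing Problems
EGAM uses multi-head attention to update node and edge embeddings together and is trained with REINFORCE and a symmetry-based baseline to solve routing problems. It performs especially well under tight constraints.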
Neural combinatorial optimization (NCO) solvers, implemented with graph neural networks (GNNs), have introduced new approaches for solving routing problems. Trained with reinforcement learning (RL), the state-of-the-art graph attention model (GAM) achieves near-optimal solutions without requiring expert knowledge or labeled data. In this work, we generalize the existing graph attention mechanism and propose the extended graph attention model (EGAM). Our model utilizes multi-head dot-product attention to update both node and edge embeddings, addressing the limitations of the conventional GAM, which considers only node features. We employ an autoregressive encoder-decoder architecture and train it with policy gradient algorithms that incorporate a specially designed baseline. Experiments show that EGAM matches or outperforms existing methods across various routing problems. Notably, the proposed model demonstrates exceptional performance on highly constrained problems, highlighting its efficiency in handling complex graph structures.
Motivation and Objectives
- Motivate solving NP-hard routing problems with neural combinatorial optimization without labeled data.
- Generalize GAMs by incorporating edge information through Node-Edge and Edge-Node attention.
- Develop an encoder–decoder autoregressive architecture trained with policy gradient methods.
- Demonstrate improved performance on TSP, CVRP, PCTSP, and constrained variants like TSPTW, TSPDL, and VRPTW.
Proposed Method
- Introduce Edge-Node and Node-Edge attention to update edge embeddings alongside node embeddings.
- Use a generalized multi-head dot-product attention in integrated encoder layers (Node-Node, Edge-Node, Node-Edge).
- Employ an autoregressive encoder–decoder architecture for route generation.
- Train with REINFORCE using a symmetry-based baseline to avoid labeled data.
- Decode with context-aware attention and masking to enforce feasibility during routing decisions.
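
To make the three attention types above concrete, here is a minimal single-head sketch in plain Python. It is not the authors' implementation: the summary does not specify the exact formulation, so the code assumes that "Node-Edge" means nodes query edge embeddings and "Edge-Node" means edges query node embeddings, and it omits learned projections, multiple heads, normalization, and feed-forward sublayers. The names `attend` and `encoder_layer` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention: each query returns a weighted mix of values."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(len(values[0]))])
    return out


def add(a, b):
    """Element-wise sum of two equally shaped lists of vectors."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def encoder_layer(nodes, edges):
    """One illustrative layer that updates node AND edge embeddings.

    ASSUMPTION: the query/key roles below are a guess at what the
    Node-Node, Node-Edge, and Edge-Node attention types mean.
    """
    node_node = attend(nodes, nodes, nodes)   # nodes look at nodes
    node_edge = attend(nodes, edges, edges)   # nodes look at edges (assumed)
    edge_node = attend(edges, nodes, nodes)   # edges look at nodes (assumed)
    new_nodes = add(nodes, add(node_node, node_edge))  # residual update
    new_edges = add(edges, edge_node)                  # residual update
    return new_nodes, new_edges


# Toy input: 3 node embeddings and 4 edge embeddings, all of dimension 2.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [[0.5, 0.5], [0.2, 0.1], [0.0, 0.3], [0.9, 0.4]]
new_nodes, new_edges = encoder_layer(nodes, edges)
```

The point of the sketch is that one attention routine serves all three cases; only the choice of which set supplies the queries and which supplies the keys and values changes.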

Experimental Results
Research Questions
- RQ1: How does incorporating edge features via Node-Edge and Edge-Node attention affect routing performance compared to node-only GAMs?
- RQ2: Can EGAM achieve near-optimal or better solutions for standard and highly constrained routing problems using reinforcement learning without labeled data?
- RQ3: What is the impact of greedy vs. sampling inference in EGAM across different problem types and constraints?
- RQ4: How does the symmetry-based baseline influence training efficiency and convergence in RL-based routing solvers?
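
The last two questions are easier to follow with a small sketch of the mechanics involved. The code below is illustrative only and makes two assumptions not confirmed by this summary: feasibility is enforced by zeroing the probability of masked actions before choosing the next node, and the symmetry-based baseline is taken to be the mean cost over several rollouts of the same instance. All function names are hypothetical.

```python
import math
import random


def masked_probs(logits, feasible):
    """Softmax restricted to feasible actions; infeasible ones get probability 0."""
    m = max(l for l, ok in zip(logits, feasible) if ok)
    exps = [math.exp(l - m) if ok else 0.0 for l, ok in zip(logits, feasible)]
    total = sum(exps)
    return [e / total for e in exps]


def pick(probs, greedy, rng):
    """Greedy inference takes the argmax; sampling draws from the distribution."""
    if greedy:
        return max(range(len(probs)), key=probs.__getitem__)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def shared_baseline_advantages(costs):
    """ASSUMED baseline: mean cost over rollouts of the same instance.

    A negative advantage marks a rollout that beat the average, so a
    REINFORCE update would raise the log-probability of its actions.
    """
    baseline = sum(costs) / len(costs)
    return [c - baseline for c in costs]


rng = random.Random(0)
probs = masked_probs([2.0, 5.0, 1.0], [True, False, True])  # node 1 is masked out
greedy_choice = pick(probs, greedy=True, rng=rng)
sampled_choices = [pick(probs, greedy=False, rng=rng) for _ in range(100)]
advantages = shared_baseline_advantages([5.9, 5.7, 6.1, 5.8])
```

Greedy decoding yields one deterministic tour per instance, while sampling many tours and keeping the best trades extra inference time for lower cost, which is the trade-off RQ3 examines.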
Key Findings
| Method | Type | TSP Cost | TSP Gap | TSP Time | CVRP Cost | CVRP Gap | CVRP Time | PCTSP Cost | PCTSP Gap | PCTSP Time |
|---|---|---|---|---|---|---|---|---|---|---|
| EGAM (Ours) | Greedy | 5.72 | 0.49% | 6s | 10.72 | 3.29% | 7s | 4.51 | 0.81% | 6s |
| EGAM (Ours) | 1280 Sampling | 5.70 | 0.03% | 2.29m | 10.48 | 1.01% | 2.4m | 4.48 | 0.11% | 2.3m |
| GAM | Greedy | 5.80 | 1.76% | 4s | 10.98 | 5.86% | 4s | 4.60 | 2.84% | 3s |
| GATv2 | Greedy | 5.77 | 1.33% | 3s | 10.90 | 5.04% | 3s | 4.56 | 1.95% | 2s |
| POMO | Greedy | 5.73 | 0.64% | 5s | 10.74 | 3.54% | 6s | / | / | / |
- EGAM matches or outperforms existing methods on several routing problems, particularly under tight constraints.
- On TSP, CVRP, and PCTSP, EGAM achieves competitive costs and small gaps relative to state-of-the-art methods with both greedy and sampling inference.
- On highly constrained problems (e.g., TSPTW, TSPDL, VRPTW), EGAM shows notable improvements in cost, feasibility, and solution quality.
- Edge feature integration yields improved modeling of transition relationships, improving performance on complex graph structures.
- EGAM demonstrates competitive scalability with autoregressive and potential non-autoregressive extensions.
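
For readers who want to compare the table entries directly, the following sketch shows the arithmetic. It assumes the usual definition of optimality gap, the relative excess cost over a reference solution; the reference costs are not listed in this summary, so `gap_percent` is shown only as a formula. The relative reductions are computed from the greedy TSP costs in the table above.

```python
def gap_percent(cost, reference_cost):
    """Optimality gap in percent, assuming gap = (cost - reference) / reference."""
    return 100.0 * (cost - reference_cost) / reference_cost


def relative_reduction_percent(baseline_cost, new_cost):
    """How much lower new_cost is than baseline_cost, in percent."""
    return 100.0 * (baseline_cost - new_cost) / baseline_cost


# Greedy TSP costs copied from the table above.
tsp_greedy = {"EGAM": 5.72, "GAM": 5.80, "GATv2": 5.77, "POMO": 5.73}

reductions = {
    name: relative_reduction_percent(cost, tsp_greedy["EGAM"])
    for name, cost in tsp_greedy.items()
    if name != "EGAM"
}
for name, pct in reductions.items():
    print(f"EGAM greedy is {pct:.2f}% cheaper than {name} greedy on TSP")
```

Because the table rounds costs to two decimals, differences computed this way are approximate and will not exactly reproduce the reported gap columns.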

This summary was generated by AI and reviewed by human editors.