QUICK REVIEW

[Paper Review] Learn to Design the Heuristics for Vehicle Routing Problem

Lei Gao, Mingxiang Chen|arXiv (Cornell University)|Feb 20, 2020

Vehicle Routing Optimization Methods | References: 20 | Citations: 41

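One-line summary

This paper proposes a neural network that learns large neighborhood search heuristics for the VRP; it outperforms handcrafted and neural baselines on medium-scale data and effectively solves 400-node VRP instances.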

ABSTRACT

This paper presents an approach to learn the local-search heuristics that iteratively improves the solution of Vehicle Routing Problem (VRP). A local-search heuristics is composed of a destroy operator that destructs a candidate solution, and a following repair operator that rebuilds the destructed one into a new one. The proposed neural network, as trained through actor-critic framework, consists of an encoder in form of a modified version of Graph Attention Network where node embeddings and edge embeddings are integrated, and a GRU-based decoder rendering a pair of destroy and repair operators. Experiment results show that it outperforms both the traditional heuristics algorithms and the existing neural combinatorial optimization for VRP on medium-scale data set, and is able to tackle the large-scale data set (e.g., over 400 nodes) which is a considerable challenge in this area. Moreover, the need for expertise and handcrafted heuristics design is eliminated due to the fact that the proposed network learns to design the heuristics with a better performance. Our implementation is available online.

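Motivation and objectives

Automate the design of general destroy/repair heuristics for the VRP without relying on supervised expert data.
Integrate node and edge information in a graph-based encoder so that non-Euclidean VRP instances can be handled.
Demonstrate scalability to medium- and large-scale VRP (e.g., >400 nodes) with competitive performance.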

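Proposed method

The model uses an encoder-decoder architecture; the encoder is EGATE (Element-wise Graph Attention Network with Edge Embedding), which combines node and edge information.
The destroy/repair heuristic is represented as a stochastic sequential policy π([η1,...,ηM]) and trained with actor-critic reinforcement learning using PPO.
A GRU-based Pointer Network decoder outputs an ordered list of nodes to remove, and that order also guides reinsertion.
Training uses an actor-critic setup in which the reward equals the reduction in VRP cost (travel distance plus vehicle cost), with a value network estimating the state value.
A simulated-annealing-based acceptance criterion decides whether each new solution is kept, which enables SA-guided exploration during training.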
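The destroy/repair loop with SA-based acceptance described above can be sketched on a toy CVRP instance. This is a minimal illustration, not the authors' implementation: the learned policy is replaced by a random ordered removal list, repair is assumed to be cheapest feasible insertion, acceptance is assumed to follow the standard Metropolis rule exp(-delta / T), and every function name, parameter, and the instance itself are hypothetical.

```python
import math
import random

def route_cost(route, dist):
    """Length of one route that starts and ends at the depot (node 0)."""
    path = [0] + route + [0]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))

def total_cost(routes, dist, vehicle_cost=0.0):
    """Travel distance plus a fixed cost per used vehicle."""
    used = [r for r in routes if r]
    return sum(route_cost(r, dist) for r in used) + vehicle_cost * len(used)

def destroy(routes, removal_order):
    """Remove the listed customers and return the partial solution."""
    removed = set(removal_order)
    return [[c for c in r if c not in removed] for r in routes]

def repair(routes, removal_order, dist, demand, capacity):
    """Reinsert customers, in the given order, at the cheapest feasible position."""
    routes = [list(r) for r in routes]
    for c in removal_order:
        best = None  # (extra cost, route index, position)
        for i, r in enumerate(routes):
            if sum(demand[x] for x in r) + demand[c] > capacity:
                continue
            for pos in range(len(r) + 1):
                cand = r[:pos] + [c] + r[pos:]
                delta = route_cost(cand, dist) - route_cost(r, dist)
                if best is None or delta < best[0]:
                    best = (delta, i, pos)
        if best is None:
            routes.append([c])  # open a new route if no existing route fits
        else:
            routes[best[1]].insert(best[2], c)
    return routes

def sa_accept(delta, temperature, rng):
    """Always accept improvements; accept worse moves with prob exp(-delta/T)."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)

def lns(routes, dist, demand, capacity, iters=200, t0=1.0, alpha=0.98, seed=0):
    rng = random.Random(seed)
    current, best = routes, routes
    temperature = t0
    customers = [c for r in routes for c in r]
    for _ in range(iters):
        # Stand-in for the learned policy: a random ordered list of 2 customers.
        order = rng.sample(customers, k=min(2, len(customers)))
        candidate = repair(destroy(current, order), order, dist, demand, capacity)
        # In an RL setting, -delta (the cost reduction) would serve as the reward.
        delta = total_cost(candidate, dist) - total_cost(current, dist)
        if sa_accept(delta, temperature, rng):
            current = candidate
        if total_cost(current, dist) < total_cost(best, dist):
            best = current
        temperature *= alpha  # geometric cooling (an assumption)
    return best

# Hypothetical toy instance: depot at the origin, six unit-demand customers.
pts = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2), (-1, 0), (-2, 0)]
dist = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in pts] for p in pts]
demand = [0, 1, 1, 1, 1, 1, 1]
start = [[1, 3, 5], [2, 4, 6]]
best = lns(start, dist, demand, capacity=3)
print(total_cost(start, dist), total_cost(best, dist))
```

In the approach summarized here, the ordered removal list would come from the Pointer Network decoder rather than from `rng.sample`; the surrounding loop is only meant to show where such a policy plugs in.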

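Experimental results

Research questions

RQ1: Can a neural network learn general, high-level destroy/repair heuristics for the VRP (in place of a single handcrafted heuristic)?
RQ2: Does integrating node and edge information through the EGATE encoder improve VRP performance over a standard GAT-based encoder?
RQ3: Do the learned heuristics scale to large VRP instances (e.g., 400 nodes) and outperform traditional heuristics and other neural methods?
RQ4: How does the learned approach compare with handcrafted heuristics and existing neural methods in the CVRP and CVRPTW settings?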

Key findings

Setting	Method	Average cost
CVRP	Random-1K	1188.14
CVRP	ALNS-1K	1163.63
CVRP	SISR-1K	1140.38
CVRP	SISR-200K	1074.65
CVRP	SISR-1M	1071.91
CVRP	AM1280	1144.64
CVRP	AM-Greedy	1189.76
CVRP	EGATE-1K	1148.79
CVRP	EGATE100-1K	1078.16
CVRPTW	Random-1K	2567.85
CVRPTW	ALNS-1K	2533.50
CVRPTW	SISR-1K	2584.69
CVRPTW	SISR-200K	2421.45
CVRPTW	SISR-1M	2419.01
CVRPTW	EGATE-1K	2537.23
CVRPTW	EGATE256-1K	2415.16
CVRPTW 400 nodes	Random-1K	7622.97
CVRPTW 400 nodes	ALNS-1K	8095.00
CVRPTW 400 nodes	SISR-1K	7900.75
CVRPTW 400 nodes	SISR-1M	6630.10
CVRPTW 400 nodes	EGATE-1K	7146.05
CVRPTW 400 nodes	EGATE192-1K	6924.70

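On CVRP with 1,000 iterations, the learned heuristic comes close to the benchmark, with a gap of 0.58% (consistent with EGATE100-1K at 1078.16 against SISR-1M at 1071.91 in the table above).
On CVRPTW, it gives the best result among the tested solvers with 1,000 iterations, beating even the SISR baseline run for 1 million iterations (2415.16 for EGATE256-1K against 2419.01 for SISR-1M).
On medium-scale CVRP/CVRPTW, it outperforms the handcrafted ALNS and SISR baselines within the same iteration budget. In the table this holds for EGATE100-1K and EGATE256-1K; plain EGATE-1K trails SISR-1K on CVRP (1148.79 against 1140.38) and ALNS-1K on CVRPTW (2537.23 against 2533.50).
The model is also shown to handle 400-node VRP instances, reaching a competitive 4.4% gap to the benchmark at that scale (consistent with EGATE192-1K at 6924.70 against SISR-1M at 6630.10).
The encoder-decoder design (EGATE plus a GRU-based Pointer Network) reduces reliance on expert-crafted heuristics and makes it possible to learn general large neighborhood search heuristics.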
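The gap figures quoted in the findings can be reproduced from the table. The short check below assumes that the "benchmark" is the SISR-1M row of the same setting; the review does not name the reference explicitly, but this choice matches the stated 0.58% and 4.4%. The helper name `gap_percent` is hypothetical.

```python
def gap_percent(cost, reference):
    """Relative excess of `cost` over `reference`, in percent."""
    return 100.0 * (cost - reference) / reference

# Values copied from the results table; SISR-1M as the reference is an assumption.
cvrp_gap = gap_percent(1078.16, 1071.91)        # EGATE100-1K vs. SISR-1M (CVRP)
cvrptw400_gap = gap_percent(6924.70, 6630.10)   # EGATE192-1K vs. SISR-1M (400 nodes)
cvrptw_gap = gap_percent(2415.16, 2419.01)      # EGATE256-1K vs. SISR-1M (CVRPTW)

print(f"CVRP gap: {cvrp_gap:.2f}%")
print(f"CVRPTW 400-node gap: {cvrptw400_gap:.2f}%")
print(f"CVRPTW gap: {cvrptw_gap:.2f}%")
```

A negative value for the medium-scale CVRPTW row means the learned heuristic is below the SISR-1M cost, in line with the claim that it beats that baseline.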


This review was generated by AI and checked by a human editor.