QUICK REVIEW

[Paper Review] Constrained Multi-objective Optimization with Deep Reinforcement Learning Assisted Operator Selection

Fei Ming, Wenyin Gong | arXiv (Cornell University) | Jan 15, 2024
Topic: Advanced Multi-Objective Optimization Algorithms | Cited by 5
One-Sentence Summary

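This paper proposes an online operator selection framework based on deep Q-learning for constrained multi-objective optimization evolutionary algorithms (CMOEAs), and reports performance improvements across multiple benchmark suites and existing CMOEAs.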

ABSTRACT

Solving constrained multi-objective optimization problems with evolutionary algorithms has attracted considerable attention. Various constrained multi-objective optimization evolutionary algorithms (CMOEAs) have been developed with the use of different algorithmic strategies, evolutionary operators, and constraint-handling techniques. The performance of CMOEAs may be heavily dependent on the operators used, however, it is usually difficult to select suitable operators for the problem at hand. Hence, improving operator selection is promising and necessary for CMOEAs. This work proposes an online operator selection framework assisted by Deep Reinforcement Learning. The dynamics of the population, including convergence, diversity, and feasibility, are regarded as the state; the candidate operators are considered as actions; and the improvement of the population state is treated as the reward. By using a Q-Network to learn a policy to estimate the Q-values of all actions, the proposed approach can adaptively select an operator that maximizes the improvement of the population according to the current state and thereby improve the algorithmic performance. The framework is embedded into four popular CMOEAs and assessed on 42 benchmark problems. The experimental results reveal that the proposed Deep Reinforcement Learning-assisted operator selection significantly improves the performance of these CMOEAs and the resulting algorithm obtains better versatility compared to nine state-of-the-art CMOEAs.
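For reference, a Q-Network of the kind the abstract describes is typically trained with the standard deep Q-learning objective. The form below is the textbook formulation rather than an equation taken from the paper; the discount factor $\gamma$, the target-network parameters $\theta^{-}$, and the replay buffer $\mathcal{D}$ are assumed notation. Here $s_t$ is the population state, $a_t$ the chosen operator, and $r_t$ the resulting improvement of the population state.

```latex
y_t = r_t + \gamma \max_{a'} Q(s_{t+1}, a'; \theta^{-}), \qquad
\mathcal{L}(\theta) = \mathbb{E}_{(s_t, a_t, r_t, s_{t+1}) \sim \mathcal{D}}
  \left[ \bigl( y_t - Q(s_t, a_t; \theta) \bigr)^2 \right], \qquad
a_t = \arg\max_{a} Q(s_t, a; \theta)
```

Under this reading, selecting the operator with the largest Q-value corresponds to choosing the operator expected to yield the greatest cumulative improvement of the population from the current state.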

Research Motivation and Objectives

  • Motivate adaptive operator selection for constrained multi-objective optimization problems (CMOPs), where the choice of operator critically affects performance.
  • Develop a DRL-based framework that autonomously selects evolutionary operators to improve convergence, diversity, and feasibility.
  • Embed the framework into multiple popular CMOEAs and evaluate on challenging CMOP benchmarks.
  • Demonstrate improved performance and versatility compared to state-of-the-art CMOEAs across 42 problems.

Proposed Method

  • Define state as population convergence (con), feasibility (fea), and diversity (div).
  • Model operators as actions in a DRL (Deep Q-Learning) framework.
  • Use reward as the difference in the population state before/after an iteration to capture overall improvement.
  • Train a Deep Q-Network to estimate action values (Q-values) for operator selection in given states.
  • Embed the DRL-based operator selection into four CMOEAs: CCMO, PPS, MOEA/D-DAE, and EMCMO.
  • Provide an online learning loop with experience replay and periodic DQN updates to adapt to evolving population dynamics.
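The state, reward, and action choice listed above can be sketched roughly as follows. This is a minimal illustration, not the paper's formulation: the three proxy measures in `population_state`, the unweighted sum in `reward`, and the epsilon-greedy rule in `select_operator` are all assumptions made for this example, and the function names are hypothetical. Objectives are assumed to be minimized, and the population must contain at least two solutions.

```python
import math
import random


def population_state(objs, cvs):
    """Return (con, fea, div) for a population.

    objs: list of objective vectors (minimization assumed).
    cvs:  list of total constraint violations, one per solution (0 = feasible).
    All three measures are illustrative proxies, not the paper's formulas.
    """
    n = len(objs)
    # con: mean sum of objective values; lower suggests better convergence.
    con = sum(sum(f) for f in objs) / n
    # fea: mean constraint violation; 0 means every solution is feasible.
    fea = sum(max(0.0, cv) for cv in cvs) / n
    # div: mean distance from each solution to its nearest neighbour.
    div = sum(
        min(math.dist(objs[i], objs[j]) for j in range(n) if j != i)
        for i in range(n)
    ) / n
    return con, fea, div


def reward(prev, curr):
    """Improvement of the state: drops in con and fea plus the gain in div."""
    return (prev[0] - curr[0]) + (prev[1] - curr[1]) + (curr[2] - prev[2])


def select_operator(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over per-operator Q-values (assumed policy)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

For example, computing `population_state` before and after one generation and passing both tuples to `reward` gives a scalar that is positive when the population moved closer to the front, became more feasible, or spread out. In practice the three components would likely need normalization so that no single term dominates.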
Figure 1: An illustration of two types of working principles of the DQL technique.

Experimental Results

Research Questions

  • RQ1: Can DRL-based online operator selection improve the performance of constrained multi-objective EAs across diverse CMOPs?
  • RQ2: How should state, action, and reward be designed to account for convergence, diversity, and feasibility in CMOPs?
  • RQ3: Is the proposed DRL-assisted framework generalizable to multiple CMOEAs and benchmark suites?
  • RQ4: Does the approach offer better versatility than existing state-of-the-art CMOEAs across 42 problems?

Key Findings

  • The DRL-assisted operator selection significantly improves the performance of embedded CMOEAs.
  • The framework demonstrates better versatility compared to nine state-of-the-art CMOEAs across benchmark problems.
  • The method uses an online learning loop with experience replay to adapt operator choices to the current population state.
  • The approach can incorporate an arbitrary number of operators and is compatible with various CMOEAs.
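The online loop with experience replay and an arbitrary number of operators might look like the sketch below. It is a simplification under stated assumptions: `LinearQ` is a linear stand-in for the paper's Q-Network (chosen so the example needs only the standard library), `step` is a hypothetical hook that would run one generation of the host CMOEA with the chosen operator and return the next state and reward, and every hyperparameter value is a placeholder rather than a setting from the paper.

```python
import random
from collections import deque


class LinearQ:
    """Linear stand-in for the Q-network: one weight vector per operator."""

    def __init__(self, state_dim, n_ops, lr=0.05, gamma=0.9):
        # The last weight in each row acts as a bias term.
        self.w = [[0.0] * (state_dim + 1) for _ in range(n_ops)]
        self.lr, self.gamma = lr, gamma

    def q_values(self, state):
        x = list(state) + [1.0]
        return [sum(wi * xi for wi, xi in zip(w, x)) for w in self.w]

    def update(self, batch):
        """One semi-gradient Q-learning pass over sampled transitions."""
        for s, a, r, s_next in batch:
            target = r + self.gamma * max(self.q_values(s_next))
            error = target - self.q_values(s)[a]
            x = list(s) + [1.0]
            self.w[a] = [wi + self.lr * error * xi
                         for wi, xi in zip(self.w[a], x)]


def run_online_selection(step, init_state, n_ops, iters=200, batch_size=16,
                         update_every=10, epsilon=0.2, seed=0):
    """step(state, op) -> (next_state, reward) is a hypothetical CMOEA hook."""
    rng = random.Random(seed)
    model = LinearQ(len(init_state), n_ops)
    replay = deque(maxlen=500)          # experience replay buffer
    counts = [0] * n_ops                # how often each operator was used
    state = init_state
    for t in range(1, iters + 1):
        if rng.random() < epsilon:      # explore: random operator
            op = rng.randrange(n_ops)
        else:                           # exploit: highest estimated Q-value
            q = model.q_values(state)
            op = max(range(n_ops), key=q.__getitem__)
        next_state, r = step(state, op)
        replay.append((state, op, r, next_state))
        counts[op] += 1
        # Periodic update from a random minibatch of stored experience.
        if t % update_every == 0 and len(replay) >= batch_size:
            model.update(rng.sample(list(replay), batch_size))
        state = next_state
    return model, counts
```

Because the number of operators enters only through `n_ops`, adding a candidate operator changes the size of the output layer (here, the number of weight rows) and nothing else. With a toy `step` that rewards only one operator, the learned Q-values should come to favor that operator after a few updates.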
Figure 2: The illustration of the proposed DQL model.


This review was generated by AI and reviewed by human editors.