QUICK REVIEW

[Paper Review] Constrained Multi-objective Optimization with Deep Reinforcement Learning Assisted Operator Selection

Fei Ming, Wenyin Gong | arXiv (Cornell University) | 2024-01-15

Advanced Multi-Objective Optimization Algorithms | Citations: 5

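One-line summary

This paper introduces a deep Q-learning-based online operator selection framework for constrained multi-objective optimization evolutionary algorithms (CMOEAs) and reports improved performance across several benchmark suites and existing CMOEAs.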

ABSTRACT

Solving constrained multi-objective optimization problems with evolutionary algorithms has attracted considerable attention. Various constrained multi-objective optimization evolutionary algorithms (CMOEAs) have been developed with the use of different algorithmic strategies, evolutionary operators, and constraint-handling techniques. The performance of CMOEAs may be heavily dependent on the operators used, however, it is usually difficult to select suitable operators for the problem at hand. Hence, improving operator selection is promising and necessary for CMOEAs. This work proposes an online operator selection framework assisted by Deep Reinforcement Learning. The dynamics of the population, including convergence, diversity, and feasibility, are regarded as the state; the candidate operators are considered as actions; and the improvement of the population state is treated as the reward. By using a Q-Network to learn a policy to estimate the Q-values of all actions, the proposed approach can adaptively select an operator that maximizes the improvement of the population according to the current state and thereby improve the algorithmic performance. The framework is embedded into four popular CMOEAs and assessed on 42 benchmark problems. The experimental results reveal that the proposed Deep Reinforcement Learning-assisted operator selection significantly improves the performance of these CMOEAs and the resulting algorithm obtains better versatility compared to nine state-of-the-art CMOEAs.

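Motivation and objectives

Argues that adaptive operator selection is needed because the choice of operator strongly affects performance on constrained multi-objective optimization problems (CMOPs).
Develops a DRL-based framework that automatically selects evolutionary operators to improve convergence, diversity, and feasibility.
Embeds the framework into several popular CMOEAs and evaluates it on challenging CMOP benchmarks.
Reports improved performance and versatility relative to state-of-the-art CMOEAs across 42 problems.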

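Proposed method

Defines the state as the population's convergence (con), feasibility (fea), and diversity (div).
Models the candidate operators as the actions of a DRL (deep Q-learning) framework.
Uses the difference in population state before and after an iteration as the reward, capturing the overall improvement.
Trains a deep Q-network to estimate the value (Q-value) of choosing each operator in a given state.
Embeds DRL-based operator selection into four CMOEAs: CCMO, PPS, MOEA/D-DAE, and EMCMO.
Provides an online learning loop with experience replay and periodic DQN updates so that operator selection adapts to the evolving population dynamics.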
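
The online loop in the list above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: to stay dependency-free it swaps the deep Q-network for a linear Q-function, and the class name `OperatorSelector`, the hyperparameter values, and the reward orientation (each state component scaled so that larger is better) are all assumptions made for the example.

```python
import random
from collections import deque


class OperatorSelector:
    """Epsilon-greedy operator selection with experience replay.

    Assumption: a linear Q-function stands in for the deep Q-network.
    """

    def __init__(self, n_operators, state_dim=3, epsilon=0.1, gamma=0.9,
                 lr=0.05, buffer_size=512, batch_size=16, seed=0):
        self.n_operators = n_operators
        self.epsilon = epsilon      # exploration rate (hypothetical value)
        self.gamma = gamma          # discount factor (hypothetical value)
        self.lr = lr                # learning rate (hypothetical value)
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        # One weight row per operator: Q(s, a) = w_a . s + b_a
        self.weights = [[0.0] * (state_dim + 1) for _ in range(n_operators)]
        self.replay = deque(maxlen=buffer_size)

    def q_values(self, state):
        """Estimated Q-value of every operator for the given state."""
        feats = list(state) + [1.0]  # append a bias feature
        return [sum(w * x for w, x in zip(row, feats)) for row in self.weights]

    def select(self, state):
        """Pick a random operator with probability epsilon, else the greedy one."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_operators)
        q = self.q_values(state)
        return max(range(self.n_operators), key=q.__getitem__)

    def record(self, state, action, reward, next_state):
        """Store one transition in the replay buffer."""
        self.replay.append((tuple(state), action, reward, tuple(next_state)))

    def train_step(self):
        """Semi-gradient Q-learning update on a random replay batch."""
        if len(self.replay) < self.batch_size:
            return
        batch = self.rng.sample(list(self.replay), self.batch_size)
        for s, a, r, s2 in batch:
            target = r + self.gamma * max(self.q_values(s2))
            error = target - self.q_values(s)[a]
            feats = list(s) + [1.0]
            for i, x in enumerate(feats):
                self.weights[a][i] += self.lr * error * x


def reward_from_states(before, after):
    """Assumed reward: summed change in (con, fea, div), larger is better."""
    return sum(a - b for b, a in zip(before, after))
```

In a host CMOEA, each generation would call `select` with the current (con, fea, div) state, generate offspring with the chosen operator, then call `record` and `train_step` with the reward computed from the states before and after the generation.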

Figure 1: An illustration of two types of working principles of the DQL technique.

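Experimental results

Research questions

RQ1: Can DRL-based online operator selection improve the performance of constrained multi-objective EAs across a variety of CMOPs?
RQ2: How should the state, action, and reward be designed to reflect convergence, diversity, and feasibility?
RQ3: Does the proposed DRL-assisted framework generalize across multiple CMOEAs and benchmark suites?
RQ4: Does the approach offer better versatility than existing state-of-the-art CMOEAs across the 42 problems?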

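Key findings

DRL-assisted operator selection significantly improves the performance of the CMOEAs it is embedded in.
The framework shows better versatility than nine state-of-the-art CMOEAs on the benchmark problems.
An online learning loop with experience replay adapts operator selection to the current population state.
The method can accommodate an arbitrary number of operators and is compatible with a variety of CMOEAs.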
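
One way to picture the "arbitrary number of operators" point is an operator pool indexed by the action chosen by the selector. The sketch below is illustrative only: the two operators (a Gaussian mutation and a DE/rand/1-style difference step) and the names `OPERATORS` and `make_offspring` are assumptions for the example, not necessarily the candidate operators or interfaces used in the paper.

```python
import random


def gaussian_mutation(population, rng, sigma=0.1):
    """Perturb a randomly chosen parent with Gaussian noise."""
    parent = rng.choice(population)
    return [x + rng.gauss(0.0, sigma) for x in parent]


def de_rand_1(population, rng, f=0.5):
    """DE/rand/1-style step: base vector plus a scaled difference vector."""
    a, b, c = rng.sample(population, 3)
    return [xa + f * (xb - xc) for xa, xb, xc in zip(a, b, c)]


# The action space is simply the index range of this list, so adding an
# operator only means appending another callable with the same signature.
OPERATORS = [gaussian_mutation, de_rand_1]


def make_offspring(population, action, n_offspring, rng):
    """Generate offspring with the operator selected by `action`."""
    operator = OPERATORS[action]
    return [operator(population, rng) for _ in range(n_offspring)]
```

Because the host algorithm only sees an action index, the same pool could in principle be plugged into different CMOEAs without changing their environmental selection or constraint handling.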

Figure 2: The illustration of the proposed DQL model.


This review was generated by AI and reviewed by a human editor.