QUICK REVIEW

[Paper Review] Energy-Efficient Thermal Comfort Control in Smart Buildings via Deep Reinforcement Learning

Guanyu Gao, Jie Li|arXiv (Cornell University)|2019. 01. 15.

Building Energy and Comfort Optimization|References: 38|Citations: 70

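One-Line Summary

This paper applies a deep reinforcement learning framework based on Deep Deterministic Policy Gradients (DDPG) to continuous HVAC control, combines it with a Bayesian-regularized neural network for thermal comfort prediction, and evaluates the approach in a TRNSYS-based building simulator, with the goal of reducing energy use while maintaining occupant comfort.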

ABSTRACT

Heating, Ventilation, and Air Conditioning (HVAC) is extremely energy-consuming, accounting for 40% of total building energy consumption. Therefore, it is crucial to design some energy-efficient building thermal control policies which can reduce the energy consumption of HVAC while maintaining the comfort of the occupants. However, implementing such a policy is challenging, because it involves various influencing factors in a building environment, which are usually hard to model and may be different from case to case. To address this challenge, we propose a deep reinforcement learning based framework for energy optimization and thermal comfort control in smart buildings. We formulate the building thermal control as a cost-minimization problem which jointly considers the energy consumption of HVAC and the thermal comfort of the occupants. To solve the problem, we first adopt a deep neural network based approach for predicting the occupants' thermal comfort, and then adopt Deep Deterministic Policy Gradients (DDPG) for learning the thermal control policy. To evaluate the performance, we implement a building thermal control simulation system and evaluate the performance under various settings. The experiment results show that our method can improve the thermal comfort prediction accuracy, and reduce the energy consumption of HVAC while improving the occupants' thermal comfort.

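Motivation and Objectives

Reduce HVAC energy consumption in smart buildings while maintaining occupants' thermal comfort.
Develop an occupant thermal comfort prediction model that integrates multiple influencing factors.
Use deep reinforcement learning with continuous actions for fine-grained HVAC setpoint control.
Validate the approach in a building simulation environment under a variety of conditions.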

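Proposed Method

Develop a Bayesian-regularized feedforward neural network that predicts occupants' thermal comfort from indoor state variables.
Formulate energy optimization and thermal comfort control as a Markov decision process whose cost (reward) function combines energy use with a comfort penalty.
Apply Deep Deterministic Policy Gradients (DDPG), an actor-critic method, to continuous control of the temperature and humidity setpoints.
Train the DDPG agent in a TRNSYS-based building simulation using a replay buffer and Ornstein-Uhlenbeck exploration noise.
Use a reward function that penalizes HVAC energy consumption and discomfort outside an acceptable thermal comfort range (the comfort value M is acceptable when it lies within [-D, D]).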
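
The reward and exploration-noise items above can be illustrated with a short Python sketch. It is not the authors' implementation: the linear penalty outside [-D, D], the additive combination with a weight `beta`, the default values, and the names `comfort_penalty`, `reward`, and `OUNoise` are all assumptions made for illustration, and the review does not state the exact functional form used in the paper.

```python
import math
import random


def comfort_penalty(m: float, d: float) -> float:
    """Zero inside the comfort band [-d, d]; grows linearly outside it (assumed form)."""
    return max(abs(m) - d, 0.0)


def reward(energy: float, m: float, d: float = 0.5, beta: float = 1.0) -> float:
    """Negative cost: HVAC energy plus a beta-weighted comfort penalty (assumed form)."""
    return -(energy + beta * comfort_penalty(m, d))


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=0):
        self.size = size
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.rng = random.Random(seed)  # seeded so runs are reproducible
        self.reset()

    def reset(self):
        """Restart the process at its long-run mean (typically once per episode)."""
        self.x = [self.mu] * self.size

    def sample(self):
        """Advance one step: mean-reverting drift plus Gaussian diffusion."""
        self.x = [
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
            for x in self.x
        ]
        return list(self.x)


if __name__ == "__main__":
    print(reward(2.0, 0.3))            # inside the band: only the energy term counts
    print(reward(2.0, 1.5, beta=2.0))  # outside the band: the comfort penalty is added
    noise = OUNoise(size=2)            # one component each for temperature and humidity
    print(noise.sample())
```

In a DDPG training loop, a sample from `OUNoise` would typically be added to the actor's output before the action is applied, so that consecutive setpoints are perturbed smoothly rather than independently.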

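Experimental Results

Research Questions

RQ1: Can a continuous-action DDPG control policy reduce HVAC energy use and maintain occupant comfort compared with the baselines?
RQ2: How accurately can a Bayesian-regularized neural network predict thermal comfort from indoor environment variables?
RQ3: How does the energy-comfort weighting parameter affect the learned policy and overall performance?
RQ4: Does integrating the learned thermal comfort predictor as feedback improve control decisions over model-based approaches?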

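Key Findings

The proposed method integrates a DNN-based thermal comfort predictor with DDPG to jointly perform energy optimization and comfort control.
The neural network predictor uses Bayesian regularization to improve generalization.
The system is evaluated in a TRNSYS-based simulation, which shows that the approach can reduce HVAC energy consumption while maintaining or improving occupant comfort.
The action space is kept continuous so that HVAC setpoints can be set precisely, avoiding the discretization limits of alternative DRL methods.
A configurable trade-off parameter between energy cost and comfort penalty in the reward allows the policy to be tailored to occupants' needs.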
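
The point about continuous actions can be made concrete with a small Python sketch. It assumes the common convention that the actor outputs values in [-1, 1] (for example through a tanh layer) which are then rescaled to physical setpoints. The setpoint ranges, the 1-degree grid, and the function names are hypothetical and are not taken from the paper.

```python
def scale_action(a: float, low: float, high: float) -> float:
    """Map an actor output in [-1, 1] to a setpoint in [low, high]."""
    a = max(-1.0, min(1.0, a))  # clip, since exploration noise can exceed the bounds
    return low + (a + 1.0) * 0.5 * (high - low)


def discretize(value: float, low: float, step: float) -> float:
    """Snap a setpoint to the nearest grid point, as a discrete-action agent must."""
    return low + round((value - low) / step) * step


# Hypothetical setpoint ranges, for illustration only.
TEMP_RANGE = (20.0, 28.0)   # degrees Celsius
HUMID_RANGE = (30.0, 60.0)  # percent relative humidity

if __name__ == "__main__":
    temp = scale_action(0.3, *TEMP_RANGE)
    humid = scale_action(-0.4, *HUMID_RANGE)
    print("continuous setpoints:", temp, humid)
    print("temperature on a 1-degree grid:", discretize(temp, TEMP_RANGE[0], 1.0))
```

The gap between the continuous setpoint and its nearest grid point is the resolution a discrete-action agent gives up. Shrinking the grid step narrows that gap, but the number of discrete actions then grows with every controlled variable.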


This review was generated by AI and reviewed by a human editor.