QUICK REVIEW

[Paper Review] Deep Reinforcement Learning for Optimal Control of Space Heating

Ádám Nagy, Hussain Kazmi|arXiv (Cornell University)|May 10, 2018

Building Energy and Comfort Optimization | References: 21 | Cited by: 53

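One-Sentence Summary

A novel deep reinforcement learning algorithm for space heating control that is computationally efficient, is benchmarked against other techniques, and outperforms rule-based control by 5–10% under a range of price signals.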

ABSTRACT

Classical methods to control heating systems are often marred by suboptimal performance, inability to adapt to dynamic conditions and unreasonable assumptions e.g. existence of building models. This paper presents a novel deep reinforcement learning algorithm which can control space heating in buildings in a computationally efficient manner, and benchmarks it against other known techniques. The proposed algorithm outperforms rule based control by between 5-10% in a simulation environment for a number of price signals. We conclude that, while not optimal, the proposed algorithm offers additional practical advantages such as faster computation times and increased robustness to non-stationarities in building dynamics.

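Research Motivation and Objectives

Develop a DRL-based space heating control strategy that does not depend on an accurate building model.
Benchmark the DRL controller against conventional rule-based control and other control techniques.
Assess computational efficiency and robustness to non-stationarities in building dynamics.
Evaluate performance under multiple price signal scenarios.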

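Proposed Method

Propose a novel deep reinforcement learning algorithm designed specifically for space heating control.
Benchmark the proposed DRL method in simulation against rule-based control and other techniques.
Evaluate performance in a simulation environment with different price signals.
Compare computational efficiency and robustness to non-stationarities in building dynamics.
Analyze the results in terms of energy cost reduction and adaptability to dynamic conditions.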
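The evaluation steps above can be sketched as a small cost comparison harness. This is a minimal illustration under loud assumptions, not the paper's simulator or algorithm: the first-order thermal model, the hysteresis thermostat used as the rule-based baseline, and every name and parameter value (`simulate_cost`, `rule_based_action`, `relative_saving`, `heat_gain`, `loss_coeff`, and so on) are hypothetical.

```python
from typing import Callable, Sequence

# A policy maps (indoor temperature, previous heater state, step index)
# to the new heater state. This interface is an assumption for illustration.
Policy = Callable[[float, bool, int], bool]


def rule_based_action(temp_c: float, heater_on: bool,
                      low: float = 20.0, high: float = 22.0) -> bool:
    """Hysteresis thermostat: heat below `low`, stop above `high`,
    otherwise keep the previous state. Thresholds are hypothetical."""
    if temp_c < low:
        return True
    if temp_c > high:
        return False
    return heater_on


def simulate_cost(policy: Policy, prices: Sequence[float],
                  outdoor_c: float = 5.0, start_c: float = 21.0,
                  power_kw: float = 2.0, dt_h: float = 1.0,
                  heat_gain: float = 1.5, loss_coeff: float = 0.1) -> float:
    """Run a toy first-order thermal model and return the total energy cost.

    Each step, the heater (if on) consumes power_kw * dt_h kWh at the
    current price and raises the indoor temperature by `heat_gain`;
    the room loses heat in proportion to the indoor/outdoor difference.
    """
    temp = start_c
    heater_on = False
    cost = 0.0
    for step, price in enumerate(prices):
        heater_on = policy(temp, heater_on, step)
        if heater_on:
            cost += power_kw * dt_h * price
        temp += (heat_gain if heater_on else 0.0) - loss_coeff * (temp - outdoor_c)
    return cost


def relative_saving(baseline_cost: float, candidate_cost: float) -> float:
    """Fraction of the baseline cost saved by the candidate controller."""
    return (baseline_cost - candidate_cost) / baseline_cost


# Example: cost of the rule-based baseline under a made-up day/night tariff.
prices = [0.10] * 8 + [0.30] * 8 + [0.10] * 8
baseline_cost = simulate_cost(lambda t, on, s: rule_based_action(t, on), prices)
```

A learned controller would plug in as another `Policy`, and `relative_saving(baseline_cost, learned_cost)` would then express its improvement over the rule-based baseline as a fraction. A fair comparison would also need to check that both controllers keep the indoor temperature within the same comfort bounds, which this sketch does not do.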

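Experimental Results

Research Questions

RQ1: Under multiple price signals, can the DRL-based controller achieve lower space heating energy costs than rule-based control?
RQ2: Does the proposed DRL controller perform well in terms of computational efficiency and robustness to non-stationary building dynamics?
RQ3: How does the DRL method perform relative to existing control techniques under dynamic price signals?
RQ4: Beyond optimality, what practical advantages (such as robustness and adaptability) does the DRL method offer?

Main Findings

In simulation, the proposed DRL algorithm outperforms rule-based control by 5–10% across a number of price signals.
The algorithm computes faster than the other methods it is benchmarked against.
The DRL method is more robust to non-stationarities in building dynamics.
Benchmarking against other known techniques shows competitive, though not optimal, performance.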


This review was generated by AI and reviewed by human editors.