QUICK REVIEW

[Paper Digest] Two-Timescale Voltage Control in Distribution Grids Using Deep Reinforcement Learning

Qiuling Yang, Gang Wang | arXiv (Cornell University) | Apr 19, 2019

Optimal Power Flow Distribution | References: 27 | Cited by: 37

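One-Sentence Summary

This paper proposes a two-timescale scheme that jointly controls capacitors (on the slow timescale, via deep reinforcement learning) and smart inverters (on the fast timescale, via convex optimization) to regulate voltage in distribution grids, and validates it on a 47-bus network and the IEEE 123-bus feeder.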

ABSTRACT

Modern distribution grids are currently being challenged by frequent and sizable voltage fluctuations, due mainly to the increasing deployment of electric vehicles and renewable generators. Existing approaches to maintaining bus voltage magnitudes within the desired region can cope with either traditional utility-owned devices (e.g., shunt capacitors), or contemporary smart inverters that come with distributed generation units (e.g., photovoltaic plants). The discrete on-off commitment of capacitor units is often configured on an hourly or daily basis, yet smart inverters can be controlled within milliseconds, thus challenging joint control of these two types of assets. In this context, a novel two-timescale voltage regulation scheme is developed for distribution grids by judiciously coupling data-driven with physics-based optimization. On a faster timescale, say every second, the optimal setpoints of smart inverters are obtained by minimizing instantaneous bus voltage deviations from their nominal values, based on either the exact alternating current power flow model or a linear approximant of it; whereas, on the slower timescale (e.g., every hour), shunt capacitors are configured to minimize the long-term discounted voltage deviations using a deep reinforcement learning algorithm. Extensive numerical tests on a real-world 47-bus distribution network as well as the IEEE 123-bus test feeder using real data corroborate the effectiveness of the novel scheme.

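Motivation and Objectives

Address the voltage fluctuations in distribution grids caused by high penetration of renewable generation and electric vehicles.
Propose a hybrid data-driven and physics-based framework that jointly controls capacitors and inverters on two timescales.
Learn the slow-timescale capacitor configuration with deep reinforcement learning.
Compute inverter reactive power setpoints using either the exact AC model or a linearized power flow model.
Demonstrate effectiveness through extensive simulations on a real-world network and a standard test feeder.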
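
The two objectives above can be written compactly as a nested problem. The following is only a sketch of one plausible form, not the paper's exact formulation: the symbols (slow index \tau, fast index t, T fast slots per slow interval, capacitor configuration a_\tau, inverter setpoints \mathbf{q}, feasible set \mathcal{Q}, discount factor \gamma, nominal voltage v_0) and the choice of a squared 2-norm are all assumptions introduced here for illustration.

```latex
% Slow timescale: choose capacitor configurations a_tau to minimize the
% expected discounted voltage deviation (assumed squared 2-norm).
\min_{\{a_\tau\}} \;
\mathbb{E}\!\left[ \sum_{\tau=0}^{\infty} \gamma^{\tau}
  \sum_{t=1}^{T} \bigl\| \mathbf{v}_{\tau,t}(a_\tau, \mathbf{q}_{\tau,t}) - v_0 \mathbf{1} \bigr\|_2^2 \right]

% Fast timescale: given a_tau, pick inverter setpoints that minimize the
% instantaneous deviation within the inverter limits.
\text{where} \quad
\mathbf{q}_{\tau,t} \in \arg\min_{\mathbf{q} \in \mathcal{Q}}
  \bigl\| \mathbf{v}_{\tau,t}(a_\tau, \mathbf{q}) - v_0 \mathbf{1} \bigr\|_2^2
```

Read this way, the inner problem is the fast-timescale physics-based step and the outer problem is what the reinforcement learning agent is trained to solve.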

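Proposed Method

Formulate a two-timescale optimization problem: slow-timescale capacitor switching decisions are learned with deep reinforcement learning (DRL), while fast-timescale inverter reactive power setpoints are computed by convex optimization.
Obtain inverter setpoints from either the exact AC model with a second-order cone programming (SOCP) relaxation or a linearized distribution flow model.
Model capacitor decisions as a Markov decision process and solve it with a deep Q-network (DQN) augmented with a target network and experience replay.
Represent the inverter constraints and capacitor configurations so that feasible reactive power support is available in every time slot.
Introduce a hyper deep Q-network to cope with the large action space created by many capacitors.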
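
To make the roles of experience replay and the target network concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: a lookup table stands in for the deep Q-network, and the environment (one capacitor, two load levels, the function `voltage_deviation` and all of its numbers) is a hypothetical toy invented for this example. In the scheme described above, the reward would instead reflect the voltage deviation observed once the inverters have been set for the interval.

```python
import random


def voltage_deviation(load_level, cap_on):
    """Toy slow-interval cost: squared deviation from 1.0 p.u. (hypothetical numbers)."""
    # Assumption: heavy load sags the voltage, the capacitor boosts it.
    voltage = 1.0 - 0.04 * load_level + (0.04 if cap_on else 0.0)
    return (voltage - 1.0) ** 2


def train(steps=2000, gamma=0.9, lr=0.1, eps=0.2, batch=8, sync_every=50, seed=0):
    rng = random.Random(seed)
    states, actions = (0, 1), (0, 1)      # load level; capacitor off/on
    q = {(s, a): 0.0 for s in states for a in actions}  # online Q-function
    q_target = dict(q)                    # frozen copy used to build targets
    replay = []                           # experience replay buffer
    state = rng.choice(states)
    for step in range(1, steps + 1):
        # Epsilon-greedy choice of the capacitor status.
        if rng.random() < eps:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])
        reward = -voltage_deviation(state, action == 1)
        next_state = rng.choice(states)   # toy assumption: load changes at random
        replay.append((state, action, reward, next_state))
        replay = replay[-500:]            # keep only recent transitions
        # Update on a random minibatch drawn from the buffer.
        for s, a, r, s2 in rng.sample(replay, min(batch, len(replay))):
            target = r + gamma * max(q_target[(s2, b)] for b in actions)
            q[(s, a)] += lr * (target - q[(s, a)])
        # Periodically copy the online values into the target copy.
        if step % sync_every == 0:
            q_target = dict(q)
        state = next_state
    return q


def greedy_policy(q):
    """Best capacitor status for each load level under the learned values."""
    return {s: max((0, 1), key=lambda a: q[(s, a)]) for s in (0, 1)}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Sampling minibatches from the buffer breaks the correlation between consecutive transitions, and computing targets from the frozen copy keeps them from shifting on every update. Replacing the table with a neural network would turn this into a DQN in the usual sense.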

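Experimental Results

Research Questions

RQ1: Without complete knowledge of the distribution system, can the two-timescale control framework regulate bus voltages efficiently and effectively under random loads and generation?
RQ2: Can DRL-based capacitor switching on the slow timescale work together with inverter optimization on the fast timescale to minimize long-term voltage deviations?
RQ3: Within the proposed scheme, how do the exact AC model and the linearized power flow model compare for fast optimization of inverter setpoints?
RQ4: Can modifications such as a target network and experience replay stabilize DRL in a power system setting?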

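Key Findings

The two-timescale scheme can jointly manage capacitors and inverters to reduce voltage deviations in distribution grids.
Given the current capacitor configuration, fast-timescale inverter setpoints can be computed either by an SOCP based on the exact AC model or by a quadratic program based on the linearized model.
Slow-timescale capacitor decisions are learned by a DRL method built on a deep Q-network, accounting for random generation and load.
The method uses a target network and experience replay to stabilize learning and improve convergence.
The framework is validated through extensive numerical tests with real data on a real-world 47-bus distribution network and the IEEE 123-bus test feeder.
The method addresses the curse of dimensionality in the capacitor action space with a hyper deep Q-network.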
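
As an illustration of the fast-timescale step under a linearized model, the sketch below solves a small box-constrained least-squares problem by projected gradient descent. Everything in it is an assumption made for the example: the model form v = v_base + sens · q, the sensitivity matrix `sens`, the function names, and the numbers. Here `v_base` is taken to already include the effect of the current capacitor configuration. Projected gradient descent is used only to keep the example dependency-free; an off-the-shelf QP or SOCP solver would normally take its place.

```python
def bus_voltages(v_base, sens, q):
    """Linearized voltages v = v_base + sens @ q, all in per unit."""
    return [v_base[i] + sum(s_ij * q_j for s_ij, q_j in zip(sens[i], q))
            for i in range(len(v_base))]


def inverter_setpoints(v_base, sens, q_max, v_nom=1.0, steps=500):
    """Minimize sum_i (v_i - v_nom)^2 subject to -q_max <= q <= q_max."""
    n_inv = len(q_max)
    # Step size 1 / (2 * ||sens||_F^2) is no larger than 1 / (gradient Lipschitz constant).
    frob_sq = sum(s * s for row in sens for s in row)
    step = 1.0 / (2.0 * frob_sq)
    q = [0.0] * n_inv
    for _ in range(steps):
        resid = [v - v_nom for v in bus_voltages(v_base, sens, q)]
        grad = [2.0 * sum(sens[i][j] * resid[i] for i in range(len(resid)))
                for j in range(n_inv)]
        # Gradient step, then projection onto the inverter capability box.
        q = [min(q_max[j], max(-q_max[j], q[j] - step * grad[j]))
             for j in range(n_inv)]
    return q


# Hypothetical two-bus, two-inverter example.
sens = [[0.05, 0.02], [0.02, 0.06]]   # assumed voltage sensitivity to reactive power
v_base = [0.97, 0.96]                 # voltages before inverter support
q_free = inverter_setpoints(v_base, sens, q_max=[1.0, 1.0])
q_tight = inverter_setpoints(v_base, sens, q_max=[0.1, 0.1])
print(bus_voltages(v_base, sens, q_free))
print(q_tight)
```

With generous limits the inverters can bring both voltages back to nominal; with tight limits the setpoints sit on the boundary of the box, which is the case where the constraint handling matters.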


This digest was generated by AI and reviewed by a human editor.