QUICK REVIEW

[Paper Summary] A Survey on Traffic Signal Control Methods

Hua Wei, Guanjie Zheng|arXiv (Cornell University)|Apr 17, 2019

Traffic control and management | References: 65 | Cited by: 177

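One-Sentence Summary

This survey reviews classical transportation-engineering methods and reinforcement learning (RL) based methods for traffic signal control, and discusses in detail the problem formulations, methods, and RL foundations for single-agent and multi-agent settings.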

ABSTRACT

Traffic signal control is an important and challenging real-world problem, which aims to minimize the travel time of vehicles by coordinating their movements at the road intersections. Current traffic signal control systems in use still rely heavily on oversimplified information and rule-based methods, although we now have richer data, more computing power and advanced methods to drive the development of intelligent transportation. With the growing interest in intelligent transportation using machine learning methods like reinforcement learning, this survey covers the widely acknowledged transportation approaches and a comprehensive list of recent literature on reinforcement learning for traffic signal control. We hope this survey can foster interdisciplinary research on this important topic.

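Motivation and Objectives

Summarize the traditional transportation-engineering methods for traffic signal control and the assumptions they make.
Highlight the limitations of rule-based and optimization-based methods when faced with real-world, dynamic traffic.
Introduce the reinforcement learning foundations of traffic signal control and compare single-agent and multi-agent formulations.
Provide guidance on data, state representation, and reward design for RL-based traffic signal control.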

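Proposed Approach

Review the classical methods (Webster, GreenWave, Maxband, Actuated, SOTL, Max-pressure, SCATS) together with their inputs, outputs, and constraints.
Describe the cycle-based timing, offset, and bandwidth concepts used to coordinate signals.
Present an RL framework for traffic signal control, including MDPs, Q-learning, and stochastic games for the multi-agent setting.
Explain how RL brings together states, actions, rewards, and transition dynamics in isolated-intersection and multi-intersection scenarios.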
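
To make the RL framing above concrete, here is a minimal sketch of one tabular Q-learning step for a single intersection. It is not taken from the survey: the two-phase action set, the queue-length state, and the negative-total-queue reward are illustrative assumptions, and the names `q_learning_update`, `epsilon_greedy`, and `ACTIONS` are hypothetical.

```python
import random
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Explore with probability epsilon, otherwise pick the best-known action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Assumed toy setup: the action is the phase that receives green next.
ACTIONS = ("NS_green", "EW_green")
q = defaultdict(float)          # unseen (state, action) pairs start at 0.0

state = (3, 5)                  # assumed state: (north-south queue, east-west queue)
next_state = (3, 2)             # queues observed after serving east-west
reward = -sum(next_state)       # assumed reward: negative total queue length

new_value = q_learning_update(q, state, "EW_green", reward, next_state, ACTIONS)
greedy_action = epsilon_greedy(q, state, ACTIONS, epsilon=0.0)
```

In the multi-agent setting each intersection could hold its own table of this kind; how agents observe or coordinate with neighbours is a separate design choice that this sketch does not cover.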

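Experimental Results

Research Questions

RQ1: What are the traditional optimization-based and rule-based traffic signal control methods, and what are their limitations?
RQ2: How can reinforcement learning be applied to single-intersection and multi-intersection traffic signal control, and what are the key design choices (state, action, reward)?
RQ3: What practical data sources and modeling considerations apply to RL-based traffic signal control?
RQ4: How is the multi-agent RL framework (stochastic games) applied to coordinated network-level traffic signal control?
RQ5: What benchmarks or comparisons exist between RL-based methods and classical transportation methods?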

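Key Findings

The survey maps each classical method to its data inputs and outputs, clarifying when fixed-time or actuated strategies apply.
GreenWave and Maxband illustrate cycle-length constraints and bandwidth-based coordinated signal progression.
Actuated and SOTL methods rely on requests and thresholds to adjust phases in real time, while Max-pressure improves network throughput by balancing queue-length pressure.
SCATS is described as a method built on predefined plans that iteratively selects among them based on performance metrics.
RL foundations are presented for single-agent and multi-agent traffic signal control, outlining the MDP and stochastic-game formulations and the role of the reward.
The paper stresses the need to combine richer mobility data with computing power to enable data-driven RL methods in traffic signal control.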
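
The pressure-balancing idea behind Max-pressure, mentioned in the findings above, can be sketched as follows. This is a simplified illustration under stated assumptions, not the survey's exact formulation: pressure of a movement is taken as the incoming-lane queue minus the outgoing-lane queue, a phase's pressure is the sum over the movements it permits, and the lane names, phase names, and queue counts are all hypothetical.

```python
def movement_pressure(queues, movement):
    """Pressure of one movement: incoming-lane queue minus outgoing-lane queue."""
    incoming, outgoing = movement
    return queues[incoming] - queues[outgoing]


def max_pressure_phase(queues, phases):
    """Return the name of the phase whose permitted movements have the largest total pressure."""
    def phase_pressure(name):
        return sum(movement_pressure(queues, m) for m in phases[name])
    return max(phases, key=phase_pressure)


# Hypothetical queue lengths (vehicles) on each approach and exit lane.
queues = {
    "N_in": 6, "S_in": 4, "E_in": 2, "W_in": 1,
    "N_out": 1, "S_out": 0, "E_out": 3, "W_out": 2,
}

# Hypothetical phases, each a list of (incoming lane, outgoing lane) movements.
phases = {
    "NS_through": [("N_in", "S_out"), ("S_in", "N_out")],
    "EW_through": [("E_in", "W_out"), ("W_in", "E_out")],
}

chosen = max_pressure_phase(queues, phases)
```

With these numbers the north-south phase has pressure (6 - 0) + (4 - 1) = 9 and the east-west phase has (2 - 2) + (1 - 3) = -2, so the north-south phase is selected. A deployed controller would also need minimum green times and a decision interval, which are omitted here.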

This summary was generated by AI and reviewed by a human editor.