QUICK REVIEW

[Paper Review] The complexity of all-switches strategy improvement

John Fearnley, Rahul Savani | arXiv (Cornell University) | Jan 10, 2016

Artificial Intelligence in Games | References: 38 | Cited by: 7

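One-sentence summary

This paper studies the computational complexity of all-switches strategy improvement on several classes of games: parity games, mean-payoff games, discounted-payoff games, and simple stochastic games. It proves that both the edge switch problem and the optimal strategy problem are PSPACE-complete, which bounds how hard it is to predict the behaviour of this widely used algorithm.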

ABSTRACT

Strategy improvement is a widely-used and well-studied class of algorithms for solving graph-based infinite games. These algorithms are parametrized by a switching rule, and one of the most natural rules is "all switches", which switches as many edges as possible in each iteration. Continuing a recent line of work, we study all-switches strategy improvement from the perspective of computational complexity. We consider two natural decision problems, both of which have as input a game G, a starting strategy s, and an edge e. The problems are: 1. The edge switch problem, namely, is the edge e ever switched by all-switches strategy improvement when it is started from s on game G? 2. The optimal strategy problem, namely, is the edge e used in the final strategy that is found by strategy improvement when it is started from s on game G? We show PSPACE-completeness of the edge switch problem and optimal strategy problem for the following settings: Parity games with the discrete strategy improvement algorithm of Voge and Jurdzinski; mean-payoff games with the gain-bias algorithm [11, 33]; and discounted-payoff games and simple stochastic games with their standard strategy improvement algorithms. We also show PSPACE-completeness of an analogous problem to edge switch for the bottom-antipodal algorithm for Acyclic Unique Sink Orientations on Cubes.

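Motivation and objectives

Analyze the computational complexity of all-switches strategy improvement for graph-based infinite games.
Decide whether a given edge is ever switched while the algorithm runs (the edge switch problem).
Decide whether a given edge is used in the final strategy that the algorithm finds (the optimal strategy problem).
Extend the complexity analysis to the bottom-antipodal algorithm for Acyclic Unique Sink Orientations on Cubes.
Establish PSPACE-completeness of both decision problems for parity games, mean-payoff games, discounted-payoff games, and simple stochastic games.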

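Proposed approach

Formalize all-switches strategy improvement as a procedure that, in each iteration, switches as many edges as possible.
Prove hardness by giving polynomial-time reductions from a known PSPACE-complete problem to the edge switch problem and to the optimal strategy problem.
Observe that both problems lie in PSPACE, because the algorithm can be simulated while storing only the current strategy, which has polynomial size; together with the hardness reductions this gives PSPACE-completeness.
Analyze the sequence of strategies visited in parity games under the discrete strategy improvement algorithm of Voge and Jurdzinski.
Carry the analysis over to mean-payoff games via the gain-bias algorithm, and to discounted-payoff games and simple stochastic games via their standard strategy improvement algorithms.
Apply similar techniques to the bottom-antipodal algorithm for Acyclic Unique Sink Orientations on Cubes to show PSPACE-completeness of an analogous problem.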
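
The all-switches rule and the two decision problems can be illustrated on a toy discounted-payoff game. The following is a minimal sketch under stated assumptions, not the construction used in the paper: the game, the vertex rewards, the discount factor 0.5, and every function name are hypothetical, and the best response of the minimizer is approximated by a fixed number of value-iteration sweeps rather than computed exactly.

```python
from typing import Dict, List, Set, Tuple

Vertex = str
Edge = Tuple[Vertex, Vertex]


def best_response_values(succ, owner, reward, sigma, beta, sweeps=200):
    """Value of Max strategy sigma when Min answers optimally.

    Approximated by value iteration (an assumption made for brevity):
    Max vertices follow sigma, Min vertices take the cheapest successor.
    """
    val = {v: 0.0 for v in succ}
    for _ in range(sweeps):
        val = {
            v: reward[v] + beta * (
                val[sigma[v]] if owner[v] == "max"
                else min(val[u] for u in succ[v])
            )
            for v in succ
        }
    return val


def all_switches(succ, owner, reward, sigma, beta=0.5, eps=1e-9):
    """Run all-switches strategy improvement for Max, starting from sigma.

    Returns the final strategy and the set of edges switched along the way.
    """
    sigma = dict(sigma)
    switched: Set[Edge] = set()
    while True:
        val = best_response_values(succ, owner, reward, sigma, beta)
        new_sigma = dict(sigma)
        for v in succ:
            if owner[v] != "max":
                continue
            best = max(succ[v], key=lambda u: val[u])
            # Switch at every Max vertex that has a strictly better successor.
            if val[best] > val[sigma[v]] + eps:
                new_sigma[v] = best
                switched.add((v, best))
        if new_sigma == sigma:  # no switchable edge is left
            return sigma, switched
        sigma = new_sigma


# Hypothetical toy game: g, h, t are self-loop sinks; c belongs to Min.
succ: Dict[Vertex, List[Vertex]] = {
    "a": ["b", "h", "t"], "b": ["c", "t"], "c": ["g", "h"],
    "g": ["g"], "h": ["h"], "t": ["t"],
}
owner = {"a": "max", "b": "max", "c": "min", "g": "max", "h": "max", "t": "max"}
reward = {"a": 0, "b": 7, "c": 0, "g": 8, "h": 4, "t": 0}
start = {"a": "t", "b": "t", "g": "g", "h": "h", "t": "t"}

final_sigma, switched_edges = all_switches(succ, owner, reward, start)

e: Edge = ("a", "h")
edge_switch_answer = e in switched_edges           # edge switch problem
optimal_strategy_answer = final_sigma[e[0]] == e[1]  # optimal strategy problem
```

On this toy instance the edge `("a", "h")` is switched in the first iteration and abandoned in the second, so the two problems have different answers for it. The simulation keeps only the current strategy and the set of switched edges in memory, which matches the intuition behind PSPACE membership; the hardness direction is what requires the reductions.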

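Results

Research questions

RQ1: Is the edge switch problem, which asks whether a given edge is ever switched during all-switches strategy improvement, PSPACE-complete?
RQ2: Is the optimal strategy problem, which asks whether a given edge appears in the final strategy, PSPACE-complete?
RQ3: Does PSPACE-completeness extend to an analogous problem for the bottom-antipodal algorithm on Acyclic Unique Sink Orientations on Cubes?
RQ4: Do the complexity results hold uniformly across parity games, mean-payoff games, discounted-payoff games, and simple stochastic games?
RQ5: Can the edges used during a run of all-switches strategy improvement be verified or predicted efficiently?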

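Key findings

For parity games with the discrete strategy improvement algorithm of Voge and Jurdzinski, the edge switch problem and the optimal strategy problem are both PSPACE-complete.
For mean-payoff games with the gain-bias algorithm, both problems are PSPACE-complete.
For discounted-payoff games and simple stochastic games with their standard strategy improvement algorithms, both problems are PSPACE-complete.
An analogous problem to edge switch is PSPACE-complete for the bottom-antipodal algorithm on Acyclic Unique Sink Orientations on Cubes.
The results hold across all of these game classes, which points to a fundamental computational barrier to predicting how edges behave under all-switches strategy improvement.
In the worst case, deciding whether even a single edge is switched, or is used in the final strategy, is PSPACE-hard, and so cannot be done in polynomial time unless P = PSPACE.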
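
For readers unfamiliar with the bottom-antipodal algorithm, the sketch below shows the jump rule as it is usually described: start at a given vertex of the cube and repeatedly jump to the antipodal vertex of the face spanned by the outgoing edges, stopping at the sink. This is an illustration under assumptions, not the paper's construction: the function name, the bitmask encoding of vertices, and the 2-dimensional example orientation are all hypothetical.

```python
from typing import Callable, List


def bottom_antipodal(outmap: Callable[[int], int], n: int, start: int = 0) -> List[int]:
    """Trace the vertices visited on an n-cube whose vertices are n-bit integers.

    outmap(v) is a bitmask of the coordinates whose edges leave v.
    Flipping all of those coordinates at once is the antipodal jump.
    """
    v = start
    path = [v]
    while outmap(v) != 0:       # a vertex with no outgoing edge is the sink
        v ^= outmap(v)
        path.append(v)
        if len(path) > 2 ** n:  # safety guard for inputs that are not valid orientations
            raise RuntimeError("orientation does not look like an AUSO")
    return path


# Hypothetical acyclic unique sink orientation of the 2-cube, sink at 0b01.
example_outmap = {0b00: 0b11, 0b01: 0b00, 0b10: 0b01, 0b11: 0b10}

trace = bottom_antipodal(example_outmap.__getitem__, n=2)
```

Here the run visits `0b00`, then `0b11`, then the sink `0b01`. A decision question about such a run, in the spirit of the edge switch problem, can be answered by inspecting the trace, but the trace itself may be exponentially long in the dimension.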


This review was generated by AI and checked by a human editor.