QUICK REVIEW

[Paper Review] On the Hardness and Inapproximability of Recognizing Wheeler Graphs

Daniel Gibney, Sharma V. Thankachan|arXiv (Cornell University)|Jan 1, 2019
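Algorithms and Data Compression | References: 19 | Cited by: 10
One-Sentence Summary

This paper establishes the computational intractability of recognizing Wheeler graphs, proving that recognition is NP-complete for every edge-label alphabet of size σ ≥ 2, even on DAGs. Its optimization variants, Wheeler Graph Violation (WGV) and Wheeler Subgraph (WS), are APX-hard and in APX (for constant σ), respectively, so WGV faces fundamental limits on efficient approximation while WS admits a constant-factor approximation. The paper also gives an exponential-time exact algorithm based on graph isomorphism and identifies a subclass solvable in polynomial time, highlighting the role of structural parameters in tractability.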

ABSTRACT

In recent years several compressed indexes based on variants of the Burrows-Wheeler transformation have been introduced. Some of these are used to index structures far more complex than a single string, as was originally done with the FM-index [Ferragina and Manzini, J. ACM 2005]. As such, there has been an increasing effort to better understand under which conditions such an indexing scheme is possible. This has led to the introduction of Wheeler graphs [Gagie et al., Theor. Comput. Sci., 2017]. Gagie et al. showed that de Bruijn graphs, generalized compressed suffix arrays, and several other BWT related structures can be represented as Wheeler graphs, and that Wheeler graphs can be indexed in a way which is space efficient. Hence, being able to recognize whether a given graph is a Wheeler graph, or being able to approximate a given graph by a Wheeler graph, could have numerous applications in indexing. Here we resolve the open question of whether there exists an efficient algorithm for recognizing if a given graph is a Wheeler graph. We present:

  • The problem of recognizing whether a given graph G = (V, E) is a Wheeler graph is NP-complete for any edge label alphabet of size σ ≥ 2, even when G is a DAG. This holds even on a restricted subset of graphs called d-NFAs for d ≥ 5. This is in contrast to recent results demonstrating the problem can be solved in polynomial time for d-NFAs where d ≤ 2. We also show the recognition problem can be solved in linear time for σ = 1.
  • There exists a 2^{e log σ + O(n + e)} time exact algorithm where n = |V| and e = |E|. This algorithm relies on graph isomorphism being computable in strictly sub-exponential time.
  • We define an optimization variant of the problem called Wheeler Graph Violation, abbreviated WGV, where the aim is to remove the minimum number of edges in order to obtain a Wheeler graph. We show WGV is APX-hard, even when G is a DAG, implying there exists a constant C ≥ 1 for which there is no C-approximation algorithm (unless P = NP). Also, conditioned on the Unique Games Conjecture, for all C ≥ 1, it is NP-hard to find a C-approximation.
  • We define the Wheeler Subgraph problem, abbreviated WS, where the aim is to find the largest subgraph which is a Wheeler graph (the dual of WGV). In contrast to WGV, we prove that the WS problem is in APX for σ = O(1).

The above findings suggest that most problems under this theme are computationally difficult. However, we identify a class of graphs for which the recognition problem is polynomial time solvable, raising the open question of which parameters determine this problem's difficulty.
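The hardness lies in finding a suitable vertex ordering, not in checking one. As an illustration, the sketch below verifies a candidate ordering against the Wheeler graph conditions as defined by Gagie et al.: vertices with in-degree 0 come first, and for any two edges (u, v) and (u', v') with labels a and a', a < a' forces v < v', while a = a' together with u < u' forces v ≤ v'. The function name `is_wheeler_order` and the `(u, v, label)` edge representation are assumptions made for this example and do not come from the paper.

```python
def is_wheeler_order(order, edges):
    """Check whether `order` (a list of all vertices) is a valid Wheeler ordering.

    `edges` is a list of (u, v, label) triples; labels must be mutually comparable.
    Hypothetical helper for illustration only; runs in O(n + e^2) time.
    """
    pos = {v: i for i, v in enumerate(order)}
    targets = {v for _, v, _ in edges}

    # Vertices with in-degree 0 must precede every vertex with an incoming edge.
    has_incoming = [node in targets for node in order]
    if has_incoming != sorted(has_incoming):
        return False

    # Compare every ordered pair of edges against the two label conditions.
    for u, v, a in edges:
        for u2, v2, a2 in edges:
            if a < a2 and not pos[v] < pos[v2]:
                return False
            if a == a2 and pos[u] < pos[u2] and not pos[v] <= pos[v2]:
                return False
    return True


path = [(0, 1, "a"), (1, 2, "a")]
print(is_wheeler_order([0, 1, 2], path))  # True
print(is_wheeler_order([0, 2, 1], path))  # False: successors are out of order
```

A polynomial-time check of this kind is what places recognition in NP; the results in the abstract concern the cost of finding such an ordering.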

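Motivation and Objectives

  • Determine the computational complexity of deciding whether a given directed edge-labeled graph is a Wheeler graph.
  • Analyze the approximability of the optimization variants: removing the minimum number of edges to obtain a Wheeler graph (WGV) and finding the largest Wheeler subgraph (WS).
  • Identify structural parameters that permit polynomial-time recognition of Wheeler graphs.
  • Develop exact exponential-time algorithms, based on graph isomorphism, for the recognition and optimization problems.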

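Proposed Methods

  • Proves NP-completeness of Wheeler graph recognition by a reduction from the Feedback Arc Set (FAS) problem: a graph is constructed from the FAS instance so that edge labels and a vertex ordering satisfying the Wheeler axioms exist if and only if the FAS instance is a yes-instance.
  • Proves that WGV is APX-hard by a reduction from FAS: for some constant C ≥ 1 there is no C-approximation algorithm unless P = NP, and under the Unique Games Conjecture it is NP-hard to find a C-approximation for every C ≥ 1.
  • Shows that WS is in APX for constant σ by constructing a linear-time Ω(1/σ)-approximation algorithm based on branchings, source vertices, and leveled planar layouts of arborescences.
  • Develops exact exponential-time algorithms for recognition, WGV, and WS by enumerating all possible encodings of Wheeler graphs with the same n, e, and σ and testing each for isomorphism with the input graph.
  • Exploits the space-efficient encoding of Wheeler graphs, which bounds the number of candidate encodings to enumerate; because graph isomorphism can be tested in strictly sub-exponential time, the isomorphism checks do not dominate the overall running time.
  • Uses the relationship between queue number and Wheeler graphs to show that recognition is solvable in O(n + e) time when σ = 1.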
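To make the recognition and WGV problems concrete on tiny inputs, here is a deliberately naive exhaustive search. It is not the paper's encoding-enumeration algorithm: it tries all n! vertex orderings, and for WGV all edge subsets by increasing size, so it is only a sketch for graphs with a handful of vertices. All function names and the `(u, v, label)` edge format are assumptions made for this example.

```python
from itertools import combinations, permutations


def is_wheeler_order(order, edges):
    """Check a candidate ordering against the Wheeler graph conditions."""
    pos = {v: i for i, v in enumerate(order)}
    targets = {v for _, v, _ in edges}
    has_incoming = [node in targets for node in order]
    if has_incoming != sorted(has_incoming):  # in-degree-0 vertices first
        return False
    for u, v, a in edges:
        for u2, v2, a2 in edges:
            if a < a2 and not pos[v] < pos[v2]:
                return False
            if a == a2 and pos[u] < pos[u2] and not pos[v] <= pos[v2]:
                return False
    return True


def find_wheeler_order(nodes, edges):
    """Brute-force recognition: return a valid ordering, or None if none exists."""
    for order in permutations(nodes):
        if is_wheeler_order(order, edges):
            return list(order)
    return None


def min_violation(nodes, edges):
    """Brute-force WGV: the fewest edge deletions that leave a Wheeler graph."""
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if find_wheeler_order(nodes, kept) is not None:
                return k
    # Never reached: the graph with no edges is always a Wheeler graph.


# A single-label DAG (complete bipartite, 2 sources and 2 sinks) that is not Wheeler.
nodes = ["u", "w", "a", "b"]
edges = [("u", "a", "x"), ("u", "b", "x"), ("w", "a", "x"), ("w", "b", "x")]
print(find_wheeler_order(nodes, edges))  # None
print(min_violation(nodes, edges))       # 1
```

The example graph is acyclic and uses one label, yet no ordering works: with u before w, the edge pairs force both a ≤ b and b ≤ a. If subgraph size is measured in edges, the WS optimum on the same input is e minus the WGV optimum, here 4 − 1 = 3.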

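Results

Research Questions

  • RQ1: For σ ≥ 2, is deciding whether a given graph is a Wheeler graph NP-complete?
  • RQ2: Can Wheeler graph recognition be solved in polynomial time on restricted graph classes such as d-NFAs with d ≥ 5?
  • RQ3: Does the Wheeler Graph Violation (WGV) problem admit a constant-factor approximation algorithm?
  • RQ4: For constant σ, is the Wheeler Subgraph (WS) problem in APX?
  • RQ5: Do Wheeler graph recognition or its optimization variants admit fixed-parameter tractable algorithms?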

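Key Findings

  • For every edge-label alphabet size σ ≥ 2, Wheeler graph recognition is NP-complete, even when the input graph is a DAG. The hardness persists on d-NFAs with d ≥ 5, in contrast to the polynomial-time solvable case d ≤ 2.
  • The optimization problem WGV (removing the minimum number of edges to obtain a Wheeler graph) is APX-hard, so there is a constant C ≥ 1 for which no C-approximation algorithm exists unless P = NP.
  • Under the Unique Games Conjecture, finding a C-approximation for WGV is NP-hard for every C ≥ 1.
  • The dual problem WS (finding the largest Wheeler subgraph) is in APX for constant σ; the paper provides a linear-time Ω(1/σ)-approximation algorithm.
  • When σ = 1, Wheeler graph recognition is solvable in O(n + e) time. With a single label, no two edges carry different labels, so only the source-first condition and the order-preserving condition on edge endpoints remain, which the paper handles through the connection to queue number.
  • The exact exponential-time algorithms for recognition, WGV, and WS run in 2^{e log σ + O(n+e)} time and rely on graph isomorphism checks to verify candidate encodings.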
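The form of the exact algorithm's running time can be sketched as a short calculation. The constant c and the exact encoding length B below are assumptions for illustration; the summary states only that Wheeler graphs admit a space-efficient encoding and that isomorphism testing is strictly sub-exponential.

```latex
\begin{align*}
\#\text{candidate encodings} &\le 2^{B},
  \qquad B = e\log\sigma + c\,(n+e) \ \text{bits for some constant } c \ \text{(assumed)},\\
T_{\mathrm{iso}}(n,e) &= 2^{o(n+e)}
  \qquad \text{(one strictly sub-exponential isomorphism test)},\\
T_{\mathrm{total}} &\le 2^{B}\cdot T_{\mathrm{iso}}(n,e)
  = 2^{\,e\log\sigma + c\,(n+e) + o(n+e)}
  = 2^{\,e\log\sigma + O(n+e)}.
\end{align*}
```

Under these assumptions the enumeration of candidates dominates the cost, and the isomorphism tests are absorbed into the O(n + e) term of the exponent.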

This review was generated by AI and reviewed by human editors.