QUICK REVIEW

[Paper Review] Practical and efficient quantum circuit synthesis and transpiling with Reinforcement Learning

David Kremer, V. Villar|arXiv (Cornell University)|May 21, 2024

Quantum Computing Algorithms and Architecture|Cited by 7

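One-Sentence Summary

The paper introduces reinforcement learning (RL) for synthesizing and routing quantum circuits, achieving near-optimal results on Clifford, Linear Function, and Permutation circuits, markedly reducing routing overhead, and running far faster than SAT solvers.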

ABSTRACT

This paper demonstrates the integration of Reinforcement Learning (RL) into quantum transpiling workflows, significantly enhancing the synthesis and routing of quantum circuits. By employing RL, we achieve near-optimal synthesis of Linear Function, Clifford, and Permutation circuits, up to 9, 11 and 65 qubits respectively, while being compatible with native device instruction sets and connectivity constraints, and orders of magnitude faster than optimization methods such as SAT solvers. We also achieve significant reductions in two-qubit gate depth and count for circuit routing up to 133 qubits with respect to other routing heuristics such as SABRE. We find the method to be efficient enough to be useful in practice in typical quantum transpiling pipelines. Our results set the stage for further AI-powered enhancements of quantum computing workflows.

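Motivation and Objectives

Motivate the integration of AI tools into quantum computing workflows to improve transpiling and circuit optimization.
Build a general RL framework that delivers near-optimal synthesis of Clifford, Linear Function, and Permutation circuits under device connectivity constraints.
Demonstrate RL-based circuit routing that reduces two-qubit gate depth and count while remaining computationally efficient.
Show that the RL approach is practical in real transpiling pipelines, and discuss scalability and integration with existing tools.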

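Proposed Method

Frame circuit synthesis as a sequential decision process in which an agent selects gates that reduce the target operator to the identity.
Train the RL agent with curriculum learning on progressively harder target operators, rewarding arrival at the identity and penalizing gate count and depth.
At inference time, select gates greedily, by sampling, or by top-k/top-p selection over the agent's output probabilities.
Represent Clifford circuits by their Clifford tableau (matrix only, phases ignored), and train a neural network to map the operator representation to gate actions under connectivity constraints.
Extend the RL framework to circuit routing by treating SWAPs as actions and optimizing layout and gate metrics.
Compare the RL-based approach against SAT solvers and heuristics, benchmarking on Clifford, Permutation, and Linear Function circuits as well as routing tasks.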
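
The inference strategies listed above (greedy, sampling, top-k, top-p) can be illustrated with a minimal Python sketch. It assumes the agent returns one probability per candidate gate; the function names and the example probability vector are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Optional, Sequence


def greedy(probs: Sequence[float]) -> int:
    """Pick the index of the most probable action."""
    return max(range(len(probs)), key=lambda i: probs[i])


def top_k_indices(probs: Sequence[float], k: int) -> List[int]:
    """Keep the k most probable actions, most probable first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


def top_p_indices(probs: Sequence[float], p: float) -> List[int]:
    """Keep the smallest prefix of actions whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return kept


def sample(probs: Sequence[float],
           candidates: Optional[Sequence[int]] = None,
           rng: Optional[random.Random] = None) -> int:
    """Sample an action, optionally restricted to a candidate subset.

    random.choices treats the weights as relative, so the restricted
    probabilities do not need to be renormalized by hand.
    """
    rng = rng or random.Random()
    if candidates is None:
        candidates = list(range(len(probs)))
    weights = [probs[i] for i in candidates]
    return rng.choices(list(candidates), weights=weights, k=1)[0]


# Hypothetical agent output over four candidate gates.
probs = [0.5, 0.3, 0.15, 0.05]
print(greedy(probs))                                   # most probable gate
print(sample(probs, top_k_indices(probs, 2)))          # top-k sampling
print(sample(probs, top_p_indices(probs, 0.9)))        # top-p sampling
```

Greedy selection is deterministic, while the sampled variants can be repeated to produce several candidate circuits from which the best one is kept.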

Figure 1: Diagram describing the RL-based circuit synthesis process.
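
As a rough illustration of the synthesis loop, the sketch below reduces a target permutation to the identity using SWAPs restricted to a linear coupling map. It is a toy under stated assumptions: `toy_policy` is a hand-written stand-in for the trained agent, and the reward values (`+1.0` on reaching the identity, `gate_penalty` per gate) are hypothetical, not the paper's.

```python
from typing import Callable, List, Sequence, Tuple

Swap = Tuple[int, int]


def line_edges(n: int) -> List[Swap]:
    """Allowed SWAP actions on a linear (path) coupling map of n qubits."""
    return [(i, i + 1) for i in range(n - 1)]


def apply_swap(perm: Sequence[int], edge: Swap) -> List[int]:
    """Return a copy of perm with the two positions in edge exchanged."""
    a, b = edge
    out = list(perm)
    out[a], out[b] = out[b], out[a]
    return out


def is_identity(perm: Sequence[int]) -> bool:
    return all(i == p for i, p in enumerate(perm))


def toy_policy(perm: Sequence[int], edges: Sequence[Swap]) -> Swap:
    """Stand-in for the trained agent (assumption): first out-of-order edge."""
    for a, b in edges:
        if perm[a] > perm[b]:
            return (a, b)
    raise ValueError("no improving action available")


def synthesize(target: Sequence[int],
               policy: Callable[[Sequence[int], Sequence[Swap]], Swap],
               max_steps: int = 100,
               gate_penalty: float = 0.01) -> Tuple[List[Swap], float]:
    """Apply policy-chosen SWAPs until the operator becomes the identity."""
    perm = list(target)
    edges = line_edges(len(perm))
    gates: List[Swap] = []
    reward = 0.0
    while not is_identity(perm):
        if len(gates) >= max_steps:
            raise RuntimeError("step budget exhausted before reaching identity")
        edge = policy(perm, edges)
        perm = apply_swap(perm, edge)
        gates.append(edge)
        reward -= gate_penalty      # penalty for every gate used
    reward += 1.0                   # reward for reaching the identity
    return gates, reward


gates, reward = synthesize([2, 0, 1], toy_policy)
print(gates, reward)
```

Because each SWAP is its own inverse, replaying `gates` in reverse order from the identity reproduces the target permutation, so the reversed list is a circuit for the target on this topology.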

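Experimental Results

Research Questions

RQ1: Can reinforcement learning synthesize near-optimal Clifford, Linear Function, and Permutation circuits under native device constraints and connectivity?
RQ2: How does RL-based circuit routing compare with existing routing heuristics such as SABRE in two-qubit gate depth and count?
RQ3: In terms of runtime and scalability, is the RL approach suitable for integration into real quantum transpiling pipelines?
RQ4: What are the performance and scalability limits of RL synthesis and routing when extended to more qubits and larger circuits?
RQ5: Can a single RL framework generalize across circuit families and connectivity graphs, or does it require topology-specific training?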

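Key Findings

RL synthesis achieves near-optimal CNOT count and depth for Clifford circuits under restricted connectivity (example: 7-qubit Cliffords with "H" connectivity).
For Permutation circuits, RL reaches 100% optimal SWAP count and depth in the benchmarks on the 8-L topology and on the 65-HH/27-HH variants, with runtimes markedly shorter than those of SAT solvers.
RL is near-optimal for Linear Function circuits up to 9 qubits, Clifford circuits up to 11 qubits, and Permutation circuits up to 65 qubits; routing is demonstrated up to 133 qubits.
When routing 8–10 qubit Quantum Volume circuits on linear connectivity, RL routing reduces CNOT depth by about 20% relative to BIP-based routing while keeping the CNOT count the same or slightly higher; RL routing with 8 iterations can beat the standard transpiler on both depth and gate count.
For 133-qubit circuits routed to the IBM Torino topology, general RL routing reduces CNOT depth by a further 40% or so and two-qubit gate count by about 10%, relative to the Qiskit SDK level-3 transpiler.
The method runs orders of magnitude faster than SAT-based optimization (for example, seconds versus hours) while clearly outperforming heuristics in quality, making it suitable for practical deployment in AI-enabled transpiling workflows.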
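
The findings above are stated in terms of two-qubit gate count and two-qubit depth. A minimal sketch of how these two metrics differ is given below, assuming a circuit is a plain list of qubit-index tuples. The representation, the function names, and the example numbers are illustrative assumptions, not the paper's tooling or data.

```python
from typing import List, Sequence, Tuple


def two_qubit_metrics(gates: Sequence[Tuple[int, ...]],
                      num_qubits: int) -> Tuple[int, int]:
    """Return (two-qubit gate count, two-qubit depth) for a gate list.

    Each two-qubit gate is placed in the earliest layer after the last
    two-qubit gate on either of its qubits; other gates are ignored.
    """
    level = [0] * num_qubits   # last two-qubit layer used on each qubit
    count = 0
    for qubits in gates:
        if len(qubits) != 2:
            continue           # single-qubit gates affect neither metric
        a, b = qubits
        layer = max(level[a], level[b]) + 1
        level[a] = level[b] = layer
        count += 1
    return count, max(level, default=0)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of a metric with respect to a baseline."""
    return (baseline - improved) / baseline


# Illustrative circuit on 4 qubits: two gates run in parallel, two do not.
circuit: List[Tuple[int, ...]] = [(0, 1), (2, 3), (1, 2), (0,), (0, 1)]
count, depth = two_qubit_metrics(circuit, num_qubits=4)
print(count, depth)

# Hypothetical depths, chosen only to show the arithmetic of a 40% reduction.
print(relative_reduction(baseline=10, improved=6))
```

Count measures the total number of two-qubit gates, whereas depth measures the longest chain of two-qubit gates that must run one after another, so a router can lower depth without lowering count.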

Figure 2: Training progress for Clifford synthesis on 7 qubits with “H” connectivity. The horizontal axis shows the progress of the training in terms of total number of steps taken (number of Cliffords “seen” by the model). The vertical axes on the different graphs represent how different quantities

This review was generated by AI and reviewed by a human editor.