QUICK REVIEW

[Paper Review] The pros and cons of using deep reinforcement learning or genetic algorithms to design control schemes for quantum state transfer on qubit chains

Sofía Perón Santana, Ariel Fiuri|arXiv (Cornell University)|Jan 9, 2026

Quantum Information and Cryptography | Citations: 0

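One-line summary

This paper compares genetic algorithms (GA) and deep reinforcement learning (DRL) for designing external controls that transfer quantum states along qubit chains. It concludes that GA can achieve fast, high-fidelity transfer, whereas DRL is robust to noise but struggles on long chains and can be computationally expensive.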

ABSTRACT

In recent years, control methods based on different optimization techniques have shed light on the possibilities of processing information in many quantum systems. When exploring the transmission of quantum states, faster transmission times are mandatory to avoid the deleterious effects of multiple sources of decoherence that spoil the transmission process. In particular, using Reinforcement Learning to devise sequences of step-wise external controls provides good transfer policies at short transmission times. We present two approaches to control the transmission of quantum states in qubit chains using external controls to force the dynamical evolution of the chain state. The first approach relies on the well-known Genetic Algorithm to generate a sequence of external controls, while the second approach uses a variant of Reinforcement Learning. The Genetic algorithm achieves excellent transmission fidelity at as short transmission times as Reinforcement Learning, surpassing the fidelities achieved by the latter method. Nevertheless, the Reinforcement Learning method offers robust control policies when the control pulses are noisy enough, owing to an imperfect timing of the pulses, deficient control devices, or other sources of phase decoherence. We present the regime where each method is best suited to control the transmission of arbitrary qubit states.

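Motivation and objectives

Motivates optimization-based control of quantum state transfer in qubit chains as a way to mitigate decoherence.
Compares GA and DRL approaches for generating sequences of external controls.
Characterizes performance under fluctuations and identifies the regime in which each method excels.
Provides guidance on whether to choose GA or DRL for fast and robust quantum state transfer.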

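Proposed method

Models the qubit chain with an XX Hamiltonian and piecewise-constant external fields h_i(t) that act as controls.
Represents a control sequence either as a chromosome (GA) or as the actions chosen by a Deep Q-Network policy within an MDP framework (DRL).
The GA evolves a population of control sequences, using the maximum transmission probability within a time window as the fitness.
Implements DRL as a Deep Q-Network (DQN) with a Q-network, a target network, and a replay memory to learn action-value estimates.
Compares per-site controls with a fixed action set and analyzes performance as a function of chain length.
Trains and validates DRL models under noisy and imperfect control conditions to assess robustness to fluctuations.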
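The GA step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the one-excitation subspace of a uniform chain (so the Hamiltonian is an N x N tridiagonal matrix), and the action set, the values of `J`, `tau`, and `field`, and the selection, crossover, and mutation rules are all hypothetical choices made for illustration.

```python
import numpy as np

def step_unitary(h, J=1.0, tau=0.15):
    """Propagator for one piecewise-constant step (one-excitation subspace)."""
    h = np.asarray(h, dtype=float)
    n = len(h)
    H = np.diag(h)                      # on-site control fields (assumed form)
    i = np.arange(n - 1)
    H[i, i + 1] = J                     # uniform nearest-neighbour coupling
    H[i + 1, i] = J
    w, V = np.linalg.eigh(H)            # H is real symmetric
    return (V * np.exp(-1j * w * tau)) @ V.conj().T

def fitness(chromosome, unitaries):
    """Maximum probability of finding the excitation on the last site."""
    n = unitaries[0].shape[0]
    psi = np.zeros(n, dtype=complex)
    psi[0] = 1.0                        # excitation starts on the first site
    best = 0.0
    for gene in chromosome:
        psi = unitaries[gene] @ psi
        best = max(best, abs(psi[-1]) ** 2)
    return best

def run_ga(n_qubits=4, n_steps=20, pop_size=30, generations=40,
           field=10.0, mutation_rate=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical action set: no field, or a field on exactly one site.
    actions = np.vstack([np.zeros(n_qubits), field * np.eye(n_qubits)])
    unitaries = [step_unitary(a) for a in actions]
    pop = rng.integers(len(actions), size=(pop_size, n_steps))
    history = []
    for _ in range(generations):
        fit = np.array([fitness(c, unitaries) for c in pop])
        pop = pop[np.argsort(fit)[::-1]]          # sort best first
        history.append(fit.max())
        elite = pop[: pop_size // 2]              # keep the top half unchanged
        children = []
        while len(children) < pop_size - len(elite):
            pa, pb = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, n_steps)        # one-point crossover
            child = np.concatenate([pa[:cut], pb[cut:]])
            mask = rng.random(n_steps) < mutation_rate
            child[mask] = rng.integers(len(actions), size=mask.sum())
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    fit = np.array([fitness(c, unitaries) for c in pop])
    return pop[np.argmax(fit)], float(fit.max()), history

best, best_fit, history = run_ga()
print(f"best max transfer probability: {best_fit:.3f}")
```

Because the top half of each generation is carried over unchanged, the best fitness in this sketch never decreases from one generation to the next. A DQN agent would replace the population loop with a learned policy that picks the same kind of action at each step.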

Figure 1: The cartoon in the figure depicts a system of $N$ qubits and its time evolution. The initial state, shown at the leftmost extreme of the cartoon, corresponds to a one-excitation quantum state. The step-wise evolution operator for a given interval, $U_{k}=U(\tau_{k})$ , acts over all the qu
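The step-wise evolution in the caption can be written out explicitly. The following is a sketch under assumed conventions, not the paper's exact notation: $\hbar = 1$, a uniform coupling $J$, the Hamiltonian restricted to the one-excitation subspace with $|i\rangle$ denoting the excitation on site $i$, and convention-dependent prefactors absorbed into $J$ and $h_i^{(k)}$ (the constant field value on site $i$ during step $k$).

```latex
\begin{align}
H_k &= J \sum_{i=1}^{N-1} \bigl( |i\rangle\langle i+1| + |i+1\rangle\langle i| \bigr)
      + \sum_{i=1}^{N} h_i^{(k)}\, |i\rangle\langle i| , \\
U_k &= U(\tau_k) = \exp\!\left(-\mathrm{i}\, H_k\, \tau_k\right), \\
P_{1 \to N} &= \bigl| \langle N |\, U_K U_{K-1} \cdots U_1 \,| 1 \rangle \bigr|^2 .
\end{align}
```

Here $P_{1 \to N}$ is the probability of finding the excitation on the last site after $K$ steps, which is the quantity a control sequence is meant to maximize.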

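Experimental results

Research questions

RQ1: How do GA-derived and DRL-derived control sequences compare in achieving high-fidelity quantum state transfer on uniform qubit chains?
RQ2: Which method performs better at short versus long transmission times, and under noisy or imperfect control conditions?
RQ3: Which regimes of chain length and control parameters favor or disfavor GA relative to DRL?
RQ4: How robust are the learned control policies to external fluctuations and hardware control variations?
RQ5: What is the computational-cost trade-off between GA and DRL for this problem?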

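Key findings

GA achieves excellent transmission fidelity at short transmission times, often matching or exceeding DRL performance.
GA with per-site controls outperforms the action set of Zhang et al. in both fidelity and robustness across chain lengths.
DRL (DQN) struggles to produce high-quality state transfer on long chains, although on short chains it approaches the quantum speed limit in some cases.
DRL policies trained in a fluctuating environment are robust to noise, but they tend to vary between training runs and can be computationally expensive.
For long chains, GA offers faster convergence and more reliable high-fidelity transfer, whereas DRL offers robustness at the price of computational load and consistency issues.
When fluctuations are present, DRL-trained policies can maintain performance, whereas GA sequences may degrade unless fluctuations are incorporated into training.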
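One way to probe the robustness claims above is to replay a fixed control sequence many times with perturbed fields and look at the spread of the transfer probability. The sketch below does this under assumptions that are not taken from the paper: independent Gaussian noise of width `sigma` on every field value, the one-excitation model of a uniform chain, and illustrative values of `J` and `tau`.

```python
import numpy as np

def max_transfer_probability(fields, J=1.0, tau=0.15):
    """Max end-site probability for a field sequence of shape (steps, n)."""
    fields = np.asarray(fields, dtype=float)
    n = fields.shape[1]
    psi = np.zeros(n, dtype=complex)
    psi[0] = 1.0                          # excitation starts on the first site
    i = np.arange(n - 1)
    best = 0.0
    for h in fields:
        H = np.diag(h)                    # on-site fields for this step
        H[i, i + 1] = J
        H[i + 1, i] = J
        w, V = np.linalg.eigh(H)
        psi = (V * np.exp(-1j * w * tau)) @ (V.conj().T @ psi)
        best = max(best, abs(psi[-1]) ** 2)
    return best

def noisy_average(fields, sigma, n_samples=200, seed=0):
    """Mean and standard deviation of the transfer probability under noise."""
    rng = np.random.default_rng(seed)
    fields = np.asarray(fields, dtype=float)
    probs = [
        # Assumed noise model: additive Gaussian error on every field value.
        max_transfer_probability(fields + sigma * rng.standard_normal(fields.shape))
        for _ in range(n_samples)
    ]
    return float(np.mean(probs)), float(np.std(probs))

# Toy sequence: a two-site chain evolving freely for ten steps.
fields = np.zeros((10, 2))
clean = max_transfer_probability(fields)
mean, std = noisy_average(fields, sigma=0.5)
print(f"noiseless: {clean:.4f}, noisy mean: {mean:.4f} +/- {std:.4f}")
```

Applying the same function to a GA-derived sequence and to the sequence produced by a trained DRL policy would give a like-for-like comparison of how each degrades as `sigma` grows.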

Figure 2: The cartoon in the Figure presents the main ingredients of the Genetic Algorithm. a) The sixteen possible actions, each of which can appear on a control sequence at any position in it. b) An initial population of four individuals, each one endowed with its own chromosome. The chromosome co

This review was generated by AI and checked by a human editor.