QUICK REVIEW

[Paper Review] End-To-End Latent Variational Diffusion Models for Inverse Problems in High Energy Physics

Alexander Shmakov, Kevin Thomas Greif | arXiv (Cornell University) | May 17, 2023

Generative Adversarial Networks and Image Synthesis | Citations: 23

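One-line summary

Introduces an end-to-end method based on Variational Latent Diffusion (VLD) that unfolds high-dimensional LHC detector observations to the true parton-level distributions, and shows improved distribution-level fidelity over the baselines.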

ABSTRACT

High-energy collisions at the Large Hadron Collider (LHC) provide valuable insights into open questions in particle physics. However, detector effects must be corrected before measurements can be compared to certain theoretical predictions or measurements from other detectors. Methods to solve this inverse problem of mapping detector observations to theoretical quantities of the underlying collision are essential parts of many physics analyses at the LHC. We investigate and compare various generative deep learning methods to approximate this inverse mapping. We introduce a novel unified architecture, termed latent variation diffusion models, which combines the latent learning of cutting-edge generative art approaches with an end-to-end variational framework. We demonstrate the effectiveness of this approach for reconstructing global distributions of theoretical kinematic quantities, as well as for ensuring the adherence of the learned posterior distributions to known physics constraints. Our unified approach achieves a distribution-free distance to the truth of over 20 times less than non-latent state-of-the-art baseline and 3 times less than traditional latent diffusion models.

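Motivation and objectives

Motivates unfolding (the inverse problem) in high energy physics and the need for an unbinned, high-dimensional mapping from detector-level data to truth-level data.
Proposes a unified, end-to-end Variational Latent Diffusion (VLD) framework that combines latent diffusion, variational autoencoders, and physics information.
Demonstrates, relative to the baselines, improved global distribution fidelity and physically consistent posteriors for semi-leptonic t tbar events.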

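Proposed method

Introduces Variational Latent Diffusion (VLD), which integrates a conditioning encoder, a conditional or unconditional VAE, and a diffusion process into a single objective.
Adopts a continuous-time, variance-preserving diffusion with a learnable noise schedule and a denoising network that predicts the original data.
Incorporates a physics-informed consistency loss so that predictions satisfy the mass-energy-momentum relation M^2 = E^2 - ||p||^2.
Explores end-to-end training variants: VLD, UC-VLD (unconditional decoder), and C-VLD (conditional encoder/decoder).
Uses a permutation-invariant jet transformer encoder, which conditions the latent space on the detector observations, together with a latent parton encoder/decoder.
Evaluates on semi-leptonic t tbar data with several distance metrics (Wasserstein, Energy, K-S, and KL with 64/128/256 bins).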
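
As a rough illustration of two ingredients listed above, the sketch below shows a variance-preserving noising step and a penalty on violations of M^2 = E^2 - ||p||^2. It is not the authors' implementation. The log-SNR parameterization (`gamma_linear`, with `alpha_t^2 = sigmoid(-gamma)` and `sigma_t^2 = sigmoid(gamma)`) is an assumption borrowed from common continuous-time diffusion formulations, the straight-line schedule merely stands in for the learnable one, and the squared-residual form of `mass_consistency_loss` is likewise assumed because this review does not give the exact loss. All function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gamma_linear(t, gamma_min=-10.0, gamma_max=10.0):
    """Hypothetical fixed schedule: negative log signal-to-noise ratio at t in [0, 1].
    The reviewed model learns this function; a straight line stands in for it here."""
    return gamma_min + (gamma_max - gamma_min) * t


def vp_coefficients(t):
    """Return (alpha_t, sigma_t) with alpha_t^2 + sigma_t^2 = 1 (variance preserving)."""
    g = gamma_linear(t)
    return math.sqrt(sigmoid(-g)), math.sqrt(sigmoid(g))


def vp_forward(x, t, eps):
    """Noise a latent vector x at time t: z_t = alpha_t * x + sigma_t * eps."""
    alpha, sigma = vp_coefficients(t)
    return [alpha * xi + sigma * ei for xi, ei in zip(x, eps)]


def mass_consistency_loss(particles):
    """Mean squared violation of M^2 = E^2 - ||p||^2.
    Each particle is a predicted (M, E, px, py, pz) tuple in natural units.
    The squared-residual form is an assumption, not taken from the paper."""
    total = 0.0
    for m, e, px, py, pz in particles:
        residual = m * m - (e * e - (px * px + py * py + pz * pz))
        total += residual * residual
    return total / len(particles)


# A prediction that satisfies the relation (3^2 = 5^2 - 4^2) and one that does not.
on_shell = [(3.0, 5.0, 0.0, 4.0, 0.0)]
off_shell = [(3.0, 5.0, 0.0, 3.0, 0.0)]
loss_on = mass_consistency_loss(on_shell)
loss_off = mass_consistency_loss(off_shell)
```

In a training loop, a term like this would be added to the main objective with some weight, so that decoded mass, energy, and momentum stay mutually consistent. The weighting is not specified in this review.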

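Experimental results

Research questions

RQ1 Is end-to-end variational latent diffusion effective at unfolding high-dimensional detector data to truth-level parton distributions?
RQ2 Does joint training of the conditioning encoder, VAE, and diffusion improve distribution-level fidelity and physical consistency compared with training the components separately?
RQ3 How much does the conditioning strategy (unconditional vs. conditional decoder) affect reconstruction quality and the realism of the posteriors?
RQ4 How do physics-informed constraints affect reconstruction stability and the consistency of derived quantities such as mass, energy, and momentum?
RQ5 How well does the proposed model scale to high-dimensional inverse problems beyond the semi-leptonic t tbar topology?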

Key findings

Model	Wasserstein	Energy	K-S	KL_64	KL_128	KL_256
VLD	108.76	7.59	4.08	3.47	3.74	4.53
UC-VLD	73.56	6.35	3.41	5.77	7.10	8.48
C-VLD	389.62	25.39	4.65	9.54	10.09	10.79
LDM	402.32	24.09	5.91	14.71	16.34	17.92
VDM	2478.35	181.35	17.14	29.28	32.29	35.60
CVAE	484.56	32.29	6.37	7.79	9.17	10.60
CINN	3009.08	185.13	15.74	28.55	30.19	32.37
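
The abstract's "over 20 times" and "3 times" figures can be compared against the table with simple ratios. The abstract does not say which metric or which VLD variant those figures refer to, so using the Wasserstein column is an assumption made here for illustration. The helper name `improvement_factor` is hypothetical.

```python
# Wasserstein column copied from the table above (lower is better).
wasserstein = {
    "VLD": 108.76,
    "UC-VLD": 73.56,
    "C-VLD": 389.62,
    "LDM": 402.32,
    "VDM": 2478.35,
    "CVAE": 484.56,
    "CINN": 3009.08,
}


def improvement_factor(baseline, model, scores=wasserstein):
    """How many times smaller the model's distance is than the baseline's."""
    return scores[baseline] / scores[model]


# Compare both end-to-end variants against the latent (LDM) and
# non-latent (VDM, CINN) baselines.
for baseline in ("LDM", "VDM", "CINN"):
    for model in ("VLD", "UC-VLD"):
        factor = improvement_factor(baseline, model)
        print(f"{baseline} / {model}: {factor:.1f}x")
```

On this column, VLD comes out roughly 23 times lower than VDM and about 3.7 times lower than LDM, which is consistent with the abstract's wording under the stated assumption.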

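The VLD variants achieve the best scores: UC-VLD is lowest on Wasserstein, Energy, and K-S, VLD is lowest on all three KL metrics, and both outperform every baseline (LDM, VDM, CVAE, CINN) on every metric.
The conditional-decoder variants (C-VLD, CVAE) tend to degrade reconstruction in this setting, suggesting that an unconditional decoder is more robust at inference time.
The latent diffusion variants (VLD, UC-VLD) outperform the non-latent approaches that operate directly on the data (CINN, VDM), and end-to-end training improves on the pre-trained LDM.
Posterior samples from VLD are smoother than the brute-force posterior, lie closer to the true parton configuration, and capture features such as the bimodal neutrino η distribution.
The physics-informed consistency loss improves stability and keeps predictions consistent with the mass-energy-momentum relation.
Aggregated over the 55 components, the distance metrics for VLD and UC-VLD are lower than those of the baselines, indicating better overall global distribution fidelity.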
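
For readers unfamiliar with the four metric families in the table, the following standard-library sketch computes each one for a single one-dimensional component. It follows textbook definitions and is not the evaluation code behind the table. How the paper bins, normalizes, or aggregates over the 55 components is not stated in this review, so the shared-range histogram and the `eps` smoothing in `histogram_kl` are assumptions.

```python
import bisect
import math
import random


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1D samples (sorted pairing)."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def energy_distance(xs, ys):
    """sqrt(2 E|X-Y| - E|X-X'| - E|Y-Y'|), estimated from all sample pairs."""
    def mean_abs(a, b):
        return sum(abs(u - v) for u in a for v in b) / (len(a) * len(b))
    d2 = 2 * mean_abs(xs, ys) - mean_abs(xs, xs) - mean_abs(ys, ys)
    return math.sqrt(max(d2, 0.0))


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    gap = 0.0
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        gap = max(gap, abs(fx - fy))
    return gap


def histogram_kl(xs, ys, bins=64, eps=1e-9):
    """KL(P_x || P_y) between histograms on a shared range; eps avoids log(0)."""
    lo, hi = min(xs + ys), max(xs + ys)
    width = (hi - lo) / bins or 1.0

    def hist(vals):
        counts = [0] * bins
        for v in vals:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        return [(c + eps) / (len(vals) + eps * bins) for c in counts]

    p, q = hist(xs), hist(ys)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Toy data: a "truth" sample and two shifted copies standing in for model output.
random.seed(0)
truth = [random.gauss(0.0, 1.0) for _ in range(300)]
close = [x + 0.1 for x in truth]
far = [x + 1.0 for x in truth]
```

Calling each function on `(truth, close)` and `(truth, far)` gives a feel for how the metrics respond to a growing mismatch. Repeating `histogram_kl` with `bins=64`, `128`, and `256` mirrors the three KL columns.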


This review was generated by AI and checked by a human editor.