QUICK REVIEW

[Paper Review] A robust anomaly finder based on autoencoders

Tuhin S. Roy, Aravind H. Vijay|arXiv (Cornell University)|Mar 5, 2019

Particle physics theoretical and experimental studies | References: 80 | Cited by: 67

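One-sentence summary

The paper proposes a robust anti-QCD jet tagger that combines an autoencoder with a novel jet pre-processing step: each jet is taken to a frame of reference in which its mass and energy are fixed to predetermined values, so that anomaly detection stays robust to jet mass across different regions of phase space.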

ABSTRACT

We propose a robust method to identify anomalous jets by vetoing QCD-jets. The robustness of this method ensures that the distribution of the proposed discriminating variable (which allows us to veto QCD-jets) remains unaffected by the phase space of QCD-jets, even if they were different from the region on which the model was trained. This suggests that our method can be used to look for anomalous jets in high m/p_T bins by simply training on jets from low m/p_T bins, where sufficient background-enriched data is available. The robustness follows from combining an autoencoder with a novel way of pre-processing jets. We use momentum rescaling followed by a Lorentz boost to find the frame of reference where any given jet is characterized by predetermined mass and energy. In this frame we generate jet images by constructing a set of orthonormal basis vectors using the Gram-Schmidt method to span the plane transverse to the jet axis. Due to our preprocessing, the autoencoder loss function does not depend on the initial jet mass, momentum or orientation while still offering remarkable performance. We also explore the application of this loss function combined (using supervised learning techniques like boosted decision trees) with few other jet observables like the mass and Nsubjettiness for the purpose of top tagging. This exercise shows that our method performs almost as well as existing top taggers which use a large amount of physics information associated with top decays while also reinforcing the fact that the loss function is mostly independent of the additional jet observables.
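
The rescaling and boost described in the abstract can be illustrated with a short sketch. It is not the authors' code: the function name, the (E, px, py, pz) tuple layout, and the default values of m0 and e0 are assumptions chosen for illustration. The boost is written as a rapidity shift along the jet axis, which is one standard way to reach a target energy at fixed mass.

```python
import math

def rescale_and_boost(constituents, m0=0.5, e0=1.0):
    """Return constituents (E, px, py, pz) in the frame where the jet
    has mass m0 and energy e0. m0 and e0 are illustrative placeholders."""
    if not e0 > m0 > 0:
        raise ValueError("need e0 > m0 > 0")
    # The jet four-momentum is the sum of its constituents.
    E, px, py, pz = (sum(c[i] for c in constituents) for i in range(4))
    p = math.sqrt(px * px + py * py + pz * pz)
    m = math.sqrt(max(E * E - p * p, 0.0))
    if m == 0.0 or p == 0.0:
        raise ValueError("jet needs nonzero mass and momentum")
    axis = (px / p, py / p, pz / p)

    # Step 1: rescale every four-momentum so that the jet mass becomes m0.
    s = m0 / m

    # Step 2: boost along the jet axis. A jet of mass m0 and energy E' has
    # rapidity acosh(E'/m0) along its own axis, and E'/m0 equals E/m here.
    dy = math.acosh(e0 / m0) - math.acosh(max(E / m, 1.0))
    ch, sh = math.cosh(dy), math.sinh(dy)

    boosted = []
    for c in constituents:
        e_i = s * c[0]
        p_i = [s * c[k] for k in (1, 2, 3)]
        p_long = sum(a * n for a, n in zip(p_i, axis))  # component along axis
        e_new = ch * e_i + sh * p_long
        shift = (sh * e_i + ch * p_long) - p_long  # change in that component
        boosted.append((e_new, *(a + shift * n for a, n in zip(p_i, axis))))
    return boosted
```

Summing the returned constituents gives a jet with energy e0 and mass m0 whatever the original jet's mass or momentum, which is the property the pre-processing relies on.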

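Motivation and goals

Motivate bottom-up anomaly searches on jet data as a way to look for physics beyond the Standard Model.
Develop an autoencoder-based anomaly finder that stays robust when the phase space of QCD jets changes.
Show that tailored jet pre-processing makes the autoencoder loss largely independent of jet mass and momentum.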

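Proposed method

Jets are pre-processed by rescaling their momenta to a fixed mass m0 and then Lorentz-boosting to a frame in which the jet has a fixed energy E0.
Jet images are built in the plane transverse to the jet axis, using an orthonormal basis from the Gram-Schmidt method, so that the image does not depend on the jet's orientation.
Fully connected (dense) and convolutional autoencoders are trained to reconstruct the jet images, and the reconstruction loss serves as the anomaly score.
The final layer is a SoftMax, which preserves the energy normalization of the image, and the loss is the L2 distance between the input and its reconstruction.
Robustness is benchmarked by training on one rho bin and testing on the others, using 13 TeV and 100 TeV simulations with and without detector effects.
Optionally, the autoencoder loss is combined with jet observables such as mass and N-subjettiness for top tagging.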
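
The image-building and scoring steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: seeding Gram-Schmidt with the jet axis and the two hardest constituents, projecting unit momentum directions onto the transverse basis, the 9x9 pixel grid, and the use of the unsquared Euclidean distance are all choices made here for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, eps=1e-12):
    """Orthonormalize the vectors in order (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            coeff = dot(w, e)
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm < eps:
            raise ValueError("seed vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

def jet_image(constituents, n_pix=9):
    """Flattened n_pix x n_pix image of energy fractions, summing to 1.
    constituents: (E, px, py, pz) tuples. n_pix is odd so that the
    origin falls inside a pixel rather than on a pixel edge."""
    axis = [sum(c[k] for c in constituents) for k in (1, 2, 3)]
    hardest = sorted(constituents, key=lambda c: c[0], reverse=True)
    # Assumed seeds: jet axis, then the two most energetic constituents.
    _, e2, e3 = gram_schmidt([axis, hardest[0][1:], hardest[1][1:]])
    total_e = sum(c[0] for c in constituents)
    image = [0.0] * (n_pix * n_pix)
    for e_i, px, py, pz in constituents:
        p = math.sqrt(px * px + py * py + pz * pz)
        x = dot((px, py, pz), e2) / p  # transverse coordinates in [-1, 1]
        y = dot((px, py, pz), e3) / p
        ix = min(int((x + 1.0) / 2.0 * n_pix), n_pix - 1)
        iy = min(int((y + 1.0) / 2.0 * n_pix), n_pix - 1)
        image[iy * n_pix + ix] += e_i / total_e
    return image

def softmax(logits):
    """Non-negative outputs that sum to 1, like the normalized image."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def l2_loss(image, reconstruction):
    """Euclidean distance between an image and its reconstruction."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(image, reconstruction)))
```

Because both the input image and a softmax output sum to 1, the distance compares how energy is distributed across pixels rather than how much energy there is. In a trained model the `logits` would come from the autoencoder's last layer; here they are just a list of numbers.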

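Experimental results

Research questions

RQ1: Can an autoencoder trained on QCD jets from one momentum bin effectively veto QCD jets in other bins without distorting the shape of the jet mass distribution?
RQ2: Does the proposed pre-processing (fixing mass and energy, plus the Gram-Schmidt transverse basis) decouple the autoencoder loss from jet mass and from rho = mJ/(pT R)?
RQ3: How do dense and convolutional autoencoders differ in robustness and in identifying top jets and W jets as anomalies?
RQ4: Is the method competitive with physics-based taggers, either alone or combined with jet observables?
RQ5: Can the method identify anomalies from new-physics scenarios, such as two hadronically decaying W bosons inside a single jet?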
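
To make the rho variable in RQ2 concrete, here is a minimal sketch that computes rho = mJ/(pT R) from a jet four-momentum and sorts jets into rho bins. It assumes R is the jet radius parameter; the function names and the bin edges passed in are hypothetical and not taken from the paper.

```python
import math

def rho(jet, radius):
    """rho = m_J / (p_T * R) for a jet given as (E, px, py, pz)."""
    e, px, py, pz = jet
    mass = math.sqrt(max(e * e - (px * px + py * py + pz * pz), 0.0))
    pt = math.hypot(px, py)
    return mass / (pt * radius)

def split_by_rho(jets, radius, edges):
    """Group jets into bins [edges[k], edges[k+1]); out-of-range jets are dropped."""
    bins = [[] for _ in range(len(edges) - 1)]
    for jet in jets:
        value = rho(jet, radius)
        for k in range(len(edges) - 1):
            if edges[k] <= value < edges[k + 1]:
                bins[k].append(jet)
                break
    return bins
```

With such a split, a model can be trained on jets from one bin and evaluated on the others, which is the setup the research questions refer to.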

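Key findings

Pre-processing brings substantial robustness: the autoencoder loss depends only weakly on jet mass across rho bins.
Among the architectures tested, the convolutional autoencoder gives the most robust performance.
Used alone, the method tags anomalies about as well as existing methods; combining it with N-subjettiness markedly improves top tagging.
Training on low-rho QCD jets makes anomaly searches in high-rho regions more effective, even when background purity is limited.
Combined with jet mass or N-subjettiness, it approaches or exceeds dedicated top taggers in some configurations.
The method remains effective under detector effects and multiple-parton-interaction (MPI) variations, and the loss stays robust across these scenarios.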
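
One simple way to picture the robustness claim is to fix a loss threshold on QCD jets from the training bin and check what fraction of QCD jets from another bin exceeds it. The sketch below does exactly that. The helper names are hypothetical and the loss values are made-up numbers for illustration, not results from the paper.

```python
def threshold_for_background_efficiency(losses, efficiency):
    """Loss value that roughly the top `efficiency` fraction of losses reach."""
    ordered = sorted(losses, reverse=True)
    k = max(1, round(efficiency * len(ordered)))
    return ordered[k - 1]

def fraction_above(losses, threshold):
    """Fraction of jets whose loss is at or above the threshold."""
    return sum(value >= threshold for value in losses) / len(losses)

# Made-up QCD losses in the training bin and in a different rho bin.
train_bin = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
other_bin = [0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95, 1.05]

cut = threshold_for_background_efficiency(train_bin, 0.2)
mistag_train = fraction_above(train_bin, cut)
mistag_other = fraction_above(other_bin, cut)
```

If the loss distribution really is insensitive to the QCD phase space, `mistag_other` should stay close to `mistag_train` when the same cut is reused.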

This review was generated by AI and reviewed by a human editor.