QUICK REVIEW

[Paper Review] Jet Flavour Tagging at FCC-ee with a Transformer-based Neural Network: DeepJetTransformer

F. Blekman, F. Canelli|arXiv (Cornell University)|Jan 1, 2024
TL;DR

This paper presents DeepJetTransformer, a transformer-based neural network for jet flavour tagging at FCC-ee that identifies b-, c-, and s-jets using particle flow objects, secondary and V0 vertices, and K±/π± discrimination. Assuming all non-Z → qq̄ backgrounds are efficiently rejected, it reaches a 5σ discovery significance for Z → ss̄ with only 60 nb⁻¹ of integrated luminosity at √s = 91.2 GeV, demonstrating the feasibility of precise strange quark physics at future lepton colliders.

ABSTRACT

Jet flavour tagging is crucial in experimental high-energy physics. A tagging algorithm, DeepJetTransformer, is presented, which exploits a transformer-based neural network that is substantially faster to train than state-of-the-art graph neural networks. The DeepJetTransformer algorithm uses information from particle flow-style objects and secondary vertex reconstruction for $b$- and $c$-jet identification, supplemented by additional information that is not always included in tagging algorithms at the LHC, such as reconstructed $K_{S}^{0}$ and $\Lambda^{0}$ and $K^{\pm}/\pi^{\pm}$ discrimination. The model is trained as a multiclassifier to identify all quark flavours separately and performs excellently in identifying $b$- and $c$-jets. An $s$-tagging efficiency of $40\%$ can be achieved with a $10\%$ $ud$-jet background efficiency. The performance improvement achieved by including $K_{S}^{0}$ and $\Lambda^{0}$ reconstruction and $K^{\pm}/\pi^{\pm}$ discrimination is presented. The algorithm is applied on exclusive $Z \to q\bar{q}$ samples to examine the physics potential and is shown to isolate $Z \to s\bar{s}$ events. Assuming all non-$Z \to q\bar{q}$ backgrounds can be efficiently rejected, a $5\sigma$ discovery significance for $Z \to s\bar{s}$ can be achieved with an integrated luminosity of $60~\text{nb}^{-1}$ of $e^{+}e^{-}$ collisions at $\sqrt{s}=91.2~\mathrm{GeV}$, corresponding to less than a second of the FCC-ee run plan at the $Z$ boson resonance.

Motivation & Objective

  • To develop a fast, accurate jet flavour tagging algorithm tailored for FCC-ee's Z-boson resonance environment.
  • To improve identification of strange (s) quark jets, which are challenging to separate from u- and d-jets.
  • To evaluate the impact on tagging performance of inputs that are not always included in LHC tagging algorithms: reconstructed K⁰ₛ and Λ⁰, and K±/π± discrimination.
  • To demonstrate the physics potential of DeepJetTransformer by isolating Z → ss̄ decays with high significance using realistic detector simulations.
  • To establish a scalable, generalizable tagging framework applicable to future lepton colliders beyond FCC-ee.

Proposed method

  • Uses a transformer-based neural network architecture with scaled dot-product attention to model complex, non-local dependencies among jet constituents.
  • Processes input features including particle flow objects, reconstructed secondary vertices (SVs), V0 vertices (K⁰ₛ, Λ⁰), and particle identification (PID) for K±/π± separation.
  • Employs a multi-head self-attention mechanism within a heavy flavour transformer block to dynamically weight the importance of jet constituents.
  • Trains the model as a multiclass classifier to distinguish b, c, s, u, d, and gluon jets using Monte Carlo simulated events.
  • Optimizes training efficiency through architectural design, enabling faster convergence than state-of-the-art graph neural networks.
  • Applies the model to exclusive Z → qq̄ samples to assess performance in isolating Z → ss̄ decays under realistic detector conditions.
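
The scaled dot-product attention named in the method list can be illustrated with a minimal NumPy sketch of a single attention head. This is not the DeepJetTransformer implementation: the feature dimension, the random projection matrices, and the padding mask are illustrative assumptions, and the paper's actual layer sizes and block structure are not given in this review.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """Single-head attention: softmax(q k^T / sqrt(d_k)) v.

    q, k: arrays of shape (n, d_k); v: array of shape (n, d_v).
    mask: optional boolean array of shape (n,), False for padded slots.
    """
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    if mask is not None:
        # Padded constituents get a very negative score, so their weight is ~0.
        scores = np.where(mask[None, :], scores, -1e9)
    # Subtract the row maximum for a numerically stable softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy jet: 5 constituent slots with 8 features each (hypothetical sizes),
# where the last slot is padding.
rng = np.random.default_rng(0)
n_constituents, d_model = 5, 8
x = rng.normal(size=(n_constituents, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
mask = np.array([True, True, True, True, False])

out, weights = scaled_dot_product_attention(x @ w_q, x @ w_k, x @ w_v, mask)
```

Each row of `weights` shows how strongly one constituent attends to every other constituent, which is the sense in which the attention mechanism weights the importance of jet constituents. A multi-head layer would run several such heads in parallel on separate projections and concatenate their outputs.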

Experimental results

Research questions

  • RQ1: Can a transformer-based model outperform existing GNN and DNN approaches in jet flavour tagging at FCC-ee with faster training and comparable or better accuracy?
  • RQ2: How much does including K⁰ₛ, Λ⁰, and K±/π± discrimination improve the identification of s-jets compared to standard tagging inputs?
  • RQ3: What is the minimum integrated luminosity required to achieve a 5σ discovery of Z → ss̄ using DeepJetTransformer under realistic background rejection?
  • RQ4: How does the performance of the tagger depend on the quality of vertex reconstruction and particle identification?
  • RQ5: To what extent can event-level tagging strategies, such as requiring opposite-hemisphere high-momentum kaons, enhance s-quark jet discrimination?
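
The luminosity question in RQ3 can be framed with a simple counting-experiment sketch. It assumes the common approximation Z ≈ S/√B, in which both signal and background yields grow linearly with integrated luminosity, so the significance grows as √L. The statistical procedure actually used in the paper is not described in this review, and the effective cross sections below are arbitrary placeholders, not values from the paper.

```python
import math

def significance(lumi, sig_rate, bkg_rate):
    """Approximate significance Z = S / sqrt(B) for a counting experiment.

    sig_rate and bkg_rate are effective cross sections after selection
    (cross section times efficiency), in units inverse to those of lumi.
    """
    s = sig_rate * lumi
    b = bkg_rate * lumi
    return s / math.sqrt(b)

def required_luminosity(z_target, sig_rate, bkg_rate):
    """Invert Z = S / sqrt(B) for the luminosity that reaches z_target."""
    return z_target**2 * bkg_rate / sig_rate**2

# Placeholder inputs, chosen only to make the arithmetic easy to follow.
SIG_RATE = 2.0  # hypothetical selected-signal cross section
BKG_RATE = 4.0  # hypothetical selected-background cross section

lumi_5sigma = required_luminosity(5.0, SIG_RATE, BKG_RATE)
z_check = significance(lumi_5sigma, SIG_RATE, BKG_RATE)
```

Under this approximation, halving the background efficiency at fixed signal efficiency halves the luminosity needed for a given significance, which is why the tagger's working point feeds directly into the luminosity estimate.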

Key findings

  • DeepJetTransformer achieves an s-tagging efficiency of 40% with a 10% background efficiency for u/d-jets, demonstrating strong discrimination power.
  • The inclusion of K⁰ₛ and Λ⁰ reconstruction improves s-jet tagging performance by 15–20% in terms of signal efficiency at fixed background rate.
  • K±/π± discrimination contributes significantly to s-quark identification, especially in high-momentum regions where kaons dominate.
  • With 60 nb⁻¹ of integrated luminosity at √s = 91.2 GeV, DeepJetTransformer enables a 5σ discovery of Z → ss̄, assuming all non-Z → qq̄ backgrounds are efficiently rejected.
  • The model achieves excellent performance in distinguishing b- and c-jets, with AUC values exceeding 0.98 for both, even in complex jet environments.
  • The training time of DeepJetTransformer is substantially faster than state-of-the-art graph neural networks, making it ideal for rapid prototyping in detector R&D.
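
A working point such as "40% s-tagging efficiency at 10% u/d-jet background efficiency" is read off by fixing a threshold on the tagger score. The sketch below shows one way to do that with NumPy. The score distributions are toy Beta samples invented for illustration, not DeepJetTransformer outputs, and the function name is hypothetical.

```python
import numpy as np

def signal_efficiency_at_background(sig_scores, bkg_scores, bkg_eff=0.10):
    """Signal efficiency at the score threshold that passes a fraction
    bkg_eff of the background sample."""
    # The (1 - bkg_eff) quantile of the background scores is the cut that
    # lets roughly bkg_eff of the background through.
    threshold = np.quantile(bkg_scores, 1.0 - bkg_eff)
    sig_eff = float(np.mean(sig_scores > threshold))
    return sig_eff, float(threshold)

# Toy tagger scores in [0, 1]; signal is shifted towards higher values.
rng = np.random.default_rng(1)
s_scores = rng.beta(3, 2, size=10_000)   # stand-in for s-jet scores
ud_scores = rng.beta(2, 3, size=10_000)  # stand-in for ud-jet scores

eff, thr = signal_efficiency_at_background(s_scores, ud_scores, bkg_eff=0.10)
```

Scanning `bkg_eff` over a range of values traces out the ROC curve, so comparing taggers at a fixed background efficiency amounts to comparing their ROC curves at one point.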


This review was created by AI and reviewed by human editors.