[Paper Review] Jet Flavour Tagging at FCC-ee with a Transformer-based Neural Network: DeepJetTransformer
This paper presents DeepJetTransformer, a transformer-based neural network for jet flavour tagging at FCC-ee, which achieves high performance in identifying b-, c-, and s-jets using particle flow objects, secondary and V0 vertices, and K±/π± discrimination. It enables a 5σ discovery of Z → ss̄ with only 60 nb⁻¹ of integrated luminosity at √s = 91.2 GeV, demonstrating the feasibility of precise strange quark physics at future lepton colliders.
Jet flavour tagging is crucial in experimental high-energy physics. A tagging algorithm, DeepJetTransformer, is presented, which exploits a transformer-based neural network that is substantially faster to train than state-of-the-art graph neural networks. The DeepJetTransformer algorithm uses information from particle flow-style objects and secondary vertex reconstruction for $b$- and $c$-jet identification, supplemented by additional information that is not always included in tagging algorithms at the LHC, such as reconstructed $K_{S}^{0}$ and $\Lambda^{0}$ and $K^{\pm}/\pi^{\pm}$ discrimination. The model is trained as a multiclassifier to identify all quark flavours separately and performs excellently in identifying $b$- and $c$-jets. An $s$-tagging efficiency of $40\%$ can be achieved with a $10\%$ $ud$-jet background efficiency. The performance improvement achieved by including $K_{S}^{0}$ and $\Lambda^{0}$ reconstruction and $K^{\pm}/\pi^{\pm}$ discrimination is presented. The algorithm is applied on exclusive $Z \to q\bar{q}$ samples to examine the physics potential and is shown to isolate $Z \to s\bar{s}$ events. Assuming all non-$Z \to q\bar{q}$ backgrounds can be efficiently rejected, a $5\sigma$ discovery significance for $Z \to s\bar{s}$ can be achieved with an integrated luminosity of $60~\text{nb}^{-1}$ of $e^{+}e^{-}$ collisions at $\sqrt{s}=91.2~\mathrm{GeV}$, corresponding to less than a second of the FCC-ee run plan at the $Z$ boson resonance.
Motivation & Objective
- To develop a fast, accurate jet flavour tagging algorithm tailored for FCC-ee's Z-boson resonance environment.
- To improve identification of strange (s) quark jets, which are challenging to tag due to low multiplicity and short-lived decay products.
- To evaluate how much tagging performance improves when inputs that LHC taggers do not always include are added: reconstructed K⁰ₛ and Λ⁰, and K±/π± discrimination.
- To demonstrate the physics potential of DeepJetTransformer by isolating Z → ss̄ decays with high significance using realistic detector simulations.
- To establish a scalable, generalizable tagging framework applicable to future lepton colliders beyond FCC-ee.
Proposed method
- Uses a transformer-based neural network architecture with scaled dot-product attention to model complex, non-local dependencies among jet constituents.
- Processes input features including particle flow objects, reconstructed secondary vertices (SVs), V0 vertices (K⁰ₛ, Λ⁰), and particle identification (PID) for K±/π± separation.
- Employs a multi-head self-attention mechanism within a heavy flavour transformer block to dynamically weight the importance of jet constituents.
- Trains the model as a multiclass classifier to distinguish b, c, s, u, d, and gluon jets using Monte Carlo simulated events.
- Optimizes training efficiency through architectural design, enabling faster convergence than state-of-the-art graph neural networks.
- Applies the model to exclusive Z → qq̄ samples to assess performance in isolating Z → ss̄ decays under realistic detector conditions.
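To illustrate the scaled dot-product attention named in the method above, here is a minimal NumPy sketch of a single self-attention step over jet constituents. It is not the paper's implementation: the function name, the toy dimensions, and the padding-mask convention are assumptions made for this example, and the learned query/key/value projections and multiple heads are omitted.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(q k^T / sqrt(d_k)) v over jet constituents.

    q, k, v: arrays of shape (n_constituents, d_k).
    mask: optional boolean array of shape (n_constituents,), True for real
          constituents and False for zero-padding (assumed convention).
    """
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    if mask is not None:
        # Padded constituents get a very negative score, so their
        # attention weight vanishes after the softmax.
        scores = np.where(mask[None, :], scores, -1e9)
    # Numerically stable softmax along the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy jet: 5 constituent slots with 8 features each (hypothetical sizes),
# of which the last two slots are padding.
rng = np.random.default_rng(0)
n_constituents, d_model = 5, 8
x = rng.normal(size=(n_constituents, d_model))
mask = np.array([True, True, True, False, False])

# Self-attention: queries, keys and values all come from the same inputs.
out, weights = scaled_dot_product_attention(x, x, x, mask)
```

Each row of `weights` sums to one and gives how strongly one constituent attends to every other, which is the sense in which attention weights the importance of jet constituents dynamically.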
Experimental results
Research questions
- RQ1: Can a transformer-based model outperform existing GNN and DNN approaches in jet flavour tagging at FCC-ee with faster training and comparable or better accuracy?
- RQ2: How much does including K⁰ₛ, Λ⁰, and K±/π± discrimination improve the identification of s-jets compared to standard tagging inputs?
- RQ3: What is the minimum integrated luminosity required to achieve a 5σ discovery of Z → ss̄ using DeepJetTransformer under realistic background rejection?
- RQ4: How does the performance of the tagger depend on the quality of vertex reconstruction and particle identification?
- RQ5: To what extent can event-level tagging strategies, such as requiring opposite-hemisphere high-momentum kaons, enhance s-quark jet discrimination?
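The luminosity question (RQ3) can be made concrete with a simple counting-experiment approximation. The sketch below assumes the significance is estimated as S/√B, so that it grows with the square root of the integrated luminosity; the paper may use a different estimator. The function names and the yields in the example are hypothetical placeholders, not values from the paper.

```python
import math

def significance(n_signal, n_background):
    """Approximate significance S / sqrt(B) for a counting experiment."""
    return n_signal / math.sqrt(n_background)

def luminosity_for_target(lumi_ref, z_ref, z_target=5.0):
    """Luminosity needed to reach z_target, given significance z_ref at lumi_ref.

    Signal and background yields both scale linearly with luminosity, so
    S / sqrt(B) scales as sqrt(luminosity).
    """
    return lumi_ref * (z_target / z_ref) ** 2

# Hypothetical yields after tagging, for a reference luminosity of 10 nb^-1.
lumi_ref = 10.0
n_sig, n_bkg = 50.0, 400.0

z_ref = significance(n_sig, n_bkg)                # 50 / 20 = 2.5
lumi_5sigma = luminosity_for_target(lumi_ref, z_ref)

# Cross-check: scaling both yields to the new luminosity gives 5 sigma.
scale = lumi_5sigma / lumi_ref
z_check = significance(n_sig * scale, n_bkg * scale)
```

With these placeholder yields, doubling the significance from 2.5 to 5 requires four times the luminosity, which is the scaling behind any statement of the form "5σ with N nb⁻¹".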
Key findings
- DeepJetTransformer achieves an s-tagging efficiency of 40% with a 10% background efficiency for u/d-jets, demonstrating strong discrimination power.
- The inclusion of K⁰ₛ and Λ⁰ reconstruction improves s-jet tagging performance by 15–20% in terms of signal efficiency at fixed background rate.
- K±/π± discrimination contributes significantly to s-quark identification, especially in high-momentum regions where kaons dominate.
- With 60 nb⁻¹ of integrated luminosity at √s = 91.2 GeV, DeepJetTransformer enables a 5σ discovery of Z → ss̄, assuming all non-Z → qq̄ backgrounds are efficiently rejected.
- The model achieves excellent performance in distinguishing b- and c-jets, with AUC values exceeding 0.98 for both, even in complex jet environments.
- The training time of DeepJetTransformer is substantially faster than state-of-the-art graph neural networks, making it ideal for rapid prototyping in detector R&D.
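A working point such as "40% s-tagging efficiency at 10% ud-jet background efficiency" is read off the tagger's score distributions by fixing the background efficiency and measuring the signal efficiency at the corresponding threshold. The sketch below shows that procedure on toy scores; the function name and the Beta-distributed scores are assumptions for illustration and do not reproduce the paper's numbers.

```python
import numpy as np

def signal_efficiency_at_background(sig_scores, bkg_scores, bkg_eff=0.10):
    """Return (signal efficiency, threshold) at a fixed background efficiency.

    The threshold is chosen so that a fraction bkg_eff of the background
    jets have a tagger score above it.
    """
    threshold = np.quantile(bkg_scores, 1.0 - bkg_eff)
    sig_eff = float(np.mean(sig_scores > threshold))
    return sig_eff, float(threshold)

# Toy tagger scores in [0, 1]: signal peaks higher than background.
# These distributions are hypothetical stand-ins for s-jet and ud-jet scores.
rng = np.random.default_rng(1)
s_jet_scores = rng.beta(3.0, 2.0, size=10_000)
ud_jet_scores = rng.beta(2.0, 3.0, size=10_000)

sig_eff, threshold = signal_efficiency_at_background(
    s_jet_scores, ud_jet_scores, bkg_eff=0.10
)

# Fraction of background jets that actually pass the chosen threshold.
bkg_pass = float(np.mean(ud_jet_scores > threshold))
```

Scanning `bkg_eff` over a range of values traces out the ROC curve from which working points like the one quoted above are taken.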
This review was created by AI and reviewed by human editors.