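[Paper Explained] TBC-Net: A real-time detector for infrared small target detection using semantic constraint
TBC-Net combines a lightweight target extraction module with a semantic constraint module to improve single-frame infrared small target detection, using a joint loss and data synthesis to reach real-time performance on embedded hardware.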
Infrared small target detection is a key technique in infrared search and tracking (IRST) systems. Although deep learning has recently been widely used in vision tasks on visible-light images, it is rarely used in infrared small target detection because small target features are difficult to learn. In this paper, we propose a novel lightweight convolutional neural network, TBC-Net, for infrared small target detection. TBC-Net consists of a target extraction module (TEM) and a semantic constraint module (SCM), which are used, respectively, to extract small targets from infrared images and to classify the extracted target images during training. In addition, we propose a joint loss function and a training method. The SCM imposes a semantic constraint on the TEM by incorporating a high-level classification task, which addresses the difficulty of learning features under class imbalance. During training, the targets are extracted from the input image and then classified by the SCM. During inference, only the TEM is used to detect the small targets. We also propose a data synthesis method to generate training data. The experimental results show that, compared with traditional methods, TBC-Net better reduces false alarms caused by complicated backgrounds, and that the proposed network structure and joint loss significantly improve small target feature learning. Moreover, TBC-Net achieves real-time detection on the NVIDIA Jetson AGX Xavier development board, which makes it suitable for applications such as field research with drones equipped with infrared sensors.
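Research Motivation and Objectives
- Advance robust infrared small target detection when the background is cluttered and targets are extremely small (2–10 pixels).
- Develop a lightweight network that learns small target features under extreme class imbalance.
- Introduce a semantic constraint through a high-level classification task to guide target feature learning.
- Achieve real-time inference on embedded hardware (NVIDIA Jetson AGX Xavier).
- Propose data synthesis and a joint loss to improve learning capability and robustness.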
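Proposed Method
- A two-module architecture is proposed: the Target Extraction Module (TEM) and the Semantic Constraint Module (SCM).
- The TEM is an encoder-decoder with downsampling/upsampling and residual connections, designed for targets in single-channel infrared images.
- The SCM is a multi-layer convolutional neural network trained to classify the TEM output by number of targets, providing a semantic constraint while the TEM is trained.
- The TEM is trained with the joint loss L_TBC = L_T + L_B + lambda L_C, under guidance from the SCM.
- Data synthesis fuses small target samples into background images, creating training tuples (f_D, f_T, y_T) whose label indicates the number of targets.
- At inference, only the TEM is used, which enables real-time detection on embedded hardware.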
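To make the shape of the joint loss L_TBC = L_T + L_B + lambda L_C concrete, the following is a minimal NumPy sketch. This summary does not define the individual terms, so the forms used here are assumptions for illustration only: L_T is taken as a log loss averaged over target pixels, L_B as a log loss averaged over background pixels, and L_C as a softmax cross-entropy on the SCM's target-count prediction. The function name `joint_loss`, its parameters, and the default `lam=0.1` are hypothetical and are not taken from the paper.

```python
import numpy as np

def joint_loss(pred, target_mask, class_logits, count_label, lam=0.1, eps=1e-7):
    """Illustrative joint loss; the term definitions are assumptions.

    pred         : TEM output with values in [0, 1], shape (H, W)
    target_mask  : binary ground-truth target image, shape (H, W)
    class_logits : SCM output scores, one per target-count class
    count_label  : integer index of the true target-count class
    lam          : weight of the classification term (hypothetical default)
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    tgt = target_mask.astype(bool)

    # Assumed L_T: log loss averaged over target pixels only.
    l_t = float(-np.log(pred[tgt]).mean()) if tgt.any() else 0.0
    # Assumed L_B: log loss averaged over background pixels only.
    l_b = float(-np.log(1.0 - pred[~tgt]).mean()) if (~tgt).any() else 0.0

    # Assumed L_C: numerically stable softmax cross-entropy on the count label.
    z = class_logits - np.max(class_logits)
    log_probs = z - np.log(np.exp(z).sum())
    l_c = float(-log_probs[count_label])

    return l_t + l_b + lam * l_c, (l_t, l_b, l_c)
```

Averaging the target and background terms separately is one simple way to keep a handful of target pixels from being swamped by the background, which is the class imbalance problem the summary describes. Whether the paper normalizes the terms this way is not stated here.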
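Experimental Results
Research Questions
- RQ1: Can introducing a semantic constraint through the SCM improve small target feature learning under severe class imbalance?
- RQ2: Does the TEM+SCM training scheme extract targets better and produce fewer false alarms than traditional single-frame infrared detection methods?
- RQ3: Can TBC-Net achieve real-time infrared small target detection on edge devices such as the NVIDIA Jetson Xavier?
- RQ4: How effective is synthetic data for training a model that transfers to real infrared sequences?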
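Key Findings
- Compared with traditional methods, TBC-Net reduces false alarms in cluttered infrared backgrounds.
- Despite class imbalance, the joint TEM/SCM loss improves the learning of small target features.
- The model runs in real time on the NVIDIA Jetson AGX Xavier (256×256 input).
- Fusing 1–3 targets into each image allows the SCM to be trained effectively and with high accuracy.
- The SCM classifies synthetic targets with about 97.5% accuracy and is used to guide TEM training.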
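The data synthesis step (fusing 1–3 small targets into a background and labelling the sample with the target count) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: f_D is assumed to be the fused image, f_T the target-only image, and y_T the number of pasted targets. The pixel-wise maximum used for fusion, the uniformly random placement, and the name `synthesize_sample` are all hypothetical.

```python
import numpy as np

def synthesize_sample(background, target_patches, rng, max_targets=3):
    """Return an illustrative training tuple (f_D, f_T, y_T).

    background     : 2-D single-channel infrared background image
    target_patches : list of small 2-D target intensity patches
    rng            : numpy.random.Generator used for all random choices
    max_targets    : upper bound on targets per image (1-3 in the summary)
    """
    h, w = background.shape
    f_t = np.zeros((h, w), dtype=np.float32)

    # y_T: how many targets are pasted into this sample (1..max_targets).
    y_t = int(rng.integers(1, max_targets + 1))

    for _ in range(y_t):
        patch = target_patches[int(rng.integers(len(target_patches)))]
        ph, pw = patch.shape
        top = int(rng.integers(0, h - ph + 1))
        left = int(rng.integers(0, w - pw + 1))
        region = (slice(top, top + ph), slice(left, left + pw))
        # Pasted targets may overlap; this sketch does not prevent that.
        f_t[region] = np.maximum(f_t[region], patch)

    # Assumed fusion rule: keep the brighter of background and target.
    f_d = np.maximum(background.astype(np.float32), f_t)
    return f_d, f_t, y_t
```

A call such as `synthesize_sample(bg, patches, np.random.default_rng(0))` yields one tuple. In this sketch y_T would serve as the class label for the SCM and f_T as the extraction ground truth for the TEM.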
This explainer was generated by AI and reviewed by a human editor.