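[Paper Walkthrough] You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking
The paper proposes an end-to-end multi-modal 3D MOT framework that uses only a 2D detector and a 3D detector for joint detection and tracking, eliminating explicit data association and improving robustness.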
In the classical tracking-by-detection (TBD) paradigm, detection and tracking are conducted separately and sequentially, and data association must be properly performed to achieve satisfactory tracking performance. In this paper, a new end-to-end multi-object tracking framework is proposed, which integrates object detection and multi-object tracking into a single model. The proposed tracking framework eliminates the complex data association process in the classical TBD paradigm, and requires no additional training. Secondly, the regression confidence of historical trajectories is investigated, and the possible states of a trajectory (weak object or strong object) in the current frame are predicted. Then, a confidence fusion module is designed to guide non-maximum suppression for trajectories and detections to achieve ordered and robust tracking. Thirdly, by integrating historical trajectory features, the regression performance of the detector is enhanced, which better reflects the occlusion and disappearance patterns of objects in the real world. Lastly, extensive experiments are conducted on the commonly used KITTI and Waymo datasets. The results show that the proposed framework can achieve robust tracking by using only a 2D detector and a 3D detector, and that it is more accurate than many state-of-the-art TBD-based multi-modal tracking methods. The source code of the proposed method is available at https://github.com/wangxiyang2022/YONTD-MOT.
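Motivation and goals
- Simplify multi-modal 3D MOT by avoiding explicit data association.
- Develop an end-to-end framework that merges detection and tracking into a single model.
- Study the regression confidence of historical trajectories in order to predict each object's state in the current frame.
- Use historical trajectory features to strengthen detector regression so that it reflects occlusion and disappearance patterns.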
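Proposed method
- Integrate object detection and multi-object tracking into one end-to-end model.
- Eliminate the data association step of conventional tracking-by-detection (TBD).
- Introduce a confidence fusion module that guides non-maximum suppression over trajectories and detections.
- Predict the possible state of each trajectory in the current frame (weak or strong) from its historical regression confidence.
- Incorporate historical trajectory features to improve detector regression and handle occlusion.
- Evaluate on KITTI and Waymo to show that a 2D detector and a 3D detector alone are enough for robust tracking.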
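The idea of letting a fused confidence drive non-maximum suppression over both trajectories and detections can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the `Box` type, the weighted-average `fuse_confidence` rule, the `alpha` weight, the axis-aligned 2D IoU (standing in for 3D box overlap), and the `iou_thr` value are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    x1: float
    y1: float
    x2: float
    y2: float
    score: float          # confidence used to rank boxes in NMS
    is_track: bool        # True if the box was regressed from a past trajectory
    track_id: int = -1    # -1 means "no identity assigned yet"


def fuse_confidence(current: float, history: List[float], alpha: float = 0.6) -> float:
    """Hypothetical fusion rule: blend current and mean historical confidence."""
    if not history:
        return current
    return alpha * current + (1.0 - alpha) * sum(history) / len(history)


def iou(a: Box, b: Box) -> float:
    """Axis-aligned 2D IoU, used here as a simple stand-in for 3D overlap."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def confidence_guided_nms(boxes: List[Box], iou_thr: float = 0.5) -> List[Box]:
    """Greedy NMS over trajectories and detections ranked by score."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b.score, reverse=True):
        # Keep the box only if it does not overlap a higher-ranked kept box.
        if all(iou(box, k) <= iou_thr for k in kept):
            kept.append(box)
    return kept


# A trajectory with a strong history outranks a fresh overlapping detection.
track = Box(0, 0, 2, 2, fuse_confidence(0.5, [0.9, 0.9]), is_track=True, track_id=7)
duplicate = Box(0.1, 0, 2.1, 2, 0.6, is_track=False)
newcomer = Box(5, 5, 7, 7, 0.4, is_track=False)
kept = confidence_guided_nms([duplicate, newcomer, track])
```

In this sketch a surviving trajectory box keeps its identity, a detection that overlaps it is suppressed as a duplicate, and a surviving detection would seed a new trajectory, so no separate matching step is required.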
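Experimental results
Research questions
- RQ1: Can end-to-end joint detection and tracking remove the need for complex data association in multi-modal 3D MOT?
- RQ2: How does the regression confidence of historical trajectories affect state prediction and suppression decisions in the current frame?
- RQ3: Does introducing historical trajectory features improve detector regression under occlusion and disappearance patterns?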
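To make RQ2 concrete, the sketch below labels a trajectory as strong or weak from its regression confidence in each frame and tracks how long it has stayed weak. The threshold `strong_thr`, the `weak_streak` counter, and the `should_terminate` rule are hypothetical choices for illustration; the summary does not state how the paper defines these states or when a trajectory is dropped.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class TrackState(Enum):
    STRONG = "strong"
    WEAK = "weak"


@dataclass
class Track:
    track_id: int
    confidences: List[float] = field(default_factory=list)  # per-frame history
    weak_streak: int = 0                                     # consecutive weak frames

    def update(self, regression_conf: float, strong_thr: float = 0.5) -> TrackState:
        """Record this frame's regression confidence and return the state."""
        self.confidences.append(regression_conf)
        if regression_conf >= strong_thr:
            self.weak_streak = 0
            return TrackState.STRONG
        self.weak_streak += 1
        return TrackState.WEAK

    def should_terminate(self, max_weak_frames: int = 3) -> bool:
        """Assumed rule: drop a trajectory that stays weak for too long."""
        return self.weak_streak >= max_weak_frames


# An object that becomes occluded: confidence falls, then recovers.
t = Track(track_id=1)
states = [t.update(c) for c in (0.9, 0.3, 0.2)]
```

Keeping a weak trajectory alive for a few frames, rather than deleting it at the first low score, is one simple way to tolerate short occlusions.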
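Main findings
- The framework achieves robust multi-modal 3D MOT using only a 2D detector and a 3D detector.
- It is reported to be more accurate than many state-of-the-art TBD-based multi-modal tracking methods (according to the authors' stated conclusions).
- The confidence fusion module guides non-maximum suppression, producing ordered and robust tracking results.
- Historical trajectory features improve the detector's regression performance and better reflect real-world occlusion and disappearance patterns.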
This walkthrough was generated by AI and reviewed by human editors.