QUICK REVIEW

[Paper Review] TrackMPNN: A Message Passing Graph Neural Architecture for Multi-Object Tracking

Akshay Rangesh, Pranav Maheshwari | arXiv (Cornell University) | Jan 11, 2021

Video Surveillance and Tracking Methods | References: 60 | Cited by: 25

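One-Sentence Summary

TrackMPNN performs online multi-object tracking with a message passing neural network that runs on dynamic undirected graphs over a sliding window, achieving competitive results while using only 2D box locations and category IDs.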

ABSTRACT

This study follows many classical approaches to multi-object tracking (MOT) that model the problem using dynamic graphical data structures, and adapts this formulation to make it amenable to modern neural networks. Our main contributions in this work are the creation of a framework based on dynamic undirected graphs that represent the data association problem over multiple timesteps, and a message passing graph neural network (MPNN) that operates on these graphs to produce the desired likelihood for every association therein. We also provide solutions and propositions for the computational problems that need to be addressed to create a memory-efficient, real-time, online algorithm that can reason over multiple timesteps, correct previous mistakes, update beliefs, and handle missed/false detections. To demonstrate the efficacy of our approach, we only use the 2D box location and object category ID to construct the descriptor for each object instance. Despite this, our model performs on par with state-of-the-art approaches that make use of additional sensors, as well as multiple hand-crafted and/or learned features. This illustrates that given the right problem formulation and model design, raw bounding boxes (and their kinematics) from any off-the-shelf detector are sufficient to achieve competitive tracking results on challenging MOT benchmarks.

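Motivation and Objectives

Model multi-object tracking as inference on a dynamic graph that evolves over time.
Enable online, real-time inference across multiple timesteps with memory-efficient updates.
Show that raw 2D bounding boxes and category IDs are sufficient for competitive MOT performance.
Present a scalable training/inference framework with sliding-window graph updates and decoding.
Compare against state-of-the-art MOT methods on standard benchmarks (e.g., KITTI MOT).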

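Proposed Method

Represent detections as nodes and potential associations as edges, building a dynamic sliding-window graph.
Use an undirected bipartite graph of detection nodes and association nodes that evolves as new frames arrive.
Apply a dedicated TrackMPNN with separate detection-node and association-node updates, including attention-based message passing.
Train with a combined loss covering the detection, association, and competing-edge tasks; use mini-sequences to keep memory usage bounded.
Decode tracks by greedy decoding or Hungarian matching; prune the graph to control memory and computation.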
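
The graph bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `SlidingWindowGraph`, its fields, the rule that only same-class detections are linked, and the pruning rule are all assumptions made for the example. It shows detection nodes and association nodes being added as each frame arrives and old nodes being pruned once they leave the window.

```python
from dataclasses import dataclass, field

@dataclass
class SlidingWindowGraph:
    """Hypothetical bipartite graph of detection nodes and association nodes."""
    window: int  # number of most recent frames kept in the graph
    detections: dict = field(default_factory=dict)    # det_id -> (frame, box, class_id)
    associations: dict = field(default_factory=dict)  # assoc_id -> (earlier_det_id, later_det_id)
    _next_id: int = 0

    def _new_id(self):
        self._next_id += 1
        return self._next_id - 1

    def add_frame(self, frame, boxes_with_classes):
        """Add one detection node per box, plus one association node for every
        same-class (earlier detection, new detection) pair still in the graph."""
        earlier = list(self.detections.items())
        for box, class_id in boxes_with_classes:
            det_id = self._new_id()
            self.detections[det_id] = (frame, box, class_id)
            for old_id, (old_frame, _, old_class) in earlier:
                # Assumption: only detections of the same category can be associated.
                if old_class == class_id and old_frame < frame:
                    self.associations[self._new_id()] = (old_id, det_id)
        self.prune(frame)

    def prune(self, current_frame):
        """Drop detections that left the window and every association touching them."""
        stale = {d for d, (f, _, _) in self.detections.items()
                 if f <= current_frame - self.window}
        for d in stale:
            del self.detections[d]
        self.associations = {a: (u, v) for a, (u, v) in self.associations.items()
                             if u not in stale and v not in stale}

# Three frames with a window of two frames; boxes are (x1, y1, x2, y2).
graph = SlidingWindowGraph(window=2)
graph.add_frame(0, [((10, 10, 50, 50), 1)])
graph.add_frame(1, [((12, 11, 52, 51), 1), ((200, 80, 240, 140), 2)])
graph.add_frame(2, [((14, 12, 54, 52), 1)])
```

After the third frame, the detection from frame 0 has been pruned together with both association nodes that referenced it, so the graph size stays bounded by the window length. In the method summarized here, a message passing network would then score each remaining association node; that step is omitted from this sketch.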

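Experimental Results

Research Questions

RQ1: Can a graph neural network operating on a dynamic, sliding graph outperform traditional MOT pipelines that rely on hand-crafted costs and features?
RQ2: Are 2D box locations and category IDs alone sufficient for effective data association in MOT?
RQ3: How do sliding-window graph updates and memory management affect the performance and stability of online tracking?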

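Key Findings

TrackMPNN's online GNN framework achieves competitive MOT performance on standard benchmarks while using only 2D box locations and category IDs as features.
A sliding window with dynamic graph updates enables online inference across multiple timesteps and supports correcting past mistakes.
Attention-based message passing and the choice between difference-based and concatenation-based association updates affect tracking metrics, with difference-based updates showing more favorable results.
Data augmentation during training improves most MOT metrics, especially on smaller datasets.
Decoding with the Hungarian algorithm improves track continuity, at additional computational cost.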
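
To make the decoding trade-off in the last finding concrete, the following sketch compares greedy decoding with an optimal one-to-one assignment on a toy score matrix. It is an illustration under stated assumptions: the function names and the scores are invented, and the optimal assignment is found by brute force over permutations rather than by the Hungarian algorithm itself. Both reach the same optimum on small square matrices, but the brute-force search is O(n!), so in practice a solver such as `scipy.optimize.linear_sum_assignment` would normally be used instead.

```python
from itertools import permutations

def greedy_decode(scores):
    """Repeatedly take the highest remaining score whose row and column are unused."""
    pairs = sorted(((s, r, c) for r, row in enumerate(scores)
                    for c, s in enumerate(row)), reverse=True)
    used_rows, used_cols, matches = set(), set(), []
    for s, r, c in pairs:
        if r not in used_rows and c not in used_cols:
            matches.append((r, c))
            used_rows.add(r)
            used_cols.add(c)
    return sorted(matches)

def optimal_decode(scores):
    """Exhaustive search for the one-to-one assignment with the highest total
    score (square matrices only). Stand-in for Hungarian matching."""
    n = len(scores)
    best = max(permutations(range(n)),
               key=lambda p: sum(scores[r][p[r]] for r in range(n)))
    return [(r, best[r]) for r in range(n)]

# Rows: existing tracks, columns: new detections (hypothetical association scores).
scores = [[0.90, 0.80],
          [0.85, 0.10]]

greedy = greedy_decode(scores)    # locks in 0.90 first, forcing the weak 0.10 match
optimal = optimal_decode(scores)  # accepts 0.80 + 0.85 instead
```

Here greedy decoding commits to the single best pair and is left with a poor second match, while the optimal assignment gives up a little on the first pair to obtain a much better total. A real decoder would also reject matches below a score threshold, which this sketch leaves out.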

This review was generated by AI and reviewed by a human editor.